Machine Learning at HubSpot on AI at Work

Posted by Alyssa Verzino on May 9, 2019 5:16:30 PM


On Episode 38 of AI at Work, we’re joined by Vedant Misra, technical lead in machine learning at HubSpot. HubSpot makes software for small and medium businesses, rolling up the functions of WordPress, Google Analytics, Salesforce, and other tools that businesses need to sell their products and services.

On the data side of things, HubSpot collects a rich interaction history between sellers and consumers, creating a “fantastic label for whether or not each person you were interacting with decided to buy or not to buy your offering,” describes Vedant. This translates to extremely valuable information for companies, who can use it to figure out which accounts to target.

Prior to joining HubSpot, Vedant headed a company called Kemvi. At the time of its acquisition by HubSpot, Kemvi was developing a product that generated customized tags of 4 to 6 words from very minimal customer information. Companies could then use those tags to send marketing materials that would garner higher open and response rates. At HubSpot, Vedant runs the NLP team, which is focused on applying research in deep learning across different products.

Vedant says that he and his team have recently zeroed in on an exciting opportunity. SalesHub is a recently launched tool, and a growing number of sales managers and reps are using it to make calls. The NLP team has applied models to transcribe these calls and pull out interesting information.

“What would be exciting to managers, as you can imagine, is transparency into what reps are talking about, whether calls are succeeding,” says Vedant. “It's a pretty rich stream of information because the truth of every deal is in what the reps are saying.” 

Structuring Machine Learning Teams at HubSpot 

“HubSpot believes in having lots of small teams that each have autonomy in their mission,” says Vedant, “and so the ML group consists of multiple teams that are responsible for their own components.” There’s an infrastructure team, a data team, a models team, a research team, and the NLP team. Each one owns a different domain of the problem at hand.

For collaboration across teams, Vedant describes two modes of operation. The first is tight collaboration with the product-owning team, where a modeler becomes embedded in, and works within, the existing structure. The other is a SWAT team approach: “go into someone else’s codebase, wire up our stuff, and leave,” says Vedant. Which approach gets chosen depends on the kind of implementation and the product concerns that come up with each specific model.

On recruiting, he shares that HubSpot is hiring for two types of roles. The first is for people who already have the background and can take a more senior position, figuring out where the product can go from an ML perspective. The second is for more junior people who can be trained and brought up to speed on the way ML works at HubSpot. It’s a balance of the two, Vedant says, since HubSpot is growing pretty quickly.

Taking the Pulse of the AI Ecosystem

Vedant says 2019 is to NLP what the ImageNet year was to machine vision. He attributes a major breakthrough to OpenAI’s GPT-2, the basic idea behind which is that “you can take a model, train it on a huge amount of text without giving it any specific goal besides to predict the next most likely word that's coming up given a set of words in a context window, and you just feed in a ton of words off the internet.”

Then, he says, “if you feed in the start of some text, let's say, something that looks like a Reuters news article or a blog post, it will roll out the rest of that content in a way that is grammatical, coherent, and maintains consistency throughout the body of the article about what it's talking about. It's really remarkable.”
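To make the "predict the next word from a context window" idea concrete, here is a minimal sketch in Python. It is not GPT-2 and not anything HubSpot or OpenAI uses: it swaps the neural network for simple word counts, and the function names, the two-word context size, and the tiny corpus are all assumptions made for illustration. The training objective and the roll-out loop have the same shape, though: learn what tends to come next, then extend a prompt one word at a time.

```python
from collections import Counter, defaultdict


def train_next_word_counts(text, context_size=2):
    """Count which word follows each window of `context_size` words."""
    words = text.split()
    counts = defaultdict(Counter)
    for i in range(len(words) - context_size):
        context = tuple(words[i:i + context_size])
        counts[context][words[i + context_size]] += 1
    return counts


def roll_out(counts, prompt, max_new_words=10, context_size=2):
    """Greedily append the most frequent next word for the trailing context."""
    words = prompt.split()
    for _step in range(max_new_words):
        context = tuple(words[-context_size:])
        if context not in counts:
            break  # this toy model has never seen the context, so it stops
        words.append(counts[context].most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus; a real model would train on far more text.
corpus = (
    "the rep called the customer and the customer asked about pricing "
    "and the rep sent a quote"
)
model = train_next_word_counts(corpus)
print(roll_out(model, "customer asked", max_new_words=2))
# prints "customer asked about pricing"
```

A count table like this only repeats sequences it has seen. The coherence Vedant describes comes from training a much larger neural model on the same objective.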

For most companies, technology like this is still a ways off in terms of affordability, both in time and resources. However, Vedant softens this, saying “I think there are already features like this in products that we use and love every day, like Gmail. Gmail's Smart Compose is mind-blowing, and it kind of just snuck up on us.” 

In terms of scoping out future directions for entrepreneurs to explore, Vedant puts things simply: it is smart to start a company if you have an unfair advantage. “You can build an unfair advantage by collecting differentiated data because, as we know, models are generally open source, and there's not too much IP in being able to iterate on model architecture,” he says. 

His advice for entrepreneurs? Think about whether you have access to some stream of information that’s hard to get to, or if you can make use of it in a way that other people can’t, whether it is by getting differentiated labels or some other method.

The Missing Piece

In terms of pieces still missing, Vedant says that nobody has yet created a unified tool that can solve a variety of problems that each have independent point solutions. “If there was some open source, platform-independent, useful, well-documented thing for doing this, that would be very compelling,” he says.

Nobody has yet built something that can address things like “monitoring metrics for features that are coming in to models at inference time, looking at the distribution of those and if there's change, monitoring various SLAs on the actual models in terms of latency or memory usage,” outlines Vedant, “the millions of things that come up when you're working with this stuff in production.” A third-party solution that did this very well would be pretty compelling. 
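As a rough illustration of the checks Vedant lists, the sketch below compares the distribution of one numeric feature at inference time against a training baseline and checks a latency SLA. It makes several assumptions that are not from the episode: drift is measured with the population stability index (one common choice among several), the 0.2 drift threshold is a rule of thumb, the 100 ms p95 latency target is made up, and every function name is hypothetical. It uses only the Python standard library.

```python
import math


def population_stability_index(baseline, live, bins=10):
    """Compare two samples of one numeric feature; larger values mean more drift."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bucket.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty buckets.
        return [max(c / len(values), 1e-6) for c in counts]

    expected, actual = bucket_shares(baseline), bucket_shares(live)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(95 * len(ordered) / 100)
    return ordered[rank - 1]


def check_model_health(baseline, live, latencies_ms,
                       psi_threshold=0.2, latency_sla_ms=100.0):
    """Return a list of alert strings; an empty list means no check fired."""
    alerts = []
    if population_stability_index(baseline, live) > psi_threshold:
        alerts.append("feature drift")
    if p95(latencies_ms) > latency_sla_ms:
        alerts.append("latency SLA breach")
    return alerts


# Hypothetical data: live feature values have shifted toward the top of the range.
baseline = [i / 100 for i in range(100)]
shifted = [0.9 + i / 1000 for i in range(100)]
print(check_model_health(baseline, shifted, [10.0] * 90 + [500.0] * 10))
```

In production these checks would run per feature and per model on a schedule. That breadth is the "millions of things" problem that makes a unified tool appealing.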

How Far Will Deep Learning Take Us? 

On a closing note, Vedant ventures into philosophical territory, offering that when thinking about the question of how far deep learning can take us, most people conflate intelligence with consciousness. These are two very different things, he points out. Linear algebra with lots of data is going to be the main workhorse of this revolution, he says, but it is not going to solve consciousness. Deep learning nails replicating the things we used to think only our brains could do, and that’s intelligence. 

It will be interesting to see how this philosophical debate evolves, since the distinction between intelligence and consciousness, and where AI fits in, has not yet been discussed enough.

Tune in to AI at Work on iTunes, Spotify, Google Play Music, Stitcher, or SoundCloud and share with your network! If you have feedback or questions, we'd love to hear from you at podcast@talla.com or tweet at us @tallainc.


Topics: AI at Work