Measuring Success in the Land of Chatbots

Posted by Juliette Kopecky on Aug 11, 2016 10:23:00 AM


A couple of weeks ago, we launched Talla the Task Assistant. It’s a subset of the full Talla functionality coming later this year, and it focuses on natural language management of to-do/task lists in Slack (and HipChat soon). We’re a startup, and we’ve been building the foundation of our larger platform over the past few months. It’s exciting to have shipped our first product, and now that it’s out the door, we’ve shifted from trying to answer “What should we build?” to “Have we been successful?”

Our goal for this initial launch was to get 50 companies signed up in the first month. We’re only a few weeks in, and we’ve already signed up over 700 companies with 30,000 total users. Wow! It’s been amazing to have already surpassed our goal, but before we go around high-fiving ourselves too enthusiastically, we’re asking ourselves whether signups are the right way to measure Talla’s success.

How to measure success
Measuring success for companies is always a tricky topic: the right metrics for one company may not be the right ones for another. For example, before starting at Talla, I worked at HubSpot and Backupify, two B2B SaaS startups based in Boston. At HubSpot, we built software to help marketers with inbound marketing. We were very focused on customer usage as a way to measure success, since it was a good indicator of whether a customer was going to churn. If a marketer purchased HubSpot but wasn’t logging in on a daily basis to do things like build landing pages, send email campaigns, look at their analytics, or monitor social media, then it was unlikely that they were getting value out of the product, which would lead them to cancel their subscription. At Backupify, it was the opposite. Backupify sold a cloud-to-cloud backup product so that companies could have a secure second copy of their data in apps like Google Apps and Salesforce. Even though we were a B2B SaaS app like HubSpot, how frequently a customer logged in to our software wasn’t a relevant way to measure success. Since our product was designed as a failsafe, it was fine if a customer didn’t log in for weeks or months, as long as our product gave them what they needed when disaster struck.

The problem with measuring interactions
Now, back to Talla. We wanted to find a metric to help us measure success based on the value we were bringing to users. We started by looking at interactions. We defined an interaction as someone talking to Talla. For example, an interaction could be a user saying, “Talla, add ‘Set up a meeting with Jan to discuss our Q4 numbers’ to my task list.” At first that seemed like an appropriate metric, because the more a user was asking Talla to take care of, the more valuable she was, correct? Yes, but not completely.

One of our goals is to increase the sophistication of Talla over time so that she goes from helping you manage things like your task list to actually helping you accomplish things proactively. For example, our Task Assistant currently gives users a glimpse of their day first thing in the morning, with the items on their calendar, a rundown of their existing tasks, and the current weather. A user may view that information and, without saying anything to Talla, still get value from it. If we only cared about interactions, the benefit of having Talla proactively provide useful information wouldn’t get captured.
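To make that gap concrete, here is a minimal Python sketch. The event log, the user names, and the event types ("message_to_bot", "daily_digest_sent") are hypothetical and only for illustration; they are not Talla's actual schema.

```python
from collections import Counter

# Hypothetical event log of (user_id, event_type) pairs.
# The event names are illustrative assumptions, not a real schema.
events = [
    ("ana", "message_to_bot"),
    ("ana", "message_to_bot"),
    ("ana", "daily_digest_sent"),
    ("ben", "daily_digest_sent"),
    ("ben", "daily_digest_sent"),
]

def count_by_type(events, event_type):
    """Count events of a single type per user."""
    return Counter(user for user, kind in events if kind == event_type)

interactions = count_by_type(events, "message_to_bot")
digests = count_by_type(events, "daily_digest_sent")

# A Counter returns 0 for users it has never seen, so "ben" shows up
# with zero interactions even though he received two digests.
for user in sorted({user for user, _ in events}):
    print(user, "interactions:", interactions[user], "digests:", digests[user])
```

In this toy data, an interaction-only metric would rank "ben" as inactive, even though he may be reading his morning digest every day.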

Daily/weekly/monthly active users
We also started looking at active users. We had a large number of installs, which gave us a large number of users, but we wanted to go beyond just looking at installations. After someone installed our app, we wanted to see whether they would use Talla and continue to use her over time. For an app like a chatbot, it’s natural to see a drop-off in activity across users after the initial install -- some users will install the app, realize it’s not what they’re looking for, and never use it again. A sharp drop in activity can be an indicator that something is amiss, whether that’s incorrect messaging (users expect one thing and get another), your app not functioning properly, or UX issues. While it’s normal to see some amount of drop-off, the holy grail is to see activity pick back up after that initial drop-off, causing the usage line to form a U-shaped curve or a “smile chart.” (After all, what’s better than a chart that smiles back at you?) It’s beneficial to look at this metric both over time and by cohort to see trends across your users and to measure whether changes you implement are impacting usage. Since we launched our app, we’ve been happy to see that many of our users are actively engaged with Talla on a daily basis, especially as we’ve added new functionality.
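As a rough illustration of these two views, the Python sketch below computes active users over a trailing window and week-by-week retention for an install cohort. The user IDs, dates, and function names are made up for the example, and "active" is assumed to mean "had any recorded activity that day"; a real definition might differ.

```python
from datetime import date, timedelta

# Hypothetical data: install date per user, and the dates each user was active.
installs = {u: date(2016, 8, 1) for u in ("a", "b", "c", "d")}
activity = {
    "a": {date(2016, 8, 1), date(2016, 8, 9), date(2016, 8, 16)},
    "b": {date(2016, 8, 1), date(2016, 8, 17)},
    "c": {date(2016, 8, 2)},
    "d": {date(2016, 8, 1), date(2016, 8, 8), date(2016, 8, 15)},
}

def active_users(activity, end, window_days):
    """Users with any activity in the window_days ending on `end` (inclusive)."""
    start = end - timedelta(days=window_days - 1)
    return {u for u, days in activity.items() if any(start <= d <= end for d in days)}

def weekly_retention(activity, installs, cohort, weeks):
    """Fraction of the cohort active in each 7-day period since install."""
    rates = []
    for week in range(weeks):
        active = 0
        for user in cohort:
            first_day = installs[user] + timedelta(days=7 * week)
            last_day = first_day + timedelta(days=6)
            if any(first_day <= d <= last_day for d in activity.get(user, ())):
                active += 1
        rates.append(active / len(cohort))
    return rates

today = date(2016, 8, 16)
dau = len(active_users(activity, today, 1))
wau = len(active_users(activity, today, 7))
mau = len(active_users(activity, today, 30))
retention = weekly_retention(activity, installs, sorted(installs), weeks=3)
print(dau, wau, mau, retention)
```

With this toy data the cohort's retention dips in the second week and recovers in the third, which is the "smile" shape described above. Computing the same series separately for each install cohort is one way to check whether a product change moved usage.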

Users per organization
Another metric that we’re watching closely is the number of users per organization. Since Talla is a B2B app, we want to ensure that it’s valuable not only to individual users, but also to the organization as a whole. Word of mouth is an extremely effective way to gain new users, and what better way to test it than to see whether users are telling others within their organization? We recently added the ability to create group task lists and reminders, which was one of the biggest feature requests we had when we first launched the Task Assistant. With this feature, users can create shared task lists with their teams to manage projects. It also helps Talla spread throughout an organization more effectively. We’re not only tracking individual installs, but also how organizations are using Talla.
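A simple way to track this is to count distinct active users per organization and then summarize the distribution. The sketch below is only an illustration: the organization names, user IDs, and the choice of median and "share of multi-user organizations" as summaries are assumptions, not a description of Talla's reporting.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (org_id, user_id) pairs, one per recorded activity.
# A user can appear more than once; duplicates are collapsed below.
active_pairs = [
    ("acme", "u1"), ("acme", "u2"), ("acme", "u2"),
    ("globex", "u3"),
    ("initech", "u4"), ("initech", "u5"), ("initech", "u6"),
]

def users_per_org(pairs):
    """Number of distinct users seen in each organization."""
    orgs = defaultdict(set)
    for org, user in pairs:
        orgs[org].add(user)
    return {org: len(users) for org, users in orgs.items()}

counts = users_per_org(active_pairs)
median_users = median(counts.values())

# Share of organizations where more than one person uses the bot,
# a rough proxy for spread inside a company.
multi_user_share = sum(1 for n in counts.values() if n > 1) / len(counts)

print(counts, median_users, multi_user_share)
```

Tracking these numbers before and after a release such as group task lists would show whether the feature is associated with more people per organization picking up the bot.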

 

At Talla, we’re obsessed with data and measuring everything. (When you work on a team with PhDs and data scientists, it’s hard not to be.) We’re always pushing further to gain new insights -- whether that’s listening to feedback from users, analyzing trends in usage, or studying the way Talla interacts with people. As a B2B bot company, we’re looking not only at installations, but also at whether people continue to use our product and how users within an organization use it, as a way to measure value.

 

Image courtesy of: https://unsplash.com/@albertosaure

Topics: NLP