Chatbots are not a complete novelty, but their practical application in business is still in its early stages. Most businesses and marketers are still learning how to measure chatbot success efficiently and accurately… With no established benchmarks to compare against, chatbot metrics remain fuzzy territory.
Sure enough, chatbot tech has been compared to live chat, emails, phone calls… nonetheless, these mysterious virtual beasts remain a unique phenomenon.
The tales of their success are truly tempting, but thinking you can just build a chatbot and throw it to the wolves without any follow-up is deeply flawed. Like any other strategy, chatbots need your care and attention… without continual assessment, they will hardly prove to be of any use beyond entertainment.
Therefore, with this article, I hope to help you identify the key chatbot performance metrics to track. Sure, KPIs are likely to differ depending on a bot use case as well as the choice of your channel. But my goal is to offer you some inspiration that will help you move forward and make your chatbot the powerful marketing/sales/support tool it can be!
So, Chatbot Metrics…
Just a little heads up! I’ll kick off with the crowd-favorite AND number one on the list of AARRR Pirate Metrics – the Acquisition Rate…
Well, when it comes to chatbots, acquiring a user in itself doesn’t really tell you anything about the success of your bot. That is unless we are talking about a designated lead generation bot whose sole purpose of existence is, in fact, collecting leads… BUT there is a more helpful KPI for that a few paragraphs down!
The reason I don’t want to highlight acquisition is that when it comes to bots residing on Facebook Messenger, WhatsApp or any other messaging app, user interactions are conditioned on opting in with personal details. And so, acquisition tells you precisely nothing about your bot’s success.
What I am trying to say is, before you can effectively measure the performance of your chatbot, you must first ensure people are indeed using it.
Therefore, a much more useful chatbot KPI is the Activation Rate.
The activation rate is all about capturing the number of users who venture beyond the initial acquisition stage and engage in one more task (conversational interchange) that brings them closer to the goal the bot was designed to fulfill.
Hence, an Active User would be a user who has read the initial message and engaged with it by providing a response that opens the door to further (inter)action.
However, by activation, I don’t simply mean replying “Hello”.
For instance, imagine opting in for a chatbot of a news magazine. It welcomes you and offers you to indicate the kind of news you are most interested in. Let’s say you select Business & Tech.
Your Activation metric, in this case, can be “News Preference Selected”. You have indicated interest in what the bot has to offer and made a step towards goal completion – traffic to website, (dark) social shares, subscription, etc.
It’s not a bottom-line metric for your business but it’s one of the key chatbot metrics that are the first to indicate the bot is stirring up interest.
To make your Activation metric count, don’t be afraid to be specific. If it helps you improve, you can also differentiate between a new user and returning user activation events.
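To make the idea concrete, here is a minimal sketch of how an activation rate could be computed from an event log. The event names and data shape are illustrative assumptions, not part of any particular analytics platform:

```python
# Hypothetical sketch: activation rate from a simple event log.
# Each event is a (user_id, event_name) pair; names are illustrative.

def activation_rate(events, activation_event="news_preference_selected"):
    """Share of acquired users who fired the chosen activation event."""
    acquired = {user for user, _ in events}
    activated = {user for user, name in events if name == activation_event}
    return len(activated) / len(acquired) if acquired else 0.0
```

Swapping in separate event names for new versus returning users would give you the differentiated activation events mentioned above.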
This one is a no-brainer.
Measure the interactions sent and received between the users and your chatbot.
This chatbot metric is one to watch as it can give you a good idea of its ability to engage in a decent conversation.
With rule-based bots, this metric is fairly straightforward. The number of interactions indicates precisely how far down the conversational decision tree the user got. Therefore, assessing whether the connotation of this KPI is positive or negative is not exactly a problem.
However, when it comes to NLP bots, it’s not THAT simple.
The ideal length of an NLP bot session tends to vary greatly depending on the use case as well as the context of the conversation. For instance, very short sessions can very often indicate some sort of failure… or be the sign of a quick inquiry resolution.
Interactions per user are key metrics that show you people are actually engaging with your bot beyond a simple “Hi”.
My advice is to observe your user-bot back-and-forth exchanges to identify the average number of interactions that leads to successful goal completion. Ideally, a bot should be as efficient as possible.
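Computing that average is simple counting. A rough sketch, assuming your chat log is just a list of (user, message) exchanges:

```python
# Hypothetical sketch: average interactions per user from a flat chat log.
from collections import defaultdict

def avg_interactions_per_user(messages):
    """messages: list of (user_id, text) exchanges with the bot."""
    counts = defaultdict(int)
    for user, _ in messages:
        counts[user] += 1
    return sum(counts.values()) / len(counts) if counts else 0.0
```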
Much more exciting than user interaction is the volunteer users metric.
Volunteer engagement measures the number of users who decided to interact with your chatbot without any prompt. Therefore, typically, this is a great indication of your conversational marketing efforts. The users come to your site or engage with the bot on WhatsApp because they’ve heard of your bot. They recognize its value and are interested in what it has to offer!
Unprompted interactions are one of the best ways to measure actionable interest and engagement, so no matter what your bot does, this is one of the chatbot metrics that shouldn’t be missing from your list.
Keeping track of sessions is also a good way to assess your bot’s popularity or usefulness in a given use case. You can look at bot sessions from two different perspectives. Depending on your needs, you can measure the average daily/weekly/monthly number of:
- Sessions per user
- Chats handled by the bot
Measuring sessions per user is different from keeping tabs on “Interactions per user”. Instead of focusing on the number of exchanges you focus on the number of conversations a user or users have with your bot during an allocated period of time.
Once again, it depends on the chatbot use case.
For instance, if you offer an educational chatbot to existing and potential language students, noting average daily sessions per user will be a good indicator of the quality and value of your content.
No list of KPIs would be complete without mentioning the one and only retention rate.
Typically, the retention rate is represented by the percentage of users who return to interact with the chatbot within a specific period of time.
This is a metric you should keep an eye on no matter what, identifying key retention periods. For instance, it’s good to observe retention after day 1, week 1, week 2, month 1, and month 3. Breaking it down will help you identify critical moments in the customer journey and adjust your strategy accordingly.
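A minimal sketch of such a cohort check, under the simplifying assumption that “retained on day N” means the user was active exactly N days after first contact (a real implementation would usually use activity windows):

```python
# Hypothetical sketch: day-N retention for a cohort of users.

def retention_rate(first_seen, active_on, day):
    """first_seen: {user: day_of_first_contact}
    active_on: set of (user, day) activity records
    Returns the share of the cohort active `day` days after first contact."""
    cohort = list(first_seen)
    retained = [u for u in cohort if (u, first_seen[u] + day) in active_on]
    return len(retained) / len(cohort) if cohort else 0.0
```

Running it for day 1, 7, 14, 30 and 90 gives you the breakdown described above.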
If you are after actionable insights, GSR is the key performance metric for you. And… this is that metric I was talking about that should cover lead acquisition IF acquiring leads is the actual goal of your bot!
GSR doesn’t look at how many users engage with your bot. It tracks the percentage of those users who, in fact, reach the goal you designed the bot to accomplish. It can be anything from making a sale to resolving a customer query, booking an appointment, delivering a quote or providing a key piece of information.
One of the few chatbot KPIs that has a direct effect on your bottom line.
It’s the conversion rate of chatbot performance metrics. If your goal completion rate is low, it doesn’t really matter how many customers engage; you can’t really consider the bot a success.
If you are using human takeover, you can differentiate even further. Consider tracking both the percentage of goals fully completed by the bot versus the percentage of successful handovers to human agents. Here, I want to stress the word SUCCESSFUL. Sometimes bot-human handoffs can happen when the bot FAILS to achieve a goal… Well, keep reading!
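The bot-versus-handover split could be sketched like this. The session fields are illustrative assumptions about how you log outcomes:

```python
# Hypothetical sketch: goal success rate, split between goals the bot
# completed alone and goals completed via a SUCCESSFUL human handover.

def goal_success_rates(sessions):
    """sessions: list of dicts like {"goal_reached": bool, "handover": bool}.
    Returns (bot-only success share, handover success share)."""
    total = len(sessions)
    if not total:
        return 0.0, 0.0
    bot_only = sum(s["goal_reached"] and not s["handover"] for s in sessions)
    handed = sum(s["goal_reached"] and s["handover"] for s in sessions)
    return bot_only / total, handed / total
```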
This chatbot KPI concerns the proud owners of NLP-based bots. Rule-based bots can be pretty complex and hyper-personalized. However, the interactions are always structured and so there is not much space for the conversation to go haywire.
When it comes to the Natural Language Processing chatbots, perfection is simply unattainable, no matter how well-trained they are. If artificial intelligence was good enough to perfectly mimic human beings, they would already be ruling over us in a good old sci-fi apocalypse fashion.
You can capture chatbot’s fallback rate in different ways. In fact, dividing the fallbacks into categories will give you a deeper insight into what went wrong and what you can do to prevent it in the future.
A. Confusion Rate
When the bot fails to understand user input, it tends to get confused and falls back on the familiar “Sorry, I didn’t understand. Can you please rephrase the question?”.
You can calculate the overall Confusion Rate by dividing the number of times the bot resorted to a fallback answer by the total number of messages sent. It’ll give you an indication of the additional training that needs to be done. Plus, you’ll get an idea about the quality of the customer experience (or lack thereof).
Next, there are times when users get all confused and tangled up in a chain of responses, wishing they had typed different requests or submitted different info. When this happens, they usually RESET the conversation.
There are two ways to go about it:
- Your bot can have a “reset” button available as part of its conversational interface
- You set up a recalibration procedure. For instance, after two fails, your bot sends a different kind of message clearly explaining what it can and cannot do to help navigate the user in the right direction.
We already talked about human takeover. Well, the more successful instance of it.
Indeed, a human agent takeover can be a result of your bot successfully identifying the type of user or situation where transferring the conversation to a human agent is actually part of its job.
You can also design human takeover as a failsafe for instances the bot failed to identify user intent too many times and you wish to save the situation…
So the third aspect of a chatbot FBR metric you might consider tracking (if relevant) is the percentage of fallback that results in human takeover.
Using a bot to provide customer support or additional service?
It’s a good idea to measure its impact on overall customer/user satisfaction.
The self-service aspect of the chatbot technology means your access to the end-user experience is somewhat limited. You don’t know how a customer feels or what they think about the conversation that transpired between them and the bot.
How you measure customer satisfaction depends on your bot.
The most straightforward of all techniques… After the conversation has come to a close your bot itself can ask the user to rate its performance there and then.
Alternatively, you can also calculate your customer service rate by considering the optimal time frame of query resolution, percentage of confusion triggers and – if available – sentiment analysis.
NLP bot owners should also seriously consider tracking metrics of user intents.
AI-fueled bots are not a simple thing to control. Keeping track of what is happening with individual intents can not only help you identify points of friction in the user experience but also gain a deeper insight into user behavior.
Your intent-based analytics can track which intent(s):
- Have the most exits
- Are activated most/least frequently
- Provoke the most/least fallback responses
- Carry the most negative/positive sentiment
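A toy sketch of such a per-intent report, assuming an intent log of (intent, was_fallback) pairs (the log shape is an illustrative assumption; exits and sentiment would need extra fields):

```python
# Hypothetical sketch: basic per-intent analytics from an NLP bot log.
from collections import Counter

def intent_report(intent_log):
    """intent_log: list of (intent_name, was_fallback) tuples."""
    activations = Counter(intent for intent, _ in intent_log)
    fallbacks = Counter(intent for intent, fb in intent_log if fb)
    return {
        "most_activated": activations.most_common(1)[0][0],
        "most_fallbacks": fallbacks.most_common(1)[0][0] if fallbacks else None,
    }
```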
There is no doubt you could collect even more data than the nine suggested chatbot metrics above.
Spread your analytical wings! Consider tracking chatbot KPIs such as total sale value; customer support savings; cost of operation and maintenance; increase in NPS, etc.
Each bot is unique.
How you measure chatbot success should directly correlate with your bot use case.
What I hope for you to take away from this is the awareness that when it comes to bots numbers don’t always reflect the quality of the conversation and experience. Go beyond acquisition, go beyond engagement… make sure your bot is doing the job it was designed to do and leaves your potential/existing customers happy.