Signal Generation and Conversation State Tracking

Written in collaboration with : Ashish Kumar & Sunit Singh

Conversation data captured during sales calls is a rapidly growing pool of exchanges between agents and customers. There is tremendous scope for extracting intelligence from it and reusing it, after some curation, to support diverse conversation scenarios. With the incremental insights that the system draws from conversations, a sales agent can now bring data-driven methods to a field that has long been driven by empirical methods, tips and guidelines. The system is incremental in that it learns from successful sales calls and applies what it has learnt to identify, and guide, scenarios where the conversation can be led to a successful closure. In all, a sales agent with the tools offered by the conversation intelligence system is well equipped to deliver a streamlined, well-informed and lucid sales experience, and to perform detailed post-call analysis.

The Signal Generation and Conversation State Tracking system is a post-call analysis tool designed to extract intelligence from sales conversations. Typically, a successful sales call follows a loosely designed roadmap that is contingent on several factors, such as the consumer profile, the agent profile, the product and feature profiles and so forth. It also progresses in an underlying order. Given below is an excerpt from a sales call.

Introduction

Agent : Hello, my name is Andy. Am I speaking with Shubham?

Customer : Yes, it’s Shubham here. How can I help you, Andy?

Agent : I am calling from xyz company and I wanted to take a few minutes of your time to talk about a product you showed interest in recently.

Customer : Yes, I would like to know more about abc product.

Product Feature

Agent : abc product is our flagship product and it offers one-stop retail solutions from strategizing and R&D to execution. You can expect [...]

Customer : [...]

Agent : [...]

Customer : Sounds good, however I’d like some refactoring in some aspects of the R&D solution. [...]

Solutioning

Agent : [...]

Customer : [...]

The utterances have been divided into sections according to their dimensions, such as Introduction, Product Feature and Solutioning. The sequence in which these dimensions occur during a call is a blend of pre-established criteria (for instance, a call would conventionally begin with introductions and end with scheduling follow-ups) and criteria driven by the agent-customer interaction. In the above excerpt, the customer asks to know more about abc product, and the conversation therefore progresses into the Product Feature subspace.

Fig. 1 : Example of traversing conversation states in a typical sales conversation
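To make the idea of traversing conversation states concrete, here is a minimal sketch that groups consecutive utterances sharing a dimension into subspaces. It assumes each utterance has already been labelled with a dimension; the tuple layout and the function name segment_by_dimension are illustrative choices, not part of the system described here.

```python
from itertools import groupby

# Hypothetical input: (speaker, text, dimension) per utterance, in call order.
labelled_utterances = [
    ("Agent", "I am calling from xyz company [...]", "Introduction"),
    ("Customer", "Yes, I would like to know more about abc product.", "Introduction"),
    ("Agent", "abc product is our flagship product [...]", "Product Feature"),
    ("Customer", "[...]", "Product Feature"),
    ("Agent", "[...]", "Solutioning"),
]


def segment_by_dimension(utterances):
    """Group consecutive utterances that share a dimension into subspaces."""
    return [
        (dimension, [(speaker, text) for speaker, text, _ in group])
        for dimension, group in groupby(utterances, key=lambda u: u[2])
    ]


subspaces = segment_by_dimension(labelled_utterances)
# The ordered list of dimensions is the state sequence of the call.
dimension_sequence = [dimension for dimension, _ in subspaces]
print(dimension_sequence)
```

Because groupby only merges adjacent items, a dimension that recurs later in the call shows up as a separate subspace, which preserves the order in which states were visited.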

A conversation subspace is a set of utterances that address a specific agenda during the conversation, like an itinerary of sorts. It is characterized by signals and dimensions. Signals are generic intents that encompass groups of utterances. Dimensions are broad, coarse-grained agent-customer conversation intents that fulfil a major criterion of a sales call, such as Introduction, Lead Qualification or Solutioning. Identifying and modelling the sequence of subspaces (or dimensions and signals) through which a conversation proceeds can provide a guideline for streamlining a sales call. This conversation modelling can ingest successful (and unsuccessful) sales calls from agents to analyze the factors that lead up to a successful conversion.

Having seen how conversation subspaces are sequenced, we can now take a fine-grained view of their content. Conversation states are temporal entities, in that modelling them entails learning from time-stamped or ordered conversation structures to output win-sequences of conversation entities. The content of the conversation subspaces, by contrast, is a spatial problem that depends on factors such as the customer profile, the product profile and the agent profile. This content is modelled by applying a pipeline of cascaded natural language understanding (NLU) models to gain a semantic understanding of sales conversations.

Fig. 2 : Salesken’s conversation flow tool tracks agent-customer conversations and highlights the intents caught (in the form of dimensions) at various parts of the discussion.

Product Profiling

For instance, the Pricing subspace for a product of a telecommunications company would encompass a different set of guidelines than that of, say, an educational-tool company.

Here, product profiling helps emphasize what to address given the context.

Customer Profiling

Identifying consumer behavior can be an important aid in a sales call. It can be learnt from past conversations with the particular customer. For example, lead qualification for an educational-tool company would require a pitch centred on the needs of children and on more efficient tutoring, whereas for a bank the agenda would be to present loan schemes to customers more attractively.

Agent Profiling

Certain agents may have a better success rate with some consumer sectors than with others. Ascertaining the sectors in which each agent wins, and using this intelligence to assign agents to their strongest sectors, can considerably increase the overall success rate of sales call conversions.
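As a rough illustration of agent profiling, the sketch below computes a win rate per agent and sector from a call log and picks the strongest agent for each sector. The log format, the agent names and the sector names are all hypothetical, and a production system would likely need smoothing for agents with very few calls in a sector.

```python
from collections import defaultdict

# Hypothetical call log: (agent, customer_sector, call_was_won).
calls = [
    ("andy", "telecom", True),
    ("andy", "telecom", True),
    ("andy", "education", False),
    ("bela", "telecom", False),
    ("bela", "education", True),
    ("bela", "education", True),
    ("bela", "education", False),
]


def win_rates(call_log):
    """Return {(agent, sector): fraction of calls won}."""
    counts = defaultdict(lambda: [0, 0])  # (agent, sector) -> [wins, total]
    for agent, sector, won in call_log:
        counts[(agent, sector)][0] += int(won)
        counts[(agent, sector)][1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}


def best_agent_per_sector(rates):
    """Pick, for every sector, the agent with the highest win rate."""
    best = {}
    for (agent, sector), rate in rates.items():
        if sector not in best or rate > best[sector][1]:
            best[sector] = (agent, rate)
    return {sector: agent for sector, (agent, _) in best.items()}


rates = win_rates(calls)
assignment = best_agent_per_sector(rates)
print(assignment)
```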

Fig. 3 : Profiling entities across product, customer and agent for conversation intelligence.

As we have seen, a streamlined sales conversation can be loosely mapped to a dimension state space with an underlying order of precedence among the dimension states. Signal entities defined for an organization map back to dimensions (which are cross-vertical entities) and can help identify the transitions that occur in a sales call. Dimension state transitions, and the organization-specific signals they encompass, are defined in a playbook, which serves as a guide to a streamlined agent-to-customer interaction.

There are two major tasks in abstracting the underlying state transitions in a sales conversation: discovering signals and mapping them to the set of dimensions, and modelling the transitions between dimensions. At a high level, the playbook dimensions are defined as follows:

Greetings
Introduction
Right Party Contact (Am I speaking to Suvro?)
Engaging The Customer
Query Of Products
Company Intro / Product Intro
Taking Up Objections
Objection Handling
FAQs
Sharing More Information About The Product
Payment Modes
Price
Offers
Close The Call
Next Steps (follow up)
Closing
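One way to picture a playbook is as an ordered list of dimensions plus a mapping from organization-specific signals back to those dimensions. The sketch below is only an illustration of that idea: the Playbook class, its methods and the two signal names are hypothetical, and the dimension names are a subset taken from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Playbook:
    # Dimensions in their expected order of precedence.
    dimensions: list
    # Organization-specific signal name -> dimension it maps back to.
    signal_to_dimension: dict = field(default_factory=dict)

    def add_signal(self, signal, dimension):
        """Register a signal under a dimension that the playbook knows."""
        if dimension not in self.dimensions:
            raise ValueError(f"unknown dimension: {dimension}")
        self.signal_to_dimension[signal] = dimension

    def dimension_of(self, signal):
        return self.signal_to_dimension[signal]

    def is_forward_transition(self, src, dst):
        """True when dst does not come before src in the playbook order."""
        return self.dimensions.index(dst) >= self.dimensions.index(src)


playbook = Playbook(
    dimensions=[
        "Greetings",
        "Engaging The Customer",
        "Taking Up Objections",
        "Sharing More Information About The Product",
        "Close The Call",
    ]
)
# Hypothetical signals for one organization.
playbook.add_signal("confirm_customer_identity", "Greetings")
playbook.add_signal(
    "ask_about_discounts", "Sharing More Information About The Product"
)
print(playbook.dimension_of("confirm_customer_identity"))
```

Keeping the dimensions generic and the signals organization-specific mirrors the split described above, where dimensions are shared across verticals and signals are defined per organization.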

Our systems are designed to handle the signal discovery and state transition tasks in tandem. The model learns from the underlying patterns of successful sales conversations (calls that the agent has closed successfully) to identify the dimension state at any point during the call. This can assist tasks like cueing, wherein knowledge of the dimension state in the conversation can enrich the response generation mechanism.

Typically, the objective of such a model is to learn, from successful conversations, the cumulative probability of the transitions between states that lead to a successful closure. Once the semantics of the utterances are matched to signals (given the context of the utterances in the previous turns), these signals are classified into dimension states. The encoded context helps identify the nature of a signal at any part of the conversation; for instance, a signal (or similar signals) could appear at different parts of the conversation and would therefore belong to different dimensions.
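The idea of learning transition probabilities from successful calls can be sketched with a first-order Markov chain over dimension states. This is a deliberately simplified stand-in, not the model described here: the toy sequences are invented, the function names are hypothetical, and the floor probability for unseen transitions is an arbitrary assumption.

```python
import math
from collections import Counter, defaultdict

G = "Greetings"
E = "Engaging The Customer"
O = "Taking Up Objections"
S = "Sharing More Information About The Product"
C = "Close The Call"

# Hypothetical dimension sequences from calls that closed successfully.
won_calls = [
    [G, E, S, C],
    [G, E, O, C],
    [G, E, S, O, C],
]


def fit_transitions(sequences):
    """Estimate P(next state | current state) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


def log_likelihood(seq, transitions, floor=1e-6):
    """Sum of log transition probabilities; unseen transitions get `floor`."""
    return sum(
        math.log(transitions.get(src, {}).get(dst, floor))
        for src, dst in zip(seq, seq[1:])
    )


transitions = fit_transitions(won_calls)
typical = log_likelihood([G, E, S, C], transitions)
unusual = log_likelihood([G, C, E, S], transitions)
print(typical > unusual)
```

A call whose state sequence scores well under the chain resembles the successful calls it was fitted on; a low score flags a sequence that departs from them.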

For a state transition model to work, we need to have discovered an all-encompassing set of signals. Signals are a bridge between utterances and dimension states. In terms of natural language understanding, the abstraction from utterances to signals, and their mapping to dimension states, follows the fine-grained to coarse-grained paradigm. The signal discovery module uses a hybrid of supervised and semi-supervised approaches over the entire utterance space.

Broadly speaking, these are the major items in the signal discovery pipeline :

Utterance Embedder - We use a Sentence-BERT model, which is built on a Siamese network framework, to encode utterances. This embedder can be fine-tuned on a set of utterance-to-signal similarities to enrich domain-specific utterance embeddings.
Clustering - The utterance embeddings are then clustered using a Gaussian Mixture Model, which assigns one or more signal votes to each utterance with varying probabilities. Depending on the task, a probability threshold can be applied to decide which utterances are enlisted in which clusters. Homogeneous and diverse clusters then signify utterance groups that would belong to a certain dimension state.
Task Aware Few Shot Learning Classifier - Finally, a few-shot learning classifier trained on a set of utterance-to-dimension correlations is applied to the cluster outputs to assign dimension states to clusters. Clusters tend to contain a mixture of dimension states, with the homogeneous clusters showing a clear-cut majority of a particular dimension.
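The last two steps of the pipeline can be sketched on toy data. The example below assumes the mixture model has already produced per-cluster probabilities for each utterance (for instance, the kind of output scikit-learn's GaussianMixture.predict_proba returns) and that a classifier has already predicted a dimension per utterance. All values, the 0.4 threshold and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical soft assignments: utterance id -> probability per cluster.
posteriors = {
    "u1": [0.90, 0.10],
    "u2": [0.55, 0.45],
    "u3": [0.20, 0.80],
    "u4": [0.05, 0.95],
    "u5": [0.85, 0.15],
}

# Hypothetical per-utterance output of the few-shot classifier.
predicted_dimension = {
    "u1": "Price",
    "u2": "Offers",
    "u3": "FAQs",
    "u4": "FAQs",
    "u5": "Price",
}


def assign_clusters(posteriors, threshold=0.4):
    """Enlist an utterance in every cluster whose probability clears the threshold."""
    clusters = {}
    for utterance, probs in posteriors.items():
        for cluster_id, prob in enumerate(probs):
            if prob >= threshold:
                clusters.setdefault(cluster_id, []).append(utterance)
    return clusters


def label_clusters(clusters, predicted_dimension):
    """Give each cluster its majority dimension and a homogeneity score."""
    labels = {}
    for cluster_id, utterances in clusters.items():
        votes = Counter(predicted_dimension[u] for u in utterances)
        dimension, count = votes.most_common(1)[0]
        labels[cluster_id] = (dimension, count / len(utterances))
    return labels


clusters = assign_clusters(posteriors)
labels = label_clusters(clusters, predicted_dimension)
print(clusters)
print(labels)
```

Here homogeneity is simply the share of the majority dimension in a cluster; with the threshold below 0.5, an ambiguous utterance such as u2 can vote in more than one cluster.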

Given below is a visual representation of a few cluster entities. The system establishes priority among the utterances in the cluster and ranks them. The clusters are also themselves ranked on the basis of homogeneity and diversity.

Fig. 4 : Conversation subspace clustered on the basis of utterance semantics.

The visual representation tool can be used to assist in post-call analytic tasks. For instance, it is possible to inspect the cluster space and cherry-pick signals, or to define new signals and seed the cluster space with them to see how the utterance space converges around them. Depending on whether the seeded signal is defined from the cluster groups or introduced through some other criterion, this becomes a self-supervised task in which the system identifies the utterances that would be entailed in this signal group and ranks them.
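Seeding can be illustrated by ranking utterances by their similarity to a seed signal's embedding. The text does not specify how the ranking is done, so cosine similarity, the 0.5 cut-off, the toy three-dimensional vectors and the function names are all assumptions made for this sketch; real embeddings would come from the utterance embedder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_for_seed(seed_vec, utterance_vecs, min_similarity=0.5):
    """Keep utterances close enough to the seed, most similar first."""
    scored = [(utt, cosine(seed_vec, vec)) for utt, vec in utterance_vecs.items()]
    kept = [(utt, score) for utt, score in scored if score >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for real utterance vectors.
utterance_vecs = {
    "Do you have any discount right now?": [0.9, 0.1, 0.0],
    "Is there a festive offer?": [0.8, 0.3, 0.1],
    "Am I speaking with Shubham?": [0.0, 0.1, 0.9],
}
# Hypothetical embedding of a newly defined seed signal about offers.
seed = [1.0, 0.2, 0.0]

ranked = rank_for_seed(seed, utterance_vecs)
for utterance, score in ranked:
    print(f"{score:.3f}  {utterance}")
```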