Speech analytics has become a hot topic recently, but its history dates back to 1952, when basic speech recognition first came onto the scene. That year, Bell Laboratories introduced “Audrey”, a device that could recognize a single voice speaking the digits 0 to 9. In 1962, IBM unveiled its “Shoebox” device, which could understand the digits plus six other words (minus, plus, subtotal, total, false, and off). But speech analytics as we know it today didn’t really come to fruition until the early 2000s.
In the early 2000s the global market saw a sharp uptick in call centers and call center activity, and the enterprises behind them understood that the customer experience there was valuable. To improve it, enterprises started to invest in speech analytics as a tool to pinpoint the parts of customer-facing conversations that could be refined. Initially, this involved basic transcription of calls, followed by tagging them into groups by topic. The tags helped people to identify recurring problems and common behavioral patterns during the calls, which in turn helped to re-route calls to the appropriate agents. This technique proved useful for training new agents as well as for improving the customer experience overall.
As the market started to see an increasing return on its investments in speech analytics, the technology itself was driven to higher levels of sophistication. By the mid-2000s, sentiment analysis was added to the mix, allowing enterprises to monitor the emotions people displayed during customer calls. This resulted in new insights into the customer experience, which in turn allowed enterprises to prioritize areas for improvement more easily than before. The success snowballed and fueled further evolution.
This evolution continued on a strong path throughout the 2010s, as we witnessed further revolutionary developments in the field of natural language processing. The accuracy of speech analytics grew increasingly impressive, and today it is fully entrenched in the world of customer experience, covering multiple industries and use cases from sales to fraud prevention.
As we saw with Audrey and Shoebox, speech analytics as a discipline has its roots in voice recognition technology. However, these roots are distant, and today the two are considered related but distinct technologies.
Whereas voice recognition (sometimes referred to as speech recognition) converts speech into text or allows devices to interpret speech, speech analytics analyzes the speech data itself. Voice recognition allows users to interact with machines by using their voice. Examples include the voice assistant on your smartphone, dictation apps that take notes for you, and smart home devices that you activate and instruct by speaking to them. Speech analytics, on the other hand, entails monitoring and analyzing speech to identify patterns of behavior, helping companies to modify their business processes with the goal of enhancing the customer experience.
The end goals of the two technologies are therefore quite different, even though both deal with speech. Voice recognition focuses on converting speech into a new format to make it more usable, whereas speech analytics focuses on gaining insights into customers from their conversations, in order to prioritize the aspects of the customer experience that will improve business outcomes.
As the name suggests, speech analytics involves analyzing speech data, which can come from recordings or from live conversations and utterances. Gartner defines speech analytics as software to “enable real-time and postcontact capture and analysis of service and support experience” (Gartner, Market Guide for Speech Analytics Platforms, 22 March 2023, ID G00781108).
Typically, the speech analyzed comes from recorded or live phone conversations, from voicemails, and from other business interactions such as sales calls and the many varieties of customer service calls. Machine learning algorithms transcribe the speech and then analyze its content based on the users’ requirements. As a rule, the larger the pool of data available to the algorithm, the more accurate the outcomes tend to be.
As well as providing a consolidated overview of the customer’s interactions with the enterprise, speech analytics can reveal relevant patterns in the topics of the conversations, the sentiments of the speakers, the frequency of certain words or phrases (known as “indexing”), and the tonality of the speakers. This allows enterprises to understand customers more deeply and to make data-driven decisions when assessing the overall efficiency of communications. It also allows them to monitor agent performance and compliance, experiment with new tactics, and implement the adjustments needed to improve voice interactions and reach the desired business outcomes.
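To make the idea of tracking word and phrase frequency more concrete, here is a minimal Python sketch. It assumes the calls have already been transcribed to plain text; the function name `phrase_counts`, the sample transcripts, and the tracked phrases are all invented for illustration and do not come from any particular speech analytics product.

```python
import re
from collections import Counter


def phrase_counts(transcripts, phrases):
    """Count how often each tracked phrase occurs across all transcripts."""
    counts = Counter()
    for text in transcripts:
        # Lowercase and strip punctuation so "Refund?" still matches "refund".
        normalized = " ".join(re.findall(r"[a-z0-9']+", text.lower()))
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            counts[phrase] += len(re.findall(pattern, normalized))
    return counts


# Hypothetical transcripts and phrases, for illustration only.
transcripts = [
    "I want to cancel my subscription. The app keeps crashing.",
    "Can I get a refund? I was charged twice.",
    "Please cancel my subscription, and I would like a refund.",
]
phrases = ["cancel my subscription", "refund", "charged twice"]

counts = phrase_counts(transcripts, phrases)
for phrase, n in counts.most_common():
    print(f"{phrase}: {n}")
```

A production system would work on far larger volumes and would usually combine counts like these with topic and sentiment models, but the principle of surfacing the phrases that recur most often is the same.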
The use cases for speech analytics are numerous and span many industries. Here we will look at a few of the more common ones.
Customer service was one of the first use cases for this technology. Speech analytics monitors agent-to-customer interactions either in real time or through “post-call analysis”, meaning that recordings of the interaction are analyzed after the call. Enterprises use this to check that their call agents are performing at their best and complying with regulations. Feedback from speech analytics can help to find new ways to reduce call-handling times, and can identify specific key words or phrases to avoid or include in order to improve call outcomes.
Similarly, quality assurance is a popular use case, since speech analytics technology allows enterprises to evaluate the overall quality of interactions on business calls. Here too, compliance monitoring is valuable: it allows enterprises to evaluate whether agents have followed company policies and internal procedures, as well as laws such as the US Health Insurance Portability and Accountability Act of 1996 (“HIPAA”). It can also be used for training and retraining agents.
Healthcare is a popular use case in its own right. Speech analytics has proven to be a useful tool for monitoring interactions between patients and healthcare agents, identifying patterns that can enhance care procedures, and spotting new opportunities to improve patient outcomes in similar cases in the future. It has also proven popular as a tool for identifying new actions to take in order to improve patient satisfaction.
Security and fraud detection is another use case that is reaping the benefits of speech analytics. Speech analytics can suggest risk rules that enterprises can implement to block, allow, or flag certain words or phrases, with the goal of identifying and blocking fraud threats before they materialize. It can analyze indicators such as keywords or speech patterns and, based on predefined triggers, alert agents in real time to different levels of fraud threat. This empowers agents to handle such threats earlier in the process, which can reduce the risk of fraud considerably.
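As a rough illustration of how predefined triggers could map phrases to threat levels, consider the following Python sketch. Everything in it is an assumption made for the example: the `RiskRule` structure, the three rules, and the level names are invented, and real fraud systems would rely on much richer signals than simple phrase matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskRule:
    phrase: str  # phrase to look for in the transcribed utterance
    level: str   # threat level raised when the phrase is heard


# Hypothetical rules, for illustration only.
RULES = [
    RiskRule("gift card", "medium"),
    RiskRule("wire transfer", "medium"),
    RiskRule("read me the code", "high"),
]

# Ordering used to pick the most severe level among the matches.
LEVEL_ORDER = {"none": 0, "low": 1, "medium": 2, "high": 3}


def assess(utterance, rules=RULES):
    """Return the highest threat level triggered and the phrases that matched."""
    text = utterance.lower()
    matched = [rule for rule in rules if rule.phrase in text]
    level = max((rule.level for rule in matched),
                key=LEVEL_ORDER.get, default="none")
    return level, [rule.phrase for rule in matched]


level, hits = assess("Please read me the code from the gift card")
print(level, hits)
```

In a live setting, a function like `assess` would run on each newly transcribed utterance, and any result above an agreed threshold would raise an alert for the agent handling the call.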
When it comes to commercial use cases, sales and marketing have long been a common use case for speech analytics. Sales agents can use it to identify the most effective messaging and tactics to use during calls with customers, and marketing can pick up on the key words and phrases that lead to the highest conversion rates, so that they can reuse those in their messaging.
Finally, market research is a popular use case for speech analytics, as the technology can be used to gain key insights into the preferences of different markets and demographics. It can identify their basic requirements as well as how frequently they experience different pain points. Speech analytics makes it quick and easy to handle vast amounts of data on these topics, enabling enterprises to make more accurate decisions more quickly.
As speech analytics becomes stronger across new languages and markets, we expect to see it deployed across more and more use cases in the near future.
The benefits of speech analytics are tremendous and this is exactly why implementation continues to extend across new industries and use cases.
Efficiency is increased by automating the entire process, from transcription through to analysis of the speech. This allows companies to make significant savings compared to manual analysis, and the results tend to be more accurate. Quality assurance is also improved, which enhances efficiency and saves money. Speech analytics also enables enterprises to identify which sales tactics are the most impactful and which messages and keywords perform best. This can translate directly into greater success rates and higher revenue for the business.
Performance improves greatly when speech analytics is used as a tool to monitor agent success. It can give agents feedback on their communication style, empowering them to identify points for improvement and, in turn, to improve their overall performance. Compliance is one key area of performance that speech analytics improves in particular. It can be used as a reliable tool to ensure compliance with regulatory requirements, internal policies, and so on, which in turn reduces the risk of lawsuits, fines, and other legal problems.
Finally, speech analytics improves decision making. By providing insights into customer behavior, needs, and preferences, as well as market trends and other key factors, speech analytics enables teams to better understand their customers and their markets, and to make better decisions accordingly. Decisions around the customer experience benefit in particular, as speech analytics directly pinpoints customer pain points, improves call handling (e.g. re-routing of calls), and can help the enterprise to identify areas where more personalized solutions might be preferable for both the customer and the business. This makes decision making more efficient and drives business growth.
Overall, speech analytics provides significant benefits but it’s important to remain aware of the potential drawbacks and to implement appropriate measures to mitigate them. Most drawbacks come down to basic common sense and would apply to technologies in general, such as concerns surrounding data privacy. Speech analytics can involve sensitive data that might arise for instance during conversations with customers. Enterprises are ultimately responsible for ensuring compliance with all relevant data privacy regulations and protection of any sensitive customer information.
Adopting the technology can itself prove problematic, and implementation can be costly. It might entail significant upfront investment in the technology as well as its supporting infrastructure. Integrations with additional systems such as CRMs or marketing automation platforms can be pricey and technically challenging, and may require additional resources, but they could be worth it since they provide a more comprehensive overview of customer interactions. Successful implementation also depends on internal adoption, which means staff will need adequate training and support to incorporate the technology smoothly into their workflows.
Once speech analytics has been adopted, its accuracy is critical for success. Depending on the accent and speech style of the speaker, many speech analytics systems fail to capture or analyze the speech accurately, producing unrepresentative results. Colloquialisms, slang, background noise, stuttering, code-switching, and other common factors can all reduce accuracy. To mitigate this, it’s important to verify that your speech analysis algorithms have been trained on a large quantity of good-quality training data. In their 2022 survey on the state of AI in Europe, StageZero found that 92% of companies struggle to access sufficient data for training their algorithms, so this is no small problem. Furthermore, the context of the speech is often missed entirely, since the algorithms pick up on the speech alone and not on body language, facial expressions, or other nonverbal cues, and this can also affect reliability.
Overall, enterprises should take care not to rely too heavily on speech analytics, but rather to use it as a tool. Human intuition is key, and human judgment is a crucial component of analyzing customer interactions. Speech analytics can be a productive tool for providing valuable insights, but it does not replace the agent, at least not yet. However, the technology is becoming more and more advanced, so there are ethical concerns to stay aware of moving forward. The use of customer data for targeted advertising is one such area requiring serious ethical consideration. Speech analytics has its benefits, but enterprises must take steps to ensure they use it responsibly.
Based on the benefits and drawbacks, and the current market situation, most enterprises should be able to project positive business outcomes from implementing speech analytics successfully. The true value will depend on the costs of implementation, the size of the company, and the industry and use cases, among other factors. It is therefore advisable to conduct a thorough cost-benefit analysis to determine the true value on a case-by-case basis. The potential, however, is clear.
If this piques your curiosity, then please get in touch to discuss your project requirements with us.