Meanwhile, though, many commercial applications requiring speech output still rely on prerecorded system prompts rather than synthetic speech. That reliance is expensive, however, and frequently requires companies to “start from scratch” for any new application. Synthetic alternatives look appealing, but despite significant advancements in speech synthesis over the last two decades, most synthetic voice offerings lack the expressiveness of human voices.
Speechmorphing, Inc. (Speechmorphing) is taking on these challenges in speech synthesis. It creates contextualized and personalized voices for branded and human-like digital customer experiences. The company’s state-of-the-art speech synthesizers can, in fact, simulate speech quite affordably and expressively. In this interview with CIO Applications, Dr. Fathy Yassa, Founder and CEO of Speechmorphing, discusses the company’s premium-quality synthesized voices – affordable whether off-the-shelf or custom-made – and their capacity to express varied styles and tones to support a wide range of clients.
Can you provide us with a brief overview of Speechmorphing and the solution it brings to the table?
Speechmorphing is a worldwide AI speech technology company headquartered in San Jose, with operations in Europe and Israel. We were founded in 2010 to revolutionize human-machine communications by making expressive custom voices widely available. There is high demand for synthetic voices that are individualized, contextualized, brandable, and expressive. Clients need the voices of their choice, quickly and affordably, and they need those voices to perform expressively – in effect, to act.
We leverage deep neural networks for natural language speech synthesis. The patented modular design of our solution and service offering gives the highest level of control at every step of voice production and message delivery. The same modularity also enables voice creation with minimal recording and with unprecedented speed. And just as important, the voices can be given the desired prosody and voice quality, thus enabling natural, human-like text-to-speech in multiple styles (like reportorial or promotional) and many tones (like empathetic or apologetic). All this while creating uniquely convincing digital twins of the original voices!
Because we can create high-quality voices affordably with quick turnarounds, we can broaden the accessibility of custom synthesized speech and speed its proliferation across all applications and companies.
Today’s conversational systems still lack personalization and emotional intelligence. In an era when conversational AI systems can perform sentiment analysis and emotion detection and read customers’ emotional and linguistic cues to determine their attitudes and moods, we should expect systems to respond appropriately, selecting fitting words and expressing them with natural vocal styles, tones, and attitudes. Instead, companies overlook the importance of branded voices and settle for the same off-the-shelf voices that many other companies use.
What’s more, our voices are multilingual out of the box! For example, a voice created in US English can be enabled to speak second or third languages while retaining its recognizable voice color. Actors can be dubbed in their own voices, speaking languages they never learned.
We work closely with our customers, engaging them at every step in the creation of custom voices. We know that every detail counts.
In view of all these differentiators, why should any company settle for generic, off-the-shelf voices?
Could you elaborate on some of the features of your solution?
Speechmorphing simplifies the deployment and management of voice-enabled applications. Users can design their response system without understanding every detail of the process. Once that has been accomplished, system integrators can exploit our APIs, so that users can instantly access Speechmorphing’s public cloud. There, they’ll find all the APIs and tools they need to embed new custom (and branded!) voices into selected apps and IoT devices.
Because our architecture is modular, we can upgrade its elements without breaking or recreating the whole system. For the same reason, we can refine voices – branded, celebrity, and others – by adding new styles, tones, emotions, and attitudes without starting from scratch.
For customer care or enterprise support, we provide the Smorph™ rendering tool, which enables users to easily create a response, tune it for word pronunciation and overall speed, pitch, and loudness, and store it. For more refined tuning of pitch contours, pauses, and tones at the word level, we offer our Template Editor tool – which also enables the smooth handling of variable instantiation (e.g., “Your appointment is confirmed for
Our user-friendly, AI-powered Smorph Voice-on-Demand service can assist businesses in developing unique voices, based on existing or fresh recordings of their brand ambassadors. Our fast implementation shortens time to market. And the affordable price and high quality of Speechmorphing’s voices expand the range of practical applications for synthesized speech.
Could you please share one of your recent customer success stories where you have enabled clients to overcome hurdles and attain desired outcomes with your innovative array of solutions?
To date, we have worked with many enterprises in the area of customer care. Use cases have included modern self-service IVR, branded virtual agents, and multilingual outbound voice messaging for accessibility (beyond text-based messaging). In all of these cases, our contextually appropriate voice styles and tones are helpful in numerous ways. In healthcare, for example, voice responses should be empathetic and reassuring; in banking, by contrast, they should convey trustworthiness.
For sports brands, the intonation of voice responses can be inspiring or empowering. In automobile assistants, warning messages related to low gas or engine issues should be stern. Again, these varied tones, styles, and attitudes modulate custom (branded) voices; and, as mentioned, each voice can be instantly re-deployed in multiple languages. As just one instance, we enabled one Quick Service Restaurant (QSR) to personalize the voice prompts of their digital employee by tuning and perfecting dialogues and interactions. This tuning improved their drive-thru experience and customer engagement, leading to increased drive-thru revenue.
What does the future hold for your organization? Any footprint expansion plans or platform enhancement strategies that you can shed light upon?
Speechmorphing has a bright future, with plans to considerably expand our sales and marketing operations. Our technology has matured: we continue to work on performance while scaling and streamlining the voice creation process. Exciting interface work is also underway: we’re creating tools to help customers in all industries easily select appropriate voices, styles, tones, and attitudes, and we’re refining our much-demanded voice tuning tools for intuitive use by less-expert users.