In this blog post, we discuss NameDropper, an AI-powered app created by our team that aims to reinstate the dignity associated with having one's name pronounced correctly, right from the first time.
What’s in a name?
Your name is, arguably, the most distinctive part of your identity. However, individuals, especially those with non-English names, often face mispronunciations, unintentional name microaggressions, and linguistic racism in various aspects of life. Failing to invest time and effort in learning to pronounce each other's names correctly sends a regrettable message that one person's identity might be deemed less significant than another's. No one should feel compelled to adopt an Anglicized nickname just to relieve others from the responsibility of learning their actual name.
That’s why we created NameDropper, an AI-powered app aiming to reinstate the dignity associated with having one's name pronounced correctly, right from the first time, and consistently so.
The tech that drives the impact: Azure AI and Azure OpenAI
When we first started building the app, we designed the system to allow for user-uploaded audio recordings but soon realized that it was not a scalable solution. NameDropper was born out of the need for a trusted advisor in a global hybrid workplace. We expanded from support for just employee names to any name you type in.
As someone with a name that is often mispronounced, NameDropper is a lifesaver! It provides an opportunity for people to learn how to say my name, saving me from constant corrections and giving me a sense of identity.
- Piergagon Coulibaly, Data & AI Consultant, Valorem Reply
The core functionality of the app is based on two main Azure AI services: Azure AI Speech services and Azure OpenAI.
Azure AI Speech services
One of the lesser-known features of Azure AI’s Text to Speech services is its ability to support a multitude of dialects and locales. This feature allows for a more accurate vocal representation of written text. This dialect and locale support is accomplished through the inclusion of over 110 voices in more than 45 languages and variants. Therefore, a text can be converted to speech not just in a specific language, but also with the accent and intonations unique to the local dialect. This is particularly beneficial in applications requiring global reach or ones that are tailored to a certain linguistic group.
Moreover, Azure AI's ability to recognize and adapt to specific locales means that the service can handle variations in language based on geography, further enhancing its utility and versatility. For example, English text can be rendered in a US, UK, Australian, or Indian accent, among others.
Another feature of Azure AI’s Text to Speech services is Neural Text to Speech (NTTS) technology, which makes the speech sound natural and human-like. This helps reduce language barriers and improves the user experience by providing more natural and engaging interaction.
NameDropper leverages the Text to Speech services' support for numerous dialects and locales, as well as their realistic voice synthesis, to accomplish its objectives. Currently, NameDropper supports 13 different languages, with dialects, to provide more accurate pronunciation and enunciation.
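To make the locale and voice selection concrete, here is a minimal sketch of how a name could be wrapped in SSML (the W3C markup that speech services accept) so that a specific locale and neural voice read it aloud. The `SsmlBuilder` class, its `ForName` method, and the example voice name are our illustrative assumptions, not NameDropper's actual code or configuration.

```csharp
using System;
using System.Xml.Linq;

public static class SsmlBuilder
{
    // Standard W3C namespace for SSML documents.
    private static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Builds an SSML document asking one voice to read a name.
    // `locale` (e.g. "es-ES") and `voiceName` (e.g. "es-ES-ElviraNeural")
    // are hypothetical inputs chosen for illustration.
    public static string ForName(string name, string locale, string voiceName)
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", locale),
            new XElement(Ssml + "voice",
                new XAttribute("name", voiceName),
                name)); // XElement escapes characters such as & and < in the name

        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

A string built this way, for example `SsmlBuilder.ForName("Raul Gonzalez", "es-ES", "es-ES-ElviraNeural")`, could then be handed to the Speech SDK's `SpeechSynthesizer.SpeakSsmlAsync` method. Swapping the locale and voice is all it takes to hear the same name in a different dialect.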
Azure OpenAI
So, Speech was solved – but that was only half of the problem. The most popular request we received from our pilot users was for the addition of phonetic spellings. This would make the app way more accessible. But this is typically a huge challenge, and most libraries are built around the International Phonetic Alphabet (IPA), which is based on the Latin script and isn’t immediately “readable” to everyone.
Through Azure OpenAI, we took a novel approach and used its advanced language capabilities and generative powers to help create phonetic spellings of names based on the chosen dialect. In leveraging Azure OpenAI, we were able to use an incredibly small amount of code to solve this. Here’s how we did it:
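// Assumes the ChatMessage and ChatRole types from the Azure.AI.OpenAI client
// library, plus System.Collections.Generic (the original shows no imports).
private static List<ChatMessage> FewShotLearningMessages()
{
    // System prompt: go through IPA internally, but only ever return
    // a phonetic spelling written in English characters.
    var system = new ChatMessage(ChatRole.System,
        "You translate people's names into phonetic ipa characters, based on the given dialect, then you translate the ipa characters into English character phonetic spellings. NEVER display the phonetic ipa. DO NOT use non-English characters."
    );

    // Few-shot examples: each user turn is "<name>, <dialect> dialect",
    // and each assistant turn is the expected phonetic spelling.
    return new List<ChatMessage>
    {
        system,
        new ChatMessage(ChatRole.User, "Cynthia Michael, English (United States) dialect"),
        new ChatMessage(ChatRole.Assistant, "SIN-thee-uh MY-kuhl"),
        new ChatMessage(ChatRole.User, "Raul Gonzalez, Spanish dialect"),
        new ChatMessage(ChatRole.Assistant, "rah-OOL gohn-THAH-leth")
    };
}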
Instead of using an IPA library, we effectively created our own custom solution via a single prompt and a couple of few-shot examples.
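The few-shot examples only work if the real request follows the same "name, dialect" shape, and the prompt's "no IPA, no non-English characters" rule is easy to check on the way back. The sketch below shows one way to do both. The `PhoneticPrompt` class and its two methods are hypothetical helpers we made up for illustration, not part of NameDropper or of any Azure SDK.

```csharp
using System;
using System.Linq;

public static class PhoneticPrompt
{
    // Formats the user turn the same way as the few-shot examples:
    // "<name>, <dialect> dialect".
    public static string BuildUserMessage(string name, string dialect) =>
        $"{name.Trim()}, {dialect.Trim()} dialect";

    // Returns true when a reply contains only ASCII letters, spaces,
    // hyphens, and apostrophes, i.e. no IPA symbols slipped through.
    public static bool LooksLikeEnglishSpelling(string reply) =>
        !string.IsNullOrWhiteSpace(reply) &&
        reply.All(c => (c < 128 && char.IsLetter(c)) || c == ' ' || c == '-' || c == '\'');
}
```

With a check like this, an app could retry the request or hide the phonetic spelling whenever the model's reply does not pass, rather than showing the user characters they cannot read.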
Putting it together
So, here’s the entire process, end to end:
- The user provides a name to the app.
- The name is sent to the Text to Speech service. At the same time, the name and dialect are sent to Azure OpenAI along with a small payload of instructions for converting the name into a phonetic spelling, taking special note of the dialect as an accent marker. Following those instructions, the language model works from an International Phonetic Alphabet (IPA) transcription of the name and returns a phonetic spelling in plain English characters rather than the IPA itself.
- Once both services have completed execution, the audio from Text to Speech and text from Azure OpenAI are returned.
- The audio is played back to the user and the phonetics are displayed on the screen. This output can be used to teach users the correct pronunciation of the name.
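The steps above can be sketched as a small orchestration method that starts both calls at once and waits for both to finish. This is only an outline under stated assumptions: `PronunciationPipeline`, `PronunciationResult`, and the two delegate parameters are hypothetical names standing in for the real Text to Speech and Azure OpenAI calls.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical container for what the app hands back to the UI.
public sealed record PronunciationResult(byte[] Audio, string PhoneticSpelling);

public static class PronunciationPipeline
{
    // `synthesizeSpeech` and `spellPhonetically` are stand-ins for the
    // Text to Speech and Azure OpenAI calls; both receive (name, dialect).
    public static async Task<PronunciationResult> GetPronunciationAsync(
        string name,
        string dialect,
        Func<string, string, Task<byte[]>> synthesizeSpeech,
        Func<string, string, Task<string>> spellPhonetically)
    {
        // Start both calls before awaiting either, so they run concurrently.
        Task<byte[]> audioTask = synthesizeSpeech(name, dialect);
        Task<string> spellingTask = spellPhonetically(name, dialect);

        // Continue only once both services have completed.
        await Task.WhenAll(audioTask, spellingTask);

        return new PronunciationResult(await audioTask, await spellingTask);
    }
}
```

Passing the two services in as delegates keeps the sketch self-contained and makes it easy to try with stubbed responses before wiring in real clients.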
Responsibly built with care
NameDropper is built around Microsoft’s six core principles of responsible AI:
- Fairness: NameDropper strives to include a diverse range of voices and accents. We plan to continue adding more voices over time, promoting fairness in representation.
- Reliability & Safety: NameDropper prioritizes user safety and strives for consistent reliability. We use enterprise-grade Azure AI services and do not rely on third-party systems that could cause interruptions.
- Privacy: We prioritize user privacy by ensuring no personally identifiable information (PII) is tracked or stored. Only user-approved audio recordings of pronunciations are stored in the database for future retrieval.
- Inclusiveness: This is a key focus for us. However, biases can exist in voice selection and text-to-speech synthesis, particularly regarding gender-neutral voices. Azure Speech services provide the ability to customize the voices, and future updates are expected to address this concern.
- Transparency: NameDropper is transparent with the user, making it clear that the output is an AI-generated pronunciation. This ensures that users are fully aware of and in control of the experience, choosing to accept or ignore the AI recommendation. A separate section on “Bias & Limitations” is included as part of the app.
- Accountability: The software is designed to solve a real-world problem - facilitating correct name pronunciation. Our aim is to strengthen relationships and alleviate any burdens around pronunciation in personal and professional environments. We hold ourselves accountable for delivering on these objectives and are constantly improving upon them.
My middle name, Jose, has different pronunciations in Spanish and Indian. And NameDropper just gets it! It's incredibly simple and easy to use, just like I would ask a friend: 'Hey! How do you say that name?'
- Arun Jose Mathew, Digital Insights Consultant, Valorem Reply
The Road Ahead
Technology is by no means a replacement for genuine interaction, and our hope is that NameDropper will actually encourage closer relationships by reducing the friction that tends to hold us back from asking questions about each other. Shifting from “They’ll correct me if I say it wrong” to “I want to learn how to say their name” is a seemingly small but immensely significant change that conveys care and respect for someone's identity. When used correctly, NameDropper has the potential to deepen our relationships and understanding of one another and our diverse cultures, and we hope you will take advantage of that opportunity.
Looking ahead, we have plans to make NameDropper even more inclusive. Azure Cognitive Services supports 140 languages and has over 400 neural voices. Our goal is to enhance the diversity of these voices, striving for more gender inclusivity and representation of non-binary voices. We aim to provide users with a more personalized experience.
In a multicultural society, names can originate from various cultures, and while humans often excel at spotting and pronouncing them correctly, current AI systems require further improvement in this aspect. We are actively working on these enhancements to elevate NameDropper's capabilities. Our aspiration is to craft a tool that not only accommodates but celebrates the names that mirror our diversity.
If you’d like to learn more about how you can use Azure AI to solve your use cases, reach out to us. Bring us your toughest problems. We specialize in delivering creative solutions.