Building systems that emulate the natural language capabilities of humans from scratch is a long shot. What if we could leverage decades of research in this space and, with simple API calls, add such capabilities to the apps we build? Thanks to conversational language understanding, that is possible today. In this episode, Liji Thomas talks about a cloud-based AI service that enables you to create your own conversational language models using state-of-the-art technology.
Transcript
Alexa, pause.
Voice assistants like Alexa and Siri can understand us. What's interesting is how they understand each of us, because the way I interact with them would be quite different from the way you would. That is the beauty of human language: it is so layered with meaning and emotion that even when a piece of text is syntactically accurate, it will fail in its purpose if the intent is misunderstood.
Building systems that emulate the natural language capabilities of humans from scratch is a long shot. What if we could leverage decades of research in this space and, with simple API calls, add such capabilities to the apps we build? Thanks to conversational language understanding, that is possible today. But first, let me tell you why.
Conversational language understanding is an AI service that predicts the overall meaning, or intent, of a piece of text. Why is this helpful?
Let's say we are interacting with a weather app using natural language text. I can ask the app, “What's the weather like today?”, and the app would give me a response. You could ask for the same thing in a completely different fashion: “Today's weather forecast?” or “Will it be cold?” or “Do I need an umbrella?” The app understands that, irrespective of how each of us asks that question, it needs to provide us with the weather update. Additionally, it could also take into consideration factors like our location or the time of day.
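To make that concrete, here is a minimal sketch of the idea in Python. The intent and entity names here (GetWeather, Location, Time) are hypothetical, invented purely for illustration:

# Many differently worded utterances map onto one intent.
# All names in this sketch are hypothetical, for illustration only.
training_utterances = {
    "GetWeather": [
        "What's the weather like today?",
        "Today's weather forecast?",
        "Will it be cold?",
        "Do I need an umbrella?",
    ],
}

# A trained model resolves a new utterance to that same intent,
# optionally enriched with entities such as location or time:
prediction = {
    "query": "Do I need an umbrella in Seattle tomorrow?",
    "topIntent": "GetWeather",
    "entities": [
        {"category": "Location", "text": "Seattle"},
        {"category": "Time", "text": "tomorrow"},
    ],
}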
Language understanding is fairly complex. This is why Azure Cognitive Services for Language provides a cloud-based AI service that enables you to create your own conversational language models using state-of-the-art technology. Prebuilt, intelligent services can now predict the overall meaning or intent behind a piece of text, from which you can build a response. Let me show you how.
So, let's try out some of the features we discussed in this episode with conversational language understanding. To get started we need an Azure subscription, and once we have that we can go into the Marketplace. We can search for “language,” which is the easiest way to find the Language service, and once we click ‘create’ we can provide details about our resource, including the Language service name, the location, the resource group, and so on. Submit, give it a minute or two, and you will have the Language service up and running.

Two things to note from here are the key and the endpoint: two important parameters we need to invoke the API in code, because they authenticate the API calls. After that we can pass in our text string, see the responses of the API, and parse them, so that's one way of doing it. You can choose a preferred programming language such as C#, Java, or Python, use the SDK support, pull the packages into your code (NuGet packages in the case of C#), and then test out the capabilities of the features (we'll sketch this code path in Python at the end of the walkthrough).

Or, you can use our web portal from Microsoft, called Language Studio. This is a portal that allows you to test all the natural language features that come with Azure Cognitive Services for Language. The one we are particularly interested in today is CLU, conversational language understanding. To start with, we need to create a project. I've created a project here, and within the project we would add intents, utterances, and entities. Let's see how.
So, to add an intent, it's pretty straightforward. You would click ‘add’, type in the intent name, and hit ‘add intent’. Likewise, I've created an intent called Find Session, and we follow the same example as we did earlier in the episode. For each intent, I can select the intent and add my set of utterances as well; I can add any number of utterances here.

Then, for each utterance, you would go ahead and label the entity in that utterance. This is the way the model knows that, say, this is a session name. Before that, we need to create an entity. Here on the side you can add an entity, and you can add any number of entities, depending on your specific use case. I have added ‘session name’, and I could even add ‘speaker name’. Then it's fairly simple: the UI is very intuitive; it lets you select a span in the utterance and mark it as a session name. Let's do that again: you would select ‘Azure Functions’ as the session and label it as an entity. This way your model is now aware of what the intent is, what the utterances are, and what the entities are as well.

Then you would train it: you would create a training job, run the training job, and test your model performance. Once you're happy with that, you would go ahead and deploy the model, and the final step is where you pick your deployment and see whether you are happy with what you have received. I have done the training and deployment, so I am just going to go ahead and test my deployment. We'll pass a sample text there and see the results. The text I used is ‘Would there be a talk on Azure AI?’. It has returned, with a confidence score of nearly 78%, that it has understood the intent to be Find Session. It has also identified the session name, which is ‘Azure AI’.

If I go back to my training data and look at how I labeled it, my test sample, ‘Would there be a talk on Azure AI?’, is not in there. That is the power of natural language understanding: it was still able to recognize ‘Azure AI’ as an entity because I had trained it, so the quality of the training data matters to a large extent. And as you analyze and track your users’ utterances, if you see that the model is not identifying an intent or an entity as expected, you can always go back, retrain your models, test, deploy, and follow that life cycle again. That is what you can do with conversational language understanding: projects are very simple to start within Language Studio, and there is a lot of documentation on docs.microsoft.com if you want to read up and delve deeper into it.
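As promised, here is the code path. The key and the endpoint we noted when creating the resource are what authenticate calls in code. This is a minimal sketch of creating an authenticated client in Python, assuming the azure-ai-language-conversations package; the endpoint and key values are placeholders you would copy from your own resource:

# Minimal sketch, assuming the azure-ai-language-conversations package
# (pip install azure-ai-language-conversations). Placeholder values only.
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.conversations import ConversationAnalysisClient

endpoint = "https://<your-language-resource>.cognitiveservices.azure.com"
key = "<your-key>"  # from the resource's keys and endpoint page

# The key is what authenticates every request this client sends.
client = ConversationAnalysisClient(endpoint, AzureKeyCredential(key))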
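Continuing with that client, here is a sketch of testing the deployment from code instead of from the portal. The project and deployment names are placeholders for whatever you chose in Language Studio, and the fields read out at the end follow the same prediction format the portal test showed: top intent, confidence scores, and entities.

# Sketch of querying a deployed CLU model; project and deployment
# names are placeholders for the ones chosen in Language Studio.
result = client.analyze_conversation(
    task={
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "participantId": "1",
                "text": "Would there be a talk on Azure AI?",
            }
        },
        "parameters": {
            "projectName": "<your-project>",
            "deploymentName": "<your-deployment>",
            "verbose": True,
        },
    }
)

prediction = result["result"]["prediction"]
print(prediction["topIntent"])               # e.g. Find Session
for intent in prediction["intents"]:
    print(intent["category"], intent["confidenceScore"])
for entity in prediction["entities"]:        # e.g. Azure AI as session name
    print(entity["category"], entity["text"])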
From prehistoric times, human beings have interacted with each other through conversation. Then we built machines, powerful ones, but to interact with them, we needed to learn the language of the machines. This is why conversational language understanding is a game changer: it brings a fundamental shift in the paradigm of human-computer interaction, because now you and I can interact with a machine just about the same way we interact with each other.