How To Orchestrate The Three IBM Watson Assistant Skill Types
And Using Each One To Its Strengths
While recently exploring the tight integration between Watson Assistant & Watson Discovery, I came to realize something. The three skill types within Watson Assistant (Dialog, Search & Actions) complement each other well. But these skills need to be orchestrated correctly, with the Dialog skill acting as the backbone of the conversational agent and facilitating disambiguation, auto learning, digression and negating fallback proliferation, and with Search & Actions facilitating flexibility, extensibility and intent deprecation.
IBM Watson Assistant is constituted by two main components: an Assistant and one or more skills. This story is about how to orchestrate multiple skills, and multiple skill types, within an Assistant.
Each of the three skill types has a specific use-case; this can be extended, of course. However, a skill type used out of place can seriously impede the scaling of your chatbot.
Let’s start with the difference between an Assistant and Skills…
The assistant can be seen as the container of the conversational agent.
The assistant also houses the skills, and the assistant also facilitates the connectors to the integration mediums.
The Assistant directs requests down the optimal path for solving a customer problem.
By adding skills, your assistant can provide a direct answer to an in-domain question or reference more generalized search results for more complex requests.
Here are a few key characteristics of an Assistant:
- The assistant integrates to various mediums; Facebook, Slack, Twitter etc.
- The assistant also houses the different skills.
- An assistant can have a single or multiple skills.
- You can also think of skills as different elements representing different parts of an organization.
Watson Assistant makes provision for three types of skills: dialog, actions and search.
The three skill types need to be used not only in the way they were intended to be employed; they can also be used in such a way that they complement each other.
The main skill is the dialog skill. All conversational agents will be anchored by one or more Dialog Skills.
The dialog skill allows for defining intents and entities; conversations are defined by a dialog tree.
A graphic dialog editor is available and scripting can also be used. The response dialog is also defined here.
An Actions skill is really not intended to be used as a standalone skill. The actions skill is a quick way to augment and build extensions to an existing dialog skill.
If you add both a dialog skill and an actions skill to your assistant, the dialog skill is used. You can configure your dialog skill to process individual actions from your actions skill.
When no response is available from a dialog or actions skill, the conversation can default to a Search Skill, which searches a body of data and retrieves a portion of information to present to the user.
Using Skill Types In The Right Way
In the following three sections we will dive into how these three skills should be used.
The best way to understand the specific implementation strategy is to create practical examples.
A Simple Approach Using IBM Watson Assistant
This example looks at creating a dialog skill which can handle multiple user intents.
Like all cloud based chatbot development environments, with Watson Assistant you can create a list of expected user intents.
These intents are categories to manage the conversation. Think of intents as the intention of the user contacting your chatbot. Intents can also be seen as verbs: the action the user wants to have performed.
Hence the user utterance needs to be assigned to one of these predefined intents. You can think of this as the domain of the chatbot. Below you can see an example of a list of intents defined and a list of user examples per intent.
Typically the user utterance is tagged with one of these intents, even if what the user says stretches over two or more intents.
Most chatbots will take the intent with the highest score and take the conversation down that avenue.
Already here you should see the problem when a user utters two intents in one sentence.
Take “Switch the lights on and turn the music down.” Most chatbots will settle on one of the two intents in this sentence.
Intents are scored in most cases with a decimal value.
This decimal value represents your assistant’s confidence in the recognized intents.
From the example here you can see that the meeting intent is 81% and the time intent is 79%. These scores are very close, and clearly both intents need to be addressed.
In other cases there might be more, yet most conversational environments will take the highest score to address, leaving the user with no other option than to retype the second intent, hopefully with no other intents this time.
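To make the problem concrete, here is a minimal Python sketch of the "highest score wins" behaviour. The list shape (an `intent` name plus a `confidence` score) mirrors the 81% and 79% example above; the `weather` entry and the function name `top_intent` are assumptions added purely for illustration.

```python
# Hypothetical classifier output. The 0.81 and 0.79 scores come from the
# example above; the low-scoring "weather" intent is made up.
intents = [
    {"intent": "meeting", "confidence": 0.81},
    {"intent": "time", "confidence": 0.79},
    {"intent": "weather", "confidence": 0.12},
]


def top_intent(intents):
    """Return only the highest-scoring intent, as most chatbots do."""
    return max(intents, key=lambda item: item["confidence"])


chosen = top_intent(intents)
# Everything that was not chosen is silently dropped, including "time".
ignored = [item["intent"] for item in intents if item is not chosen]

print(chosen["intent"])  # meeting
print(ignored)           # ['time', 'weather']
```

The `time` intent scored almost as high as `meeting`, yet it ends up in the ignored list, which is exactly why the user has to retype it.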
Dialog Configuration For Multiple Intents
There are simple ways of addressing this problem and helping your chatbot to be more resilient. Here I will show you a simple way of achieving this within the IBM Watson Assistant environment.
The Dialog Structure
I went with the simplest dialog structure possible to create this example. Here you can see some of the conditions within the image. The idea is for the conversation to skip through the initial dialog nodes and evaluate the conditions.
Watson Assistant’s dialog creation and management web environment is powerful and feature rich. It is continuously evolving with new functionality visible every so often.
Setting The Threshold
Within the second node we create the contextual variable named $intents and set it to zero. This we will use to capture all the intents gleaned from the user input.
Later in the dialog, this contextual variable will hold all the intents we capture.
You see we also create a contextual variable called $confidence_threshold.
This is set to 0.5. The idea is to discard intents with a confidence lower than 50%.
This threshold can be tweaked based on the results you achieve within your application.
In general you will see a clearly segregated top grouping and then the rest.
Getting The Intent Values
In the third dialog node we define three more context variables and assign values to them. Firstly we define a variable with the name $intents. Then we use the Value field to enter the following:
"<? intents.filter('intent', 'intent.confidence >= $confidence_threshold') ?>"
To learn more about expression language methods, take a look at IBM’s documentation. We are only keeping the intents with a confidence equal to or greater than the 50% threshold we set.
We are going to extract only the first two intents, as those are the ones we are interested in. For the first intent we define the variable first_intent and for the value we use:
"<? intents.get(0).intent ?>"
This extracts the first intent value from the list of intents. Then we create a context variable with the name second_intent and we assign the second listed intent value:
"<? intents.get(1).intent ?>"
You can see the pattern here, and so you can go down the list. You can also create a loop to go through the list.
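Outside of the dialog editor, the same filter-and-extract logic can be sketched in a few lines of Python. This is only an illustration of the idea, not how Watson Assistant evaluates its expressions; the function names and the `intent_1`, `intent_2` key naming are my own assumptions.

```python
def filter_intents(intents, confidence_threshold=0.5):
    """Keep only intents at or above the confidence threshold."""
    return [item for item in intents if item["confidence"] >= confidence_threshold]


def capture_intents(intents, confidence_threshold=0.5):
    """Loop through the filtered list instead of hard-coding get(0), get(1)."""
    context = {}
    kept = filter_intents(intents, confidence_threshold)
    for position, item in enumerate(kept, start=1):
        # Hypothetical naming scheme: intent_1, intent_2, ...
        context[f"intent_{position}"] = item["intent"]
    return context


# Hypothetical recognised intents, in descending order of confidence.
recognized = [
    {"intent": "meeting", "confidence": 0.81},
    {"intent": "time", "confidence": 0.79},
    {"intent": "weather", "confidence": 0.12},
]

print(capture_intents(recognized))
# {'intent_1': 'meeting', 'intent_2': 'time'}
```

Raising the threshold to 0.8 would leave only the meeting intent, which shows how sensitive the result is to the value you pick.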
Now our values will be captured via context variables within the course of the conversation.
These values can now be used to direct the dialog and support decisions on what is presented to the users.
This is one example of where we create a condition within a dialog and if it recognizes these two intents, the dialog is visited.
This is a mere illustration in the simplest form possible. For a production environment, the best solution would be to handle the intents separately and not in one dialog, thus minimizing the options to make provision for.
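One way to picture "handling the intents separately" is a small dispatch table, where each intent has its own handler and the captured intents are walked one by one. The sketch below is plain Python and every name in it (the handlers, `HANDLERS`, `respond`, the reply texts) is hypothetical.

```python
def handle_meeting():
    return "Let's schedule that meeting."


def handle_time():
    return "What time suits you?"


# One handler per intent, so no node has to cover every combination.
HANDLERS = {
    "meeting": handle_meeting,
    "time": handle_time,
}


def respond(captured_intents):
    """Answer each captured intent in turn and join the replies."""
    replies = [HANDLERS[name]() for name in captured_intents if name in HANDLERS]
    if not replies:
        return "Sorry, I did not understand that."
    return " ".join(replies)


print(respond(["meeting", "time"]))
# Let's schedule that meeting. What time suits you?
```

With this layout, adding a third intent means adding one handler rather than a new node for every possible pairing.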
Testing Our Prototype
Testing our prototype within the test pane shows how, with a multi-intent utterance, the intents are captured as context variables and used within the dialog, thus allowing the bot to respond accordingly.
How To Use Actions
Firstly, Actions should be seen as another type of skill to complement the other two existing skills;
- dialog skills and
- search skills.
Actions must not be seen as a replacement for dialogs.
Secondly, actions can be used as a standalone implementation for very simple applications. Such simple implementations may include customer satisfaction surveys, customer or user registration etc. Short and specific conversations.
Thirdly, and most importantly, actions can be used as a plugin or supporting element to dialog skills.
Of course, your assistant can run 100% on Actions, but this is highly unlikely, or at least not advisable.
The best implementation scenario is where the backbone of your assistant is constituted by one or more dialog skills, and Actions are used to enhance certain functionality within the dialog, much as a search skill does.
This approach can allow business units to develop their own actions, due to the friendly interface. Subsequently, these Actions can then be plugged into a dialog.
This approach is convenient if you have a module which changes on a regular basis, but you want to minimize impact on a complex dialog environment.
Within a dialog node, a specific action that is linked to the same Assistant as this dialog skill can be invoked. The dialog skill is paused until the action is completed.
An action can also be seen as a module which can be used and reused from multiple dialog threads.
When adding actions to a dialog skill, consideration needs to be given to the invocation priority.
If you add only an actions skill to the assistant, the action skill starts the conversation. If you add both a dialog skill and actions skill to an assistant, the dialog skill starts the conversation. And actions are recognized only if you configure the dialog skill to call them.
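The invocation priority described above boils down to two small rules, which can be written out as a Python sketch. This only restates the rules in code form; the function names and parameters are assumptions and do not correspond to any Watson Assistant API.

```python
from typing import Optional


def starting_skill(has_dialog: bool, has_actions: bool) -> Optional[str]:
    """Return which skill type starts the conversation."""
    if has_dialog:
        # The dialog skill always wins when it is present.
        return "dialog"
    if has_actions:
        return "actions"
    return None


def actions_reachable(has_dialog: bool, has_actions: bool,
                      dialog_calls_actions: bool) -> bool:
    """Return True if the actions can be triggered at all."""
    if not has_actions:
        return False
    if has_dialog:
        # With a dialog skill present, actions run only when the dialog calls them.
        return dialog_calls_actions
    return True


print(starting_skill(has_dialog=True, has_actions=True))   # dialog
print(starting_skill(has_dialog=False, has_actions=True))  # actions
print(actions_reachable(True, True, dialog_calls_actions=False))  # False
```

The last line is the case to watch for: an actions skill that is added but never called from the dialog will simply never be reached.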
Fourthly, if you are looking for a tool to develop prototypes, demos or proof of concepts, Actions can stand you in good stead.
Mention needs to be made of the built-in constrained user input, where options are presented. Creating a more structured input supports the capabilities of Actions.
Disambiguation between Actions within an Actions Skill is possible and can be toggled on or off. This is very handy functionality. It should address intent conflicts to a large extent.
System actions are available and these are bound to grow.
How NOT To Use Actions
It does not seem sensible to build a complete digital assistant / chatbot with actions. Or at least not as a standalone conversational interface. There is this allure of rapid initial progress and having something to show. However, there are a few problems you are bound to encounter.
Conversations within an action are segmented or grouped according to intents. Should there be intent conflicts or overlaps, inconsistencies can be introduced to the chatbot.
Entity management is not as strong within Actions as it is with Dialog skills. Collection of entities with a slot filling approach is fine.
But for more advanced conversations, where entities need to be defined and detected contextually, Actions will not suffice. Compound entities per user utterance will also pose a challenge.
Compound intents, or multiple intents per user utterance, are problematic.
If you are used to implementing conversational digression, actions will not suffice.
- Conversational topics can be addressed in a modular fashion.
- Conversational steps can be dynamically ordered by drag and drop.
- Variable management is easy and conversational from a design perspective.
- Conditions can be set.
- Complexity is masked and simplicity is surfaced.
- Design and Development are combined.
- Integration with current solutions and developed products
- Formatting of conversational presentation.
- If used in isolation scaling impediments will be encountered.
- Still State Machine Approach.
- Linear Design interface.
Search Skills in Watson Assistant with Discovery
Here is a step-by-step implementation of a search skill and adding it to an assistant, using search functionality within the IBM Cloud offering. In this example we are going to make use of IBM Watson Assistant. This in essence will constitute our chatbot.
Added to this we will also make use of IBM Watson Discovery.
In short, Discovery is an IBM Cloud service allowing you to upload data, which becomes a searchable body of data. This is a very convenient and fast way to convert existing data, in virtually any format, into a searchable form.
Documents like PDF, CSV, Word etc. can be uploaded and annotated. Custom Tags can be created for specific annotation. Very conveniently, as you annotate, Discovery uses your annotation to make live predictions in the document. So for this document, I only had to manually annotate about 10 pages. This obviously depends on how standard your document is.
For an organization and a production environment it is advisable to organize the data in a JSON format, preferably with a heading, body and URL reference. This will make the results yielded by Discovery more predictable and tidy in all instances.
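As a rough illustration of such a record, the Python sketch below builds one document with a heading, a body and a URL, and serializes it to JSON. The key names (`heading`, `body`, `url`), the helper `make_document` and the sample values are all assumptions; Discovery does not require these exact field names.

```python
import json


def make_document(heading, body, url):
    """Build one flat record with the three suggested fields."""
    return {"heading": heading, "body": body, "url": url}


# Placeholder content, not taken from any real document.
doc = make_document(
    heading="Example term",
    body="A short definition of the example term.",
    url="https://example.com/glossary/example-term",
)

# One flat JSON object per document keeps the search results predictable.
print(json.dumps(doc, indent=2))
```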
Here, I took the Dictionary of IBM and Computing Terminology (PDF, 313KB) document.
For the following reasons:
- It is not too much data to upload, only 95 pages.
- The format of the document is very structured throughout and the live predictions of the ML function made the process even faster.
Hence it makes for a convenient question and answer model without having to rework the data into a specific format.
Getting Started With Discovery
There is an existing Data Collection in Discovery; Watson Discovery News. But we want to create our own one. So, click on Upload your own data…
The data-upload interface could not be simpler; you can drag and drop your files, or select them.
You can see here the formats for which provision is made: PDF, HTML, JSON, Word, Excel, PowerPoint, PNG, TIFF, JPG and more. The language of the document needs to be defined, and this list is not vast.
In the cases where your documents are in another language, translation will be necessary.
When you upload large files, the process takes quite a while. If the JSON structure is too complex or nested, Discovery fails. So try to simplify your JSON as much as possible.
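One possible way to simplify nested JSON before uploading is to flatten it into a single level of keys. The helper below is a generic Python sketch, not something Discovery provides; the function name `flatten`, the underscore separator and the sample record are assumptions.

```python
def flatten(record, parent_key="", separator="_"):
    """Collapse nested dictionaries into one level of joined keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, separator))
        else:
            # Scalars and lists are kept as they are.
            flat[full_key] = value
    return flat


nested = {
    "heading": "Example term",
    "meta": {"source": {"url": "https://example.com/term"}},
}

print(flatten(nested))
# {'heading': 'Example term', 'meta_source_url': 'https://example.com/term'}
```

Lists are left untouched here, so deeply nested arrays would still need separate treatment.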
Processing of data can take a while; keep an eye on the Errors and warnings to identify any problem areas in your data.
The annotation of data is crucial in having accurate results. This is a manual process, but is simplified by the predictive annotation.
This is a Machine Learning model, which in real-time, learns from your manual annotation and propagates this forward in the document. You will find yourself going from annotating, to review, to just skipping through the pages.
Test Search Your Document In Discovery
Lastly, search your document and test the results. The beauty of this is that you can search your data and documents making use of natural language.
Already in Discovery you can test your data making use of natural language understanding.
Watson Assistant ~ The Chatbot
Now we move to the chatbot portion making use of IBM Watson Assistant (WA).
WA allows for an assistant to be created. Within this assistant one or more skills can exist. Skills can be seen as different components or smaller chatbots which can be combined into one larger assistant.
Within a larger organization, you can have different departments working on different skills and then these skills can be combined into one larger assistant.
These skills can be a dialog, or a search skill.
Adding Search Skill
We have an existing customer care skill in our assistant. And now we are adding this additional search skill.
After adding a name and description to the search skill, the next window loads the Discovery instances available to WA. This can also take a while.
We do not want to create a new collection. However, the fact that we can launch from WA is convenient. But, we choose the collection we created earlier, called Custom Data.
Chatbot Data Presentation
The next window allows you to set the data which will be presented in the:
- and the URL
You can also define a message which will inform the user about the source of the data, and that it is indeed a search result.
Define a message if you could not find any data, or should there be connectivity issues.
It is best practice to be as transparent as possible with a user.
Always announce it is indeed a bot, and not human. Announce when you return search result data which was not directly curated for that particular point in the conversation.
And state when there is a connectivity issue, or when no results are returned.
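The three situations mentioned here (results found, nothing found, connectivity trouble) can be sketched as one small Python function. In Watson Assistant these messages are set in the search skill configuration, so this is only an illustration of the behaviour; the function names, the message texts and the `title`, `body` and `url` keys are all assumptions.

```python
def present_search_results(search):
    """Wrap a search call with transparent messages for the user.

    `search` is a hypothetical callable that returns a list of dicts
    with 'title', 'body' and 'url' keys.
    """
    try:
        results = search()
    except ConnectionError:
        return "I cannot reach the knowledge base right now. Please try again later."
    if not results:
        return "I searched the knowledge base but found nothing relevant."
    top = results[0]
    # Tell the user plainly that this is a search result, not a curated answer.
    return (
        "I searched the knowledge base and found this: "
        f"{top['title']}: {top['body']} ({top['url']})"
    )


def offline_search():
    # Simulates a connectivity problem for the demonstration below.
    raise ConnectionError("search service is unreachable")


print(present_search_results(lambda: []))
print(present_search_results(offline_search))
```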
Watson Assistant Components
Once you have launched WA, there is an option to create an assistant. Define the name of your assistant, and a description. Preview Link we discuss later in this article.
Preview Link allows a preview URL to be created and distributed for previews and testing. Changes to the underlying chatbot are reflected on the preview interface.
Going back to the assistant, you will see there are two skills which constitute this assistant: a Customer Care skill and the search skill.
The idea here is, if the Customer Care skill cannot address the user intent, then WA will automatically fail over to the search skill and yield an answer.
Here is a view of our Customer Care and Search assistant. Watson Assistant will serve the chat session from the Customer Care Sample Skill. If this skill cannot address the query, WA will fail-over to the search skill.
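The fail-over itself is handled by Watson Assistant, but the routing idea is simple enough to sketch in Python. Both skills are stand-ins here: `dialog_skill` and `search_skill` are hypothetical callables, and returning `None` is my own convention for "the dialog skill has no answer".

```python
def answer(user_input, dialog_skill, search_skill):
    """Try the dialog skill first and fall back to search."""
    reply = dialog_skill(user_input)
    if reply is not None:
        return ("dialog", reply)
    # The dialog skill could not address the query, so fail over to search.
    return ("search", search_skill(user_input))


# Stand-in skills with made-up content, for demonstration only.
CURATED = {"opening hours": "We are open from 9 to 5."}


def dialog_skill(text):
    return CURATED.get(text)


def search_skill(text):
    return f"Search results for: {text}"


print(answer("opening hours", dialog_skill, search_skill))
# ('dialog', 'We are open from 9 to 5.')
print(answer("what is a mainframe", dialog_skill, search_skill))
# ('search', 'Search results for: what is a mainframe')
```

Returning the source alongside the reply makes it easy to tell the user when an answer came from search rather than from curated dialog.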
There are numerous options to deploy the assistant, like Facebook Messenger, Slack and Web Chat. For the purposes of illustration, we are going to use the preview link.
There are a few basic configuration options available to the assistant. Toggle the availability of the search skill, inactivity timeout, API Details and naming.
In this preview interface, you can see that the question “Where are your office located?” is addressed by the default customer care skill. The technical questions are addressed by the search skill.
From the examples you should have a good idea of how these three skills can be employed. The search skill can be used in a standalone scenario where you want to create a searchable knowledge base. But this will run into scaling impediments when conversations need to be specific.
Action skills can be used for a quick survey, or slot filling chatbot. The fact that actions are Watson Assistant’s first foray into end-to-end intent-less conversations is exciting. But these skills cannot handle complex dialog configurations, digression, disambiguation, auto learning etc.
Dialog skills should be the backbone of any conversation, augmented and complemented by search and action skills.