How To Add Search To IBM Watson Assistant Using Watson Discovery

When Existing Dialog States Are Not Sufficient


Amongst others, there have been two general notions within the chatbot framework ecosystem.

The first is the deprecation of intents. There are four emerging approaches to the deprecation of intents.

An Example IBM Watson Assistant Search Skill integrated to Watson Discovery.

The second is the deprecation of the state machine. This is necessary to introduce a more flexible conversational flow. The leader in this space is currently Rasa. But there is another way to introduce more flexibility to a state machine driven dialog management environment, where all conversational paths and responses are pre-defined.

This is by introducing a feature where, if there is no intent detected with a high confidence, the dialog can default to search a knowledge base and respond with the result.
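The fallback idea above can be sketched in a few lines of Python. This is a minimal illustration, not how Watson Assistant implements it internally: the `route` function, the `Intent` class, the `fake_search` stand-in and the 0.6 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical threshold; in practice it would be tuned against real conversation logs.
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class Intent:
    name: str
    confidence: float  # classifier score between 0.0 and 1.0


def route(
    intents: List[Intent],
    dialog_responses: Dict[str, str],
    search: Callable[[str], str],
    utterance: str,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> str:
    """Answer from the dialog when an intent is confident enough, else search."""
    best = max(intents, key=lambda i: i.confidence, default=None)
    if best and best.confidence >= threshold and best.name in dialog_responses:
        return dialog_responses[best.name]
    # No confident intent was detected: fall back to the knowledge base.
    return search(utterance)


def fake_search(query: str) -> str:
    """Stand-in for a knowledge base lookup (hypothetical)."""
    return f"Search result for: {query}"


responses = {"greeting": "Hello! How can I help?"}
confident = route([Intent("greeting", 0.92)], responses, fake_search, "hi there")
uncertain = route([Intent("greeting", 0.21)], responses, fake_search, "what is a CPU?")
print(confident)
print(uncertain)
```

The pre-defined dialog keeps handling what it was designed for, and only the low-confidence turns are handed to search.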

This is not unique to any one chatbot framework. NVIDIA Jarvis, which was released recently, has integration examples that use Wikipedia as a searchable knowledge base. Other platforms like MindMeld, Rasa, Microsoft and more make provision for such functionality. Obviously these systems vary in complexity and implementation steps.

In this story we take a look at the ease with which IBM Watson Assistant can be integrated with Watson Discovery, allowing for a search skill.

Starting In Watson Assistant

The basic architecture of Watson Assistant consists of two main parts: skills and an assistant.

The basic components of Watson Assistant.
  • The assistant can integrate to various mediums: Facebook, Slack, Twitter etc.
  • The assistant also houses the different skills.
  • An assistant can have a single or multiple skills.
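The relationship between an assistant, its skills and its integrations can be pictured as a small data model. This is only a conceptual sketch; the class names, field names and the skill name are illustrative and are not taken from the Watson Assistant API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:
    name: str
    skill_type: str  # e.g. "search"; the label is illustrative only


@dataclass
class Assistant:
    name: str
    skills: List[Skill] = field(default_factory=list)       # one or many skills
    integrations: List[str] = field(default_factory=list)   # channels the assistant is exposed on

    def add_skill(self, skill: Skill) -> None:
        """Attach a skill to this assistant."""
        self.skills.append(skill)


assistant = Assistant(name="Test", integrations=["Slack", "Facebook"])
assistant.add_skill(Skill(name="Computer Terms Search", skill_type="search"))
print(assistant)
```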

You can also think of skills as different elements representing different parts of an organization.

Watson Assistant makes provision for three types of skills.

For this story we are going to make use of a single assistant with a single skill, which will be a search skill. In the image below you can see that an empty assistant named Test is displayed. The three available types of skills are presented to select.

Here we can opt for the Search skill, and click on Add search skill.

An Assistant is defined and selected, but no skill is attached to it.

From here, name the search skill something descriptive and add an optional description.

Naming the newly created search skill and adding a description

Here we have our assistant, with a search skill defined. When used in parallel within the same assistant, the three skill types can really complement each other.

The Assistant named Test with the search skill now created.

Choose the Discovery instance to connect to from the dropdown. I only have one instance available, named Discovery-s5 as seen below.

Choosing from a dropdown the discovery instance to connect to.

Now comes the time to create your new collection within Discovery. I named this Discovery collection Computer Terms, and the language of the reference document needs to be specified. Obviously provision is made for all the major languages. Should you have a requirement for a niche language or vernacular, this will be an impediment.

Name your collection and select the language of the documents.

The link between Watson Assistant and Discovery is really seamless, which is very convenient; there is no navigation overhead around the IBM Cloud environment.

Moving on to Discovery…

Data Preparation in IBM Watson Discovery

For this demonstration I did not perform any document annotation. You can refer back to a previous story for the annotation process. The annotation of data is crucial in having accurate results. This is a manual process, but is simplified by the predictive annotation.

This is a machine learning model which, in real time, learns from your manual annotation and propagates it forward in the document. You will find yourself going from annotating, to reviewing, to just skipping through the pages.

Watson Discovery landing page where you can select data sources.

You can see here the formats for which provision is made: PDF, HTML, JSON, Word, Excel, PowerPoint, PNG, TIFF, JPG and more.

The processing of data once a document is uploaded can take a while.

The language of the document needs to be defined, and this list is not extensive.

In the cases where your documents are in another language, translation will be necessary.

Here we choose to upload a static PDF document with computer terms. Once you have uploaded your document, the processing of the data can take a while, depending on the size of the data set.

The Watson Discovery dashboard after the document has been processed.

The next step can be to interrogate your data. Running the common questions a user might ask gives a good indication of how the data will be presented. The Discovery Query Language can be used, but I prefer the natural language query, which is a quick check on how the chatbot will behave.

Query Discovery with Natural Language Understanding
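The same natural language check can be scripted instead of run from the dashboard. The sketch below only builds the request URL and sends nothing. It assumes the Discovery v1 REST query endpoint and its `natural_language_query` parameter; the base URL, the IDs and the version date are placeholders, and newer Discovery instances may expose a different path, so check the API reference for your instance.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url, environment_id, collection_id, question,
                    version="2019-04-30", count=3):
    """Build a natural language query URL (assumes the Discovery v1 path layout)."""
    path = "/v1/environments/{}/collections/{}/query".format(
        quote(environment_id, safe=""),
        quote(collection_id, safe=""),
    )
    params = urlencode({
        "version": version,                  # API version date (assumed value)
        "natural_language_query": question,  # the plain-text question
        "count": count,                      # number of results to return
    })
    return base_url.rstrip("/") + path + "?" + params


# All identifiers below are placeholders, not real credentials or IDs.
url = build_query_url(
    "https://example.invalid/discovery/",
    "my-environment-id",
    "my-collection-id",
    "what is a CPU?",
)
print(url)
```

An authenticated GET against a URL like this should return the matching documents as JSON, which is the same data the search skill draws on.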

Further training and advanced annotation can also be performed. For a production environment the most astute approach would be to pre-format the data, using an established JSON format. A data transformation process can be run to convert the data into the correct format. Randomly uploading data might not yield the best results.
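As a rough illustration of such a transformation step, the sketch below converts a plain-text glossary into one JSON document per term. The `Term: definition` input layout and the `title` and `text` field names are assumptions made for the example; they are not a format prescribed by Discovery.

```python
import json


def glossary_to_documents(raw_text):
    """Turn 'Term: definition' lines into one JSON-ready dict per term."""
    documents = []
    for line in raw_text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not a glossary entry
        term, definition = line.split(":", 1)
        term, definition = term.strip(), definition.strip()
        if term and definition:
            documents.append({"title": term, "text": definition})
    return documents


raw = """
CPU: The central processing unit executes program instructions.
RAM: Random access memory holds data the CPU is working on.
"""
docs = glossary_to_documents(raw)
print(json.dumps(docs, indent=2))
```

Uploading one small document per term, rather than one large PDF, gives each search result a clear title and body.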

Back To Watson Assistant

There are numerous options to deploy the assistant, like Facebook Messenger, Slack and Web Chat. For the purposes of illustration, we are going to use the preview link.

For now we will use the preview link to complete the process by chatting to our Watson Assistant chatbot which will in turn retrieve the data from Watson Discovery.

Two general questions are asked, and a summary is presented for the topic, with an option to see more and a link to the relevant document.
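The response described above (a summary, an option to see more, and a link) can be approximated as follows. This is a hypothetical sketch of the idea, not the search skill's actual rendering logic; the `to_card` function and the `title`, `text` and `url` field names are assumptions.

```python
def to_card(result, max_chars=120):
    """Shape a search result into a short card with an optional 'see more' flag."""
    body = result.get("text", "")
    truncated = len(body) > max_chars
    if truncated:
        # Cut at the last whole word that fits, then mark the cut.
        summary = body[:max_chars].rsplit(" ", 1)[0] + "..."
    else:
        summary = body
    return {
        "title": result.get("title", "Untitled"),
        "summary": summary,
        "url": result.get("url", ""),
        "show_more": truncated,  # True when the full text was shortened
    }


short_card = to_card({"title": "CPU", "text": "Executes instructions.", "url": "doc.pdf"})
long_card = to_card({"title": "RAM", "text": "word " * 50, "url": "doc.pdf"})
print(short_card)
print(long_card["show_more"], len(long_card["summary"]))
```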


This example illustrates how a conventional chatbot can be augmented by a body of searchable data. This can relieve the dependence on a fallback intent, which invariably results in fallback proliferation.

This is especially true in a larger organization, where a large amount of data exists that can be made available to a conversational interface.