Monthly Archives: October 2016

Message shaping

While the cognitive ability of Conversation is what sets it apart from other chat bots, the skills in message shaping can even the odds. It is a common technique used in customer support when you need to give a hard message.

From a Conversation point of view, it allows you to dramatically reduce the work required to build and maintain your conversation, as well as improving customer satisfaction.

So let’s start with what is message shaping. Take this example: You own a pet store chat bot and the user says:

I wish to buy a pet.

You can start with “What kind of pet?”. Here you have left the user’s response too open. For a cognitive system this isn’t, on the face of it, an issue, as a well trained Conversation will handle it well.

But even if it does it still leaves you open to a lot of responses.

  1. “What pets are there?” – Simple answer. Minimal work.
  2. “What pet suits me?” – A process flow to determine this. Medium work.
  3. “I want to buy a panda” – An endangered species. Again possible simple answer.
  4. “I want to buy a puppy” – More details required. Medium work.
  5. “I haven’t thought about it” – Could be simple or complex.
  6. “A pet that is popular for millennials” – Now it starts getting crazy.

You will be driven to insanity trying to cater for every possible response coming back from the user. Even if you get a good response like “I want to buy a puppy” you may need to walk through to and fro, only to find that you don’t have that pet in the store to sell them.

So you can reduce complexity by taking control of the conversation. First you need to examine what the user hasn’t said. They haven’t said what kind of pet. This means they are unsure on what they want to get.

As a pet store owner, you know that certain pets are good for certain living conditions. So you can reduce and control the direction by saying something like:

I see you are maybe unsure about the pet you want. I can recommend a low maintenance pet, which is good for apartments or busy lifestyles. Or a high maintenance pet, which is good for families with a home.

Here you have given two options which dramatically narrow the scope of the next response from the user. It is still possible someone may go off script, but it is unlikely. If they do you can do a conversational repair to force them back into the script.
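As a rough sketch of why this cuts the work down, here is what handling the reply could look like in Python. The function name, the keyword checks and the step names are all hypothetical, and the keyword check only stands in for whatever intent matching you really use. The point is that a shaped prompt leaves two expected branches plus a repair.

```python
# Hypothetical sketch: a shaped prompt offers two options, so the reply
# handler only needs two branches plus a conversational repair.
SHAPED_PROMPT = (
    "I can recommend a low maintenance pet, or a high maintenance pet. "
    "Which would you prefer?"
)


def route_reply(user_text):
    """Return the next step for a reply to SHAPED_PROMPT."""
    text = user_text.lower()
    if "low" in text:
        return "low_maintenance_flow"
    if "high" in text:
        return "high_maintenance_flow"
    # Off script: repair by restating the two options.
    return "repair"
```

Compare that with the six open-ended responses listed earlier, each of which would need its own handling.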

As the flow progresses, you can push the user to pets that you are trying to sell faster.

In doing so, however, it is important to understand that the end user must have free will, even if it is an illusion. For example, if the person wants a puppy, it may be that a certain breed is not available. Rather than saying they can’t have that breed, offer other breeds.

If you give no options to the user, it leads to frustration. Even options which are not what the person wants are still better than no option. In fact, if you shape your messages well you can give two options which lead to the same outcome, and the end user will still feel like they are in control.

Shaping through UI

Another advantage of message shaping is avoiding having to code complex language rules (Intents, Entities, Regex, etc).

For example:


Now you can see that the end user has not supplied all credit card information. You would need to code complex flows to cater for this. The information is clearly visible, and it could become a nightmare to parse to have it anonymised.

To solve all of this you can use the UI to force the user to structure their own data.


Watson Virtual Agent actually does this out of the box for a number of common areas.

Buttons are for apps, not for conversing.

For UI related prompts, try not to overdo it. For structured data it is fine. For buttons it can also be fine, but if you overdo it then it does not feel intelligent to the end user. As it starts to feel more like an application, the users have different expectations of the responses they get back.

Practice makes perfect

Don’t be fooled that this is easy to do. Most developers I have seen work with conversation fall back to looking for a technical solution, when only changing how you speak to the end user will suffice.

Even people working in support can take 6 months to a year to pick up the skills from no experience. Although it can be a bit harder having to do it on the fly versus creating a script.

For more reading on techniques, here is some stuff to get you started.

New System Entities feature

If you have gone into your Conversation service since yesterday, you will find you have a new feature called System Entities. For the moment you only have number, percentage and currency recognition.

For this introduction I am just going to use @sys-number entity to make Watson do simple math.

First let’s train Watson about the math terms we are going to use. For this we are going to use intents.


Why intents and not entities? Hands down, intents need very little training to understand similar terms. Also, as system entities are also entities, they can interfere with any logic you put in. I use the number “42” so as to not bias the classification to a particular number.

Next we go to entities and switch on the @sys-number entity.

Now for dialog, first we want to make sure what the person has said is a valid math question, if not we say we don’t understand. We do it with the following conditional statement.

intents.size() >0 
AND intents[0].confidence < 0.30

This will ensure that the system only responds if it is confident to do so. Next we put another node in which checks to see if the system is unsure.

intents.size() >0 
AND intents[0].confidence < 0.60
AND entities.size() == 2

Now you will notice we are using entities.size(). This is because the numbers are entities, and @sys-number doesn’t have the size() method. We want to make sure that the end user typed in two numbers before continuing.
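If it helps to see the two conditions outside the dialog editor, here is a minimal Python sketch of the same checks run against a response dictionary. It assumes the response carries an "intents" list ordered best first (each item with a "confidence") and an "entities" list. The function name and the returned labels are hypothetical.

```python
def pick_node(response):
    """Mirror the two dialog conditions above on a response dictionary."""
    intents = response.get("intents", [])
    entities = response.get("entities", [])
    # First node: top intent is too weak, so say we don't understand.
    if len(intents) > 0 and intents[0]["confidence"] < 0.30:
        return "not_understood"
    # Second node: the system is unsure, but exactly two entities were found.
    if len(intents) > 0 and intents[0]["confidence"] < 0.60 and len(entities) == 2:
        return "unsure"
    # Neither condition matched, so carry on to the action nodes.
    return "continue"
```

The order matters in the same way it does in the dialog tree: the first matching check wins.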

Now that we have done that, the procedure is more or less the same for each action, so here is just the addition.

The answer is <? @sys-number[0].numeric_value + @sys-number[1].numeric_value ?>

This takes the first and second numeric value and adds them. While conversation will recognise numbers from text and take action on them, it won’t always do this. So we have to use the numeric_value attribute.
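For comparison, the same "one action per intent" idea can be sketched at the application layer in Python. The intent names here are hypothetical (use whatever you trained), and the numbers are assumed to be already pulled out as numeric values.

```python
import operator

# Hypothetical intent names mapped to the operation each one performs.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}


def answer(intent, first, second):
    """Apply the operation for the intent to the two numbers, in the order given."""
    result = OPERATIONS[intent](first, second)
    return "The answer is {}".format(result)
```

Note that the numbers are used in the order they arrive, which is exactly the limitation discussed next.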

While mostly fine, there are issues you won’t be able to easily cater for. For example, the importance of the numbers’ location.

Take, for example, these two questions, which ask the same thing but will give very different answers.

  • What is ten divided by two?
  • Using the number two divide up the number ten.

One way to solve this is just to create a second division intent which knows the numbers are reversed, but more likely you can solve this with message shaping.

You will find a similar issue though when you start to use the other system entities. For example if you have @sys-percentage active, then “20%” is not only a percent, but it is also a number. This makes it trickier when trying to read the entity or @sys-number stack.
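One possible way around the overlap, if you handle it at the application layer, is to drop any number that sits inside a percentage. This Python sketch assumes each returned entity has an "entity" name, a "location" of [start, end] character offsets, and a "metadata" dictionary holding "numeric_value". Treat those field names as assumptions and check them against what your service actually returns.

```python
def plain_numbers(entities):
    """Return numeric values of sys-number entities that are not also percentages."""
    percent_spans = {
        tuple(e["location"]) for e in entities if e["entity"] == "sys-percentage"
    }
    values = []
    for e in entities:
        if e["entity"] != "sys-number":
            continue
        start, end = e["location"]
        # Skip a number whose span sits inside a percentage span ("20" in "20%").
        inside = any(ps <= start and end <= pe for ps, pe in percent_spans)
        if not inside:
            values.append(e["metadata"]["numeric_value"])
    return values
```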

For what comes back from conversation, you will see that the entity structure has changed.


Now you have a metadata field which you can get the numeric value from.

As always, here is a sample conversation script.

The road to good intentions.

So let’s talk about intents. The documentation is not bad in explaining what an intent is, but doesn’t really go into its strengths, or the best means to collect them.

First, the important thing to understand about intents: how Watson perceives the world is defined by its intents. If you ask Watson a question, it can only understand it in relation to the intents. It cannot answer a question where it has not been trained on the context.

So for example, “I want to get a fishing license” may work for what you trained, but “I want to get driving license” may give you the same response, simply because it closely matches, even though it falls outside of what your application is intended for.

So it is just as important to understand what is out of scope but may still need an answer.

Getting your questions for training.

The strength of intents is the ability to map your customers’ language to your domain language. I can’t stress this enough. While Watson can be quite intelligent in understanding terms with its training, what matters is making those connections from language which does not directly relate to your domain.

This is where you can get the best results. So it is important to collect questions in the voice of your end-user.

The “voice” can also mean where and how the question was asked. How someone asks the question on the phone can be different to instant messaging. How you should capture those questions depends on how you plan to create your application.

When collecting, make sure you do not accidentally bias the results. For example, if you have a subject matter expert collecting, you will find they will unconsciously change the question when writing it. Likewise, if you collect questions from surveys, try to avoid asking questions which will bias the results. Take these two examples.

  • “Ask questions relating to school timetables”
  • “You just arrived on campus, and you don’t know where or what to do next.”

The first one will generate a very narrow scope of test questions related to your application, and not what a person would ask when in a situation. The second question is broader, but you may still find that people will say things like “campus”, “where”, “what”.

Which comes first? Questions or Intents?


If you have defined the intents first, you need to get the questions for them. However there is a danger that you are creating more work for yourself than needed.

If you do straight question collection, when you start to cluster into intents you will start to see something like this:


Everything right of the orange line (long tail) does not have enough to train Conversation. Now you could go out and try and find questions for the long tail, but that is the wrong way to approach this.

Focus on the left side (fat head). This is the most common stuff people will ask. It will also allow you to work on a very well polished user experience which most users will hit.
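If your collected questions are already clustered, finding the fat head is just a count per intent. Here is a minimal Python sketch, assuming the questions are held as (question, intent) pairs. The function name and the cut-off of five are my own illustration, not a Conversation requirement.

```python
from collections import Counter


def split_head_and_tail(labelled_questions, min_examples=5):
    """Split intents into those with enough questions to train and the rest.

    labelled_questions is a list of (question, intent) pairs.
    min_examples is an assumed cut-off, not a Conversation requirement.
    """
    counts = Counter(intent for _, intent in labelled_questions)
    head = {i: n for i, n in counts.items() if n >= min_examples}
    tail = {i: n for i, n in counts.items() if n < min_examples}
    return head, tail
```

Sorting the head by count gives you the order in which to polish the user experience.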

The long tail still needs to be addressed, and if you have a full flat line then you need to look at a different solution. For example Retrieve & Rank. There is an example that uses both.

Manufacturing Intent

Now creating manufactured questions is always a bad thing, although there may be instances where you have no choice. If so, it has to be done carefully. Watson is pretty intelligent when it comes to understanding the cluster of questions. But the user who creates those questions may not speak in the way of the customer (even if they believe they do).

Take these examples:

  • What is the status of my PMR?
  • Can you give me an update on my PMR?
  • What is happening with my PMR?
  • What is the latest update of my PMR?
  • I want to know the status of my PMR.

Straight away you can see “PMR”, which is a common term for an SME but may not be for the end-user. Nowhere does it mention what a PMR is. You can also see “update” and “status” repeated, which is unlikely to be an issue for Watson but doesn’t really create much variance.

Test, Test, Test!

Just like a human that you teach, you need to test to make sure they understood the material.

Get real world data!

After you have clustered all your questions, take out a random 10%-20% (depending on how many you have). You set these aside and don’t look at the contents. This is normally called a “Blind Test”.

Run it against what you have trained on and get the results. These should give you an indicator of how it reacts in the real world*. Even if the results are bad, do not look as to why.
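Pulling out the blind set can be sketched in a few lines of Python. The function name and the seed parameter are hypothetical, and the fraction is whatever you settle on within the 10%-20% range.

```python
import random


def hold_out(questions, fraction=0.1, seed=None):
    """Randomly remove a fraction of questions; return (remaining, held_out)."""
    rng = random.Random(seed)
    shuffled = list(questions)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * fraction))
    return shuffled[cut:], shuffled[:cut]
```

Train only on the remaining questions, and keep the held out ones somewhere you won’t be tempted to read them.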

Instead you can create one or more of the following tests to see where things are going weird.

Test Set : Similar to the blind test, you remove 10%-20% and use that to test (don’t add back until you get more questions). You should get pretty close results to your blind test. You can examine the results to see why it’s not performing. The problem with the test set is that you are reducing the size of training set, so if you have a low number of questions to begin with, then the next two tests help.

K-fold cross validation : You split your training set into random segments (K). Use one set to test and the rest to train. You then work your way through all of them. This method will test everything, but will be extremely time-consuming. Also you need to pick a good size for K so that you can test correctly.
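A minimal Python sketch of the splitting step, assuming your questions are in a list. The training and testing against Conversation is left out, and the function name is hypothetical.

```python
import random


def k_fold_splits(questions, k=5, seed=None):
    """Yield (train, test) pairs so that every question is tested exactly once."""
    rng = random.Random(seed)
    shuffled = list(questions)
    rng.shuffle(shuffled)
    # Deal the shuffled questions into k folds, like dealing cards.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [q for j, fold in enumerate(folds) if j != i for q in fold]
        yield train, test
```

Each of the K rounds means a full retrain, which is where the time goes.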

Monte Carlo cross validation : In this instance you take out a random 10%-20% (depending on train set size) and test against it. Normally run this test at least 3 times and take the average. Quicker to test. I have a sample python script which can help you here.
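To show the shape of it (this is my own minimal sketch, not the sample script mentioned above), the loop looks something like the following. The evaluate function is hypothetical: you supply it, and it is assumed to train on one list, test on the other and return accuracy between 0 and 1.

```python
import random


def monte_carlo_accuracy(questions, evaluate, runs=3, fraction=0.2, seed=None):
    """Average accuracy over several random train/test splits.

    evaluate(train, test) is a caller-supplied function (hypothetical here)
    that trains on `train`, tests on `test` and returns accuracy from 0 to 1.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        shuffled = list(questions)
        rng.shuffle(shuffled)
        cut = int(round(len(shuffled) * fraction))
        test, train = shuffled[:cut], shuffled[cut:]
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)
```

Unlike K-fold, the random splits can overlap, so some questions may be tested more than once and others not at all.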

* If your questions were manufactured, then you are going to have a problem testing how well the system is going to perform in real life!

I got the results. Now what?

First check your results of your blind test vs whatever test you did above. They should fall within 5% of each other. If not then your system is not correctly trained.
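As a small Python sketch of that check, assuming “within 5%” means five percentage points of accuracy (my reading) and that each result is an (expected, predicted) pair. Both function names are hypothetical.

```python
def accuracy(results):
    """results is a list of (expected_intent, predicted_intent) pairs."""
    correct = sum(1 for expected, predicted in results if expected == predicted)
    return correct / len(results)


def within_tolerance(blind_accuracy, test_accuracy, tolerance=0.05):
    """True when the two accuracies differ by no more than the tolerance."""
    return abs(blind_accuracy - test_accuracy) <= tolerance
```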

If this is the case, you need to look at the wrong questions cluster, and also the clusters that got the wrong answer. You need to factor in the confidence of the system as well. You should look for patterns that explain why it picked the wrong answer.
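One way to start looking for those patterns is to count which cluster gets mistaken for which, along with how confident the system was when it got it wrong. A minimal Python sketch, assuming each result is an (expected, predicted, confidence) tuple. The function name is hypothetical.

```python
from collections import defaultdict


def confusion_pairs(results):
    """Summarise wrong answers as (pair, count, average confidence), most common first.

    results is a list of (expected_intent, predicted_intent, confidence) tuples.
    """
    by_pair = defaultdict(list)
    for expected, predicted, confidence in results:
        if expected != predicted:
            by_pair[(expected, predicted)].append(confidence)
    rows = [(pair, len(c), sum(c) / len(c)) for pair, c in by_pair.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

A pair that shows up often with high confidence usually deserves a closer look than one that shows up once with low confidence.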

More on that later.


Building a Conversation interface in minutes.

I come from a Java development background, but since joining Watson I’ve started using Python and love it. 🙂 It’s like it was made for Conversation.

The Conversation test sidebar is handy, but sometimes you need to see the raw data, or certain parts that don’t show up in the side bar.

Creating a Bluemix application can be heavy if you just want to do some testing of your conversation. Python allows you to test with very little code. Here are some easy steps to get you started. I am making an assumption you have

1: If you are using a Mac you have Python already installed. Otherwise you need to download from

2: Install the Watson Developer Cloud SDK. You can also just use Requests, but the SDK will make your life easier.

3: In your conversation service, copy the service credentials as-is (if you are using the latest UI). If it doesn’t look like below, you may need to alter it.


4: Go to your conversation workspace, and check the details to get your workspace ID. Make a note of that.

5: Download the following code.


The “ctx” part just paste in your service credentials, and update the workspace ID. The version number you can get from the Conversation API documentation.

6: Run the Python code. Assuming you put in the correct details, you can type into the console and get your responses back from conversation. Just type “…” to quit.
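The general shape of such a console loop can be sketched as follows. This is not the downloadable code, just an illustration under some assumptions: send is a function you supply that calls the service and returns the response as a dictionary, the response keeps its text in output.text and its state in context, and quit_word is whatever you choose to type to exit.

```python
def chat_loop(send, quit_word, read=input, write=print):
    """Read lines until quit_word, passing each to send and printing the reply.

    send(text, context) is supplied by the caller and must return the
    response dictionary from the Conversation service.
    """
    context = {}
    while True:
        text = read("> ")
        if text == quit_word:
            break
        response = send(text, context)
        # Carry the returned context into the next turn to keep state.
        context = response.get("context", context)
        for line in response.get("output", {}).get("text", []):
            write(line)
```

With the Watson Developer Cloud SDK, send would wrap the SDK’s message call. I have left that out because the exact signature depends on the SDK version you install.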



Migrating from NLC and Dialog

For those of you who have been using NLC, you may be asking “Why bother to migrate?”. Well one advantage is that you can download the questions you put in.

To migrate is painfully simple. Import your CSV file and you are done. Don’t touch the dialog section. Then for your JSON make sure you have alternate intents enabled.


    {"alternate_intents": true,
     "input": {"text": "Hello World"}}

You will get your intents back in a JSON array. Maximum intents returned will be ten.
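A minimal Python sketch of building that request body and reading the array back. It assumes the field names alternate_intents and input on the way in, and an intents list with intent and confidence on the way out. Both function names are hypothetical.

```python
def build_payload(text, context=None):
    """Build a message body that asks for alternate intents."""
    return {
        "alternate_intents": True,
        "input": {"text": text},
        "context": context or {},
    }


def ranked_intents(response, limit=10):
    """Return (intent, confidence) pairs from a response, best first."""
    intents = sorted(
        response.get("intents", []),
        key=lambda item: item["confidence"],
        reverse=True,
    )
    return [(item["intent"], item["confidence"]) for item in intents[:limit]]
```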


Migrating from Dialog

This is where things get trickier. Dialog and Conversation are very different systems. Dialog had no machine learning, but had extremely complex NLP in it which Conversation does not yet fully mimic.

Here are the areas of Dialog and how they compare.


Folders

Conversation does not have folders in the same way Dialog does. You can create a node with a conditional statement of “false”. Use the output part to name your folder. Tree traversal will skip it (in Dialog it will traverse into it). You can use Continue from to jump into the folder.


Output Node

In Conversation, output and input are part of the same node. You can chain nodes together to construct multiple outputs, similar to Dialog. You do this by setting a continue from to the next node’s output and so on. You can also generate multiple random/sequential responses as Dialog did. You can read more details about this in the advanced output documentation.

Get User Input

This icon is similar to the Get User Input. Conversation however at this time does not have Dynamic Node Resolution (DNR) functionality. So if the tree is traversed it will return to the root once completed, and not back to the last get user input.

Search Node

Conversation currently does not have this functionality. You can mimic it at the application layer by passing back in the ID of a previously visited node to jump back to an earlier part in the tree.

Default Node

At the root level of the tree you have an “Anything else” node that is automatically created. For branches of the tree, you create a node with a condition of “true”. These kinds of default nodes are more important in Conversation, because if you do not hit a conditional node, it will fall back to the root to find the answer.

Input Node

As mentioned earlier input and outputs are merged into one node. Variations that exist in Dialog do not exist in Conversation. To emulate these you can build multiple regular expressions. But get into the habit of using Intents and Entities. Intents use machine learning to match questions never seen before.

Goto Node

Conversation uses “Continue From” which is very similar. I detail how it works in “Understanding how a Conversation flows”.

Profile Check

This is part of the conditional section of the conversational node.


Concepts

Conversation does not have concepts. Intents will learn new terms from what it is trained on. Conversation entities can be used in a similar way to concepts, but get used to using intents.

Function Node

Conversation does not have this functionality as it is stateless.

Random Node

Conversation does not have this functionality, but you can mimic it. First create a folder with the nodes you want to randomly hit. Give each node a conditional against a context variable to see if it matches a certain value. Then in your firing node, create something like the following.

  "output": {
    "text": "Finding random response."
  },
  "context": {
    "random": "<? T(java.lang.Math).random() * 5.0 ?>"
  }

This will give you a decimal number from 0 up to, but not including, 5, which you can check against as follows.


Here is a sample conversation script demonstrating it.
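To illustrate just the arithmetic (this is a Python sketch, not the sample script), a random decimal scaled by the number of nodes can be truncated to pick one of them. The function name is hypothetical, and the rng parameter is only there so the behaviour can be checked with a fixed value.

```python
import random


def pick_random_node(nodes, rng=random):
    """Pick one node name using the same arithmetic as the dialog expression."""
    value = rng.random() * len(nodes)  # float from 0 up to, not including, len(nodes)
    return nodes[int(value)]           # int() truncates to an index 0..len(nodes)-1
```

With five nodes this matches the expression above: the value never reaches 5, so the index never runs off the end of the list.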

Dialog Entities

Entities in Dialog can be quite complex. For example you can have nested entities, concepts and regular expressions, as well as system entities which can recognise dates, locations, time, etc. Conversation doesn’t have this functionality yet.


Constants

Conversation does not have this functionality. You can mimic this by using Entities, but it is not recommended as it will be used as part of the training. Another way is to have the application layer intercept the text and replace out constants.
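A minimal sketch of the application layer approach in Python. The {name} placeholder syntax, the constant names and their values are all made up for illustration. Use whatever convention suits your application.

```python
import re

# Hypothetical constants; the {name} placeholder syntax is also an assumption.
CONSTANTS = {"store_name": "Pet Palace", "support_email": "help@example.com"}


def replace_constants(text, constants=CONSTANTS):
    """Replace {name} placeholders with values, leaving unknown names untouched."""
    def substitute(match):
        return constants.get(match.group(1), match.group(0))
    return re.sub(r"\{(\w+)\}", substitute, text)
```

Leaving unknown placeholders untouched makes a missing constant easy to spot in testing.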


Auto-learn

Conversation does not have auto-learn functionality. You would need to mimic this at the application layer.