Multi-Lingual chat bot with cloud functions.

Bonus post for being away for so long! 🙂 Let’s talk about how to do a multi-lingual Chatbot (MLC).

Each skill is trained on its own individual language. You can mix languages into a single skill, however depending on the language selected the other is treated as either keywords or default language words. This can be handy if only certain language words are commonly used across languages.

For languages like say Korean or Arabic, it gives a limited ability versus say English. For something like English with Spanish, it simply does not work.

There are a number of options to work around this.

Landing point solution

This is where the entry point into your chatbot defines the language to return. By far the easiest solution. In this case you would have for example an English + Spanish website. If the user enters the Spanish website, they get the Spanish skill selected. Likewise with English.

Slightly up from this is having the user select the language at the start of the bot executing. This can help where the anonymous user is forced to one language website, but can’t easily find how to switch.

The downside for this solution is where end users mix language. Somewhat common in certain languages. For example Arabic. They may type in the other language only to get a confused response from the bot.

Preparation work

To show the demo, I first need to create two skills. One in English and one in Spanish. I select the same intents from the catalog to save time.

I also need to create Dialog nodes.. but that is so slow to do by hand! 🙁 No problem. I created a small python script to read the intents and write my dialog nodes for me like so:

Here is the sample script for this demo. Totally unsupported, and unlikely to work with large number of intent. But should get you started if you want to make your own.

Cloud functions to the rescue!

With the two workspaces created for each language we can now create a cloud function to handle the switching. This post won’t go into details on creating a cloud function. I recommend the built in tutorials.

First in the welcome node we will add the following fields.

Field Name Value Details
$host “” Set to where you created your cloud function
$action “workspaceLanguageSwitch” This is the name of the action we will create.
$language “es” The language of the other skill.
$credentials {“user”:”USERNAME”,”password”:”PASSWORD”} The cloud function username + password.
$namespace “ORG_SPACE/actions” The name of your ORG and SPACE.
$workspace_id “…” The workspace ID of the other skill.

Next we create a node directly after the welcome one with a conditional of “!$language_call” (more on that later). We also add action code as follows.

The action code allows us to call to the cloud function that we will create.

The child nodes of this node will either skip if no message comes back, or display the results of the cloud function.

On to the cloud function. We give it the name “workspaceLanguageSwitch”.

This cloud function does the following.

  • Checks that a question was sent in. If not it sends a “” message back.
  • Checks that the language of the question is set to what was requested. For example: In the English skill we check for Spanish (es).
  • If the language is matched, then it sends the question as-is to the other workspace specified. It also sets “$language_call” to true to prevent a loop.
  • Returns the result with confidence and intent.

Once this is created, we can test in the “Try it out” screen for both Spanish and English.

Here is all the sample code to try and recreate yourself.

It’s not all a bed of roses.

This is just a demo. There are a number of issues you need to be aware of you take this route.

  • The demo does a one time shot. So it will only work with Q&A responses. If you plan to use slots or process flows, then you need to add logic to store and maintain the context.
  • Your “Try it out” is free to use. Now that you have a cloud function it will cost money to run tests (all be it very cheap).
  • There is no fault control in the cloud function. So all that would need to be added.
  • Cloud functions must complete execution within 5 seconds. So if your other skill calls out to integration, it can potentially break the call.

Simple Intent Tricks

As usual it’s been a while since I’ve updated the blog, so as to keep it alive here is a quick update. For this post I’m going to show two simple techniques using the intents[] object over the #Intents reference.

As always I recommend that you only implement these if there is evidence it is needed.

Repeating Questions

A common issue with end users is that they don’t always understand the answer they get. It is very common for the user to gloss over what has been shown to them, and they can miss the actual answer they wanted.

More often than not, the user just doesn’t read the answer. 🙂

By default the user asks the question in a slightly different way, gets the same intent and asks again. One of the ways to detect how to work on answers is to look for this pattern in the logs.

Another option is to change up the conversation like a human would. To this end, we will detect if they got the same answer. First we create a $last_intent context variable, and check that like so:

Now if this node doesn’t trigger it’s important to clear the $last_intent

But this node needs to jump back to the branch to continue on. You will notice that I created a dummy node to jump to. This is a good coding convention. You put any further nodes below the jump node. This prevents the jump logic breaking if you add nodes.

Another thing to notice is the Answer General Questions condition logic:


This allows you to group similar context intents into a single node. Normally most people will just pick the related intents for the condition block. So you end up with something like this:

The problem with above is that you will be more prone to making mistakes. While some of you might have spotted the “and” mistake, what is less noticeable is the missing intent #General_Ending. You can spend ages wondering why your message isn’t being displayed despite being in the node. With the one line code earlier, you don’t have to worry about any of this.

Here is the sample skill to play with.

Compound Questions

Next up is compound questions. I discussed this before, but from a code perspective. This example we are going to try doing it from within the skill itself.


  • Watson Assistant Plus already has this feature.
  • Due to how skills work, this example cannot exceed 50 intents. Watson Assistant will disable the dialog logic if it hits the same node 50 times. This is to prevent a possible endless loop.
  • This will likely not work if you have any slots.

If you need it for more than 50 nodes then you will need to do it with code, or WA Plus (which is much easier).

First we start by creating a multi-response node (main condition is anything_else). The first response you should edit in the advanced tab and set as follows:

The condition block is just a way to see if the first intent and second intent are close. There are a number of ways to do this, for example K-Means, or difference in percent rather than scaled. While K-Means is not easily done in WA, this method works but tends to be a bit more sensitive. So play around and see which you prefer.

Once the condition is hit we set a $answer_counter to 1 and $compound_found to true.

If it’s not hit we just set $compound_found to false. We don’t need to worry about the $answer_counter as it will always be 0 when in this position.

For the intent matching in the dialog nodes you cannot use the #Intent shortcuts. Instead you do the following:

    intent[$answer_counter].intent  == "General_Greetings"

If the counter is set to 1, it will respond with the second intent first. Then we decrement that counter and loop through all intents again. You end up with something like this:

Again, here is a sample skill:

What did you say?

A question recently asked was “How can I get Watson Conversation to repeat what I last asked it”? There are a couple of approaches to solve this, and I’d thought I would blog about them. First here is an example of what we are trying to achieve. 

One thing to understand when going forward. Everything you build should be data driven. So while there are valid use cases where this is needed, it doesn’t mean it is needed for every solution you build, unless evidence exists otherwise. 

Approach 1. Context Variable.

In this example we create a context variable at every node where we want the system to respond, like so: 

This works but prevents easily creating variations of the response. On the plus side you can give normal responses, but when the user asks to repeat, it can give a fixed custom response. 

Approach 2. Context variable everything!

Similar to the last approach except rather than creating the context variable in the context area, you build on the fly. So something like so:

This allows you to have custom responses. A disadvantage (all be it minor) is you are increasing the chance of a mistake in your code happening. Each response is adding 4 bytes to your overall skill/workspace size. This means nothing for small workspaces, but when you are enterprise level you need to be careful. 

I’ve attached a sample of above

Approach 3. Application layer.

With this method your application layer keeps a previous snapshot of the answer. Then when a repeat intent is detected, you just return the saved answer. 

I’ve seen some crazy stuff of resending question, modifying node maps, but really this is the simple option. 

Tips for 1st time WKS users.

“Tools are the subtlest of traps”

Most developers learn the dangers of evil wizards, for everyone else it might not be so obvious. In the Watson tooling the purpose is democratize AI, that is to abstract the AI layer from your knowledge worker.

This allows you to utilize the power of AI, without having to search for the mythical person who understands your business and NLP, AI, etc.

While this can remove a lot of the complexity, it can also lead people into a false sense of security that their hand will be held through the whole process.

So I am taking a time out to talk about Watson Knowledge Studio. For those that don’t know what it is. The tool allows you to annotate your domain documents, and surface structured insights from unstructured data. It is extremely powerful and easy to use versus other solutions out there.

The downside is that it is extremely easy to use. So I have a number of different people/companies rush in and create a model that disappoints, or in some cases infuriates. It’s not that this is unique to WKS, only that you can overlook important steps in your workflow.

Now IBM does do 3-4 days training, and there are a number of videos (slightly out of date) that cover some of this. But to help some people starting off, I am going to list the main pitfalls you need to watch out for when doing your first WKS project.

Understand what you need to surface from your data!

Screen Shot 2018-11-25 at 11.48.38 AMThis happens so often with technical people. They look at the tooling, see how easy it is and run off and annotate the world in their documents.

What normally happens in this regard is a very poor model which surfaces information correctly, but that data is meaningless to the business.

Get your business analysts/SMEs from the start.

You need someone to objectively understand what is the business problem you are trying to solve. You need to look at your data sources and determine if you can even surface that information (ie. enough samples to train).

Limit your Types and Relationships on your first pass.

After you have looked at what you want to surface, you need to focus on a small few number of types and relationships. Your BA/SME might have picked 50-100, but generally you should pick in around 20-40. The are reasons for this.

  • Each type/relationship adds more work for your human annotator.
  • Models can build faster.
  • As you work through documents you will find that your needs for types/relationships may change.

Don’t reinvent the wheel.

If you have existing annotators that will work as-is, don’t try and integrate them to your model. They may be part of your business requirement, but all you are doing is adding complexity to your model. You can run a second pass on your finished data to get that information.

Understand when to use rule based versus model based.

The purpose of the AI model is to have it train and understand content it has never seen before. To do this requires a lot of up front work on annotation and training.

Compare this to the rule based model. If you know new terms/phrases may not come up, but the nature of how they may be written changes, then rule based may solve your issue.

Personally the AI model is the better choice if you plan to go with WKS. There is easier tooling for rule based. For example Watson Explorer Studio.

Inter-Annotator Agreement is King.

Screen Shot 2018-11-25 at 11.49.53 AMTwo things to realize before you start annotating.

  • Just because you are an expert in the content, doesn’t mean you are an expert at annotating.
  • The more subject matter experts you have, the less agreement on topics will happen in the real world.

To that end, you need to clearly define your inter-annotator agreement (IAA) so there is no ambiguity or disagreement. Have examples, and also have a single SME as the deciding factor where further disagreements occur.

Not creating a proper IAA can lead to more work to your main SME, and damage your model to the extent of hours/days of wasted work.

Data Wrangling is required.

Most of the work in formatting your data is to keep your human annotator sane. Annotating a document is a mentally exhausting process that normally follows these steps.

  • Read and understand the paragraph.
  • Annotate the paragraph with types and relationships.
  • Read and annotate the co-references.
  • Fix mistakes as you go.

You want to reduce the amount of time to do this for each document, and working set. So if there is information that isn’t required to annotate, remove it. If your document is very small, then join together (with some clear marker of a new document).

You want your document to be to be annotated in 30 minutes or so, and your document set in a day. This will allow you to progress at a reasonable speed, and build frequent models.

On top of this, you should also look at sourcing any dictionaries/terms that can be used to  kickstart the annotation. (which most people do)

Lastly, check to see how your documents are ingested into WKS. For example I’ve seen instances of “word.word”. WKS sees this as a single term, and fixing that annotation can be annoying. It may be you need to do some formatting, or limit these mistakes.

Build your model soon and often.

You can’t really see what you are doing wrong until 2-3 models in. So it is important to build these models as soon as you can.

To that end I would recommend building a model as soon as you have a working set completed. Try to have sets be annotated within 1-2 days max, at least at the start of the project.

You can quickly see where the IAA is lacking, and if you need to change types/relationships or even data. Doing this sooner than later prevents technical debt of fixing the model.

Let the model work for you.

Screen Shot 2018-11-25 at 11.56.29 AMOnce you have gotten 3-4 models created, and you are comfortable with some of the scoring, have it pre-annotate future working sets. It will reduce the mental requirements for the human annotators.

However! If you still have 1-2 of the three areas performing badly, recommend to your human annotator to just delete the poor performing part and redo. For example if Co-Reference doesn’t work well, just delete all co-references and redo. This is considerably faster than trying to manually fix every annotation error.

So I hope this helps those in their first journey into using WKS. Be aware that this is by no means a full tutorial.

Optimizing intents with no K-Fold

So as I blogged about earlier, to help find conflicting training questions in your intents you would normally use K-Fold. There are issues in using this though.

  • It removes some of your training data that can weaken intents.
  • You have to balance number of folds for accuracy over speed.
  • Requires to make multiple workspaces.
  • Once you have created the results, you have to figure them out.

Looking back at an old blog post on compound questions, I realized that this already sees where a question can be confused with different intents.

So the next step was to determine to equate this to intent testing. Removing a question and retraining is not an option. It takes too long, and offers nothing over the existing K-Fold testing.

Sending the question back as-is will always return a 100% match. Every other intent gets a 0.0 score, so you can’t see confusion. But what if you changed the question?

First up, I took the workspace from the compound questions. Despite being a tiny workspace it works quite well. So I had to manufacture some confusion. I did this by taking a question from one intent and pasting it into another (from #ALLOW_VET to #DOG_HEALTH).

  • Can my vet examine the puppies?

Next up for the code we define a “safe word”. This is prepended to any question when talking to Watson Assistant. In this case I selected “SIO”.  What was interesting when testing this is that even if the safe word doesn’t make sense, it can still impact the results on an enterprise level (ie. 1000’s of questions).

We end up with this response:

  • SIO Can my vet examine the puppies?
  • ALLOW_VET  = 0.952616596221924
    DOG_HEALTH = 0.9269126892089845

Great! We can clearly see that these intents are confused with each other. So using the K-Means code from before we can document actual confusion between intents.

I’ve created some Sample Code you can play with to recreate.

Just in case you are not aware, there is also a Watson Assistant premium feature which will do something similar. It’s called “conflict resolution” (CR). Here is a good video explaining the feature.

Compared to that and K-Fold…


  • This sample code is faster and cleaner than K-Fold.
  • Narrows down to an individual question.
  • (with customizations) Can show multiple intent confusion.
  • Unlike K-Fold, later tests only require you to test new & updated intents.


  • Considerably slower than CR. When using CR, it only takes a few seconds to test everything. The sample can take up to a second per question to test.
  • Costs an API call for each question test.
  • Can’t test the model where the question would not be in the training. (What K-Fold does)
  • CR is slightly more accurate in detection. (based on my tests)
  • CR allows you to swap/edit questions within the same UI. The example I posted requires manual work.


Meet the Context Object

Hi I’m the Context object, and today we are going to learn some tips and tricks about me.

Don’t forget me!

For all you people starting with Watson Assistant (WA), you might not know me. In fact nearly everyone forgets about me for the first application they create.

WA is a stateless system. That means without me it cannot understand where the user left off in the conversation. So it’s very important to send the updated version of me back to WA on your next call.

If your conversation is repeating the same thing over and over, then this is probably why.

I belong to one workspaceonelove

It’s possible to create multi-workspace applications. But a common mistake is thinking that I can be used by multiple workspaces. When I move to a new workspace, I will generate a new conversation ID, and my system map will be meaningless to the new workspace (even if a copy).


It’s important to understand each of my internal pieces and what they do. I’ll talk a little bit about the input and output parts later.


This is a unique identifier assigned to mark all messages belonging to the same conversation. If you don’t supply this, or you supply an invalid one then a new one is generated for you. It has no other use except for logging.


This is what WA uses to understand where it is in the current conversation. I won’t go into details, because it’s undocumented. One tip though! If any variable starts with an underscore (example: “_node_output_map”), this means it’s internal structure is fixed. So any hacks you do against this variable will likely fail on any updates.

Context.branch_exited & Context.branch_exited_reason

This will tell you if WA had to leave the branch in order to find a response for the user. It will also explain why it had to exit.


This is a special context variable. Anything you store in this object you can hide from the conversation logging.

Very handy for storing passwords. Just be aware that if you use any of the values elsewhere they can be seen in the logs.

Everything else.

All other objects are created by you! I use a SPeL engine which analyses the variable and change it as I need it. So be careful on what you do. For example:

  • 10 + "10" = 2

You can embed code blocks into the variable values, but regardless of what it returns, it must be wrapped in a string.

  • Will work:  "id_number": "<? 10 + 10 ?>"
  • Won't work:  "id_number": <? 10 + 10 ?>

Code blocks will only work in WA. They are treated as text if you send them from the application.

Proper feeding.sick0811.png

It’s important to not overfeed me. While I can hold a lot of information, it is not good to make me the session storage for your application. There are a few reasons for this.

  1. Causes more network traffic.
  2. Increases logging sizes.
  3. Doesn’t scale very well

To ensure you keep me nice and fit you can do the following.

One time context variable outputs.

If I need to pass a context variable to the application but I don’t want to see it again, then you should put it into the Output object of the response.

Object grouping

If you need to send related variables, group them within an object. For example:

Without Object grouping With Object grouping
"context": {
    "name": "Bob",
    "id": 12345,
    "order_id": 67890

"context": {
    "user": {
        "name": "Bob",
        "id": 12345
    "order_id": 67890

This allows you to drop a large number of context variables with ease when they are no longer needed.

Variable requests

If you have a huge number of potential context variables, then you can use the request model to pull in just the variables you need from the application.

Request (from WA) Response (to WA)
"context": {
  "request": "name,id,order_id"

"context": {
    "name": "Bob",
    "id": 12345
    "order_id": 67890

Call out to cloud functions.cloud0811.png

Abstracting your data calls away from your application layer allows you to slot in and out updates without changing your orchestration layer. You are under a time limit to get the data, but if you can stay within 5 seconds, then this can be a better way to retrieve data and act on it.

Remembering the good times.photo0811.png

As a general rule, if you need to jump around the dialog tree, you should use “Jump to”, “Skip” and “Digressions”. There are approaches to get me to do it for you though.

Use these patterns with caution though, as you are moving conversation logic to the application layer. This can cause tight cohesion and more prone for bugs appearing later on.


This is easy enough. You just take a backup of the system object. Then overwrite the existing system object when you want to revert.

Forced jumps

This is a little tricker. You need to first traverse the tree to each jump location, then store the system objects. You can these use these system objects to jump to those areas. I would say use digressions instead if at all possible. This requires having to remap every time there is a change to the workspace.

… and there you have it. Look after me, I’ll make sure everything runs fine!

Making a Statement.

A lot of chat bots focus on answering the users question. Which is great, but it still can make it feel a little bit clinical. You can mitigate this by defining the personality, tone and positioning of the system.

But there are nuances of conversation that can make it feel more human when talking to it.

Take this example:

Screen Shot 2018-08-08 at 5.19.25 PMThe user got an answer, but they were not really asking a question. They were just telling the system something about themselves. People can do this sometimes to initiate a conversation and make a connection.

Now to give it more of a human touch you might want the chat bot to acknowledge the statement before giving the answer.

First let’s understand what is a question. The question mark is the most obvious, but is not always the case. For this example, we will use some sample phrases that denote a possible question, and put them in an entity called @Question.

Screen Shot 2018-08-08 at 5.55.08 PM

You may be wondering is that I haven’t met every criteria to determine if it’s a question. Well you can add your own. 😉  But really worst case scenario, you should err on the side of answering an utterance as a question.

You can also improve question detection by building contextual entities from your intents.

Once you have this done, you can start on the dialog.


(1) Create a folder that looks for the absence of a question entity using !@Question.

(2) For statements you are interested in, you can just look for the same intent. We acknowledge the statement, then continue on to find the answer.

Important! Always be data driven. What I mean is don’t just create statements for every single intent. Just do the ones that are exhibited by your end users, or have a meaning.

(3) It’s not possible to jump to a folder. So the first instinct is to jump to the first dialog node in the folder. This can actually cause problems if you add new intents at the top of the tree. So this dummy node is to force the flow into the next folder naturally.

When we run it this time we get the following:



… And there you have it. Very simple, but will add a more natural feeling. It can also make the user surprised (in a good way) when a chat bot goes slightly off script.

I’ve included a sample workspace you can play with.

Negation Annotation

So another tricky (and often a pain) with intelligent chat bots is the detection of negation.  For example:

Please remove all arugula from my prosciutto Pizza

Knowing what is not wanted in that question is normally quite hard. Contextual entities to the rescue again!

Somewhat different to the previous example, you not only need to train it the toppings but also what are not toppings. So we start off by creating a toppings entity.


We now export that entity, change the CSV File so the entity name is @notoppings, then import it back in.


Next we create our intent #Order_Pizza and annotate what is and isn’t a topping. The reason for this is to prevent it trying to guess a topping that isn’t annotated.


So let’s test our question from earlier. You will notice that I did not add the mentioned ingredients. Nor did I have an example matching how the request is structured.


Pretty cool! 🙂

Although this worked quite well, I could see you are likely to require a couple of similar negation examples so that the contextual entities can train better. I wouldn’t say it is much work, but it is probably something you need to test a bit more to ensure you don’t have edge cases.


Annotate it

So this was an interesting problem that was posed to me. Take the following intent below.


This intent will try to detect where someone is asking to select results by criteria. Next up let’s create the entities based on the intents. I will be using the original method of creating entities. You end up with this.


So let’s test this out…


Oh dear! It is seeing “it” as “IT Department”. This is not good.

Thankfully Watson Assistant just recently got Contextual entities. The new engine is able to understand the nature of what the entity really means, as long as you annotate it.

So going into the intent again, I have selected each word and marked it up like so:


Now let’s test it again.


Now it understands that it is not the IT department. Let’s try again.



It not only worked, but it created a new entity on the fly.

So once you teach it the patterns, it will capture the entities for you. This is currently on by default, but you should be able to toggle soon.

You still have to train it the different patterns you see. For example with the work I have done so far “Filter sales by marketing” will pick up marketing and sales. You would have to build an annotation to show what is the important term in that sentence.

Finally proper intelligence on your entities to augment your intents.

… Edit …
So someone asked what about “IT” as a department? That works too.


Visualising Intents

I’ve always used Pandas for getting an overview of intents, but when you are dealing at the enterprise level ( > 300 intents ), it can be a case of not being able to see the wood for the trees.

Recently I saw a nice mind map visualising intent structures (shout out to Rahul! 🙂 ). It was a manual process and a lot of work put into it.

So I looked to see if we can automate this. XMind to the rescue! There is a Python library that allows you to create through code.

First I start by setting up. You can get the ctx and workspace details from your assistant.

import xmind
from xmind.core import workbook, saver
from xmind.core.markerref import MarkerId
from xmind.core.topic import TopicElement
from watson_developer_cloud import ConversationV1
from urllib.parse import urlparse, parse_qs
import pandas as pd
import os

ctx = {
    "url": "",
    "username": "USERNAME",
    "password": "PASSWORD"

version = '2018-07-10'
workspace = 'WORKSPACE'

xmind_file = 'intents.xmind'

The XMind library will create a file if it doesn’t exist. But if the file already exists, then it adds to it. So we need to delete it before we continue.

if os.path.exists(xmind_file): os.remove(xmind_file)

This next piece of code allows you to capture all the intents directly from the workspace. In a large scale workspace, you will generally have pages of intents, so this handles that.

wa = ConversationV1( username=ctx.get('username'), password=ctx.get('password'), version=version, url=ctx.get('url'))

j = []
x = { 'pagination': 'DUMMY' }
cursor = None
while 'pagination' in x:
    x = wa.list_intents(workspace_id=workspace, export=True,cursor=cursor)
    if 'pagination' in x and 'next_cursor' in x['pagination']:
        cursor = x['pagination']['next_cursor']
        x = {}

recs = []
for i in j: 
    for k in i: 
        record = { 
            'intent': k['intent'],
            'total': len(k['examples'])

df = pd.DataFrame(recs,columns=['intent','total'])
df = df.sort_values(by=['intent'])

This last piece of code takes the dataframe created with the question and intent, then turns it into a MindMap. Each node will display the intent name and how many examples in that intent. For intents >20 it will have a green star, while <10 will have a red star.

I am also using the first word before the underscore as the category.

x = xmind.load(xmind_file)

sheet = x.getPrimarySheet()
sheet.setTitle('Intents Summary')

root = sheet.getRootTopic()

current_id = None
for index, row in df.iterrows():
    id = row['intent'].split('_')[0]
    intent = '{} ({})'.format(row['intent'].replace('{}_'.format(id),''),row['total'])

    if id != current_id:
        topic = root.addSubTopic()
        current_id = id

    item = topic.addSubTopic()

    if row['total'] > 20:
    elif row['total'] < 10:
        item.addMarker(MarkerId.starRed), xmind_file)
print('All done!')

Using the catalog intents as an example (and intentionally modifying/removing some) you end up with something like this:

Screen Shot 2018-07-22 at 22.51.46

You can build a more complex one with the examples as well, but when you are dealing with 1000’s of questions, it gets a little unwieldy.