Negation Annotation

Another tricky (and often painful) problem with intelligent chatbots is detecting negation. For example:

Please remove all arugula from my prosciutto Pizza

Knowing what is not wanted in that question is normally quite hard. Contextual entities to the rescue again!

Somewhat differently from the previous example, you need to teach it not only what the toppings are but also what are not toppings. So we start off by creating a toppings entity.

toppings_2407.png

We now export that entity, change the CSV File so the entity name is @notoppings, then import it back in.
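
If you would rather script the rename than edit the file by hand, here is a minimal sketch. It assumes the exported CSV has no header row and keeps the entity name (without the @) in the first column, followed by the value and any synonyms. The function name rename_entity and the sample rows are made up for illustration, so check them against your own export first.

```python
import csv
import io


def rename_entity(csv_text, old_name, new_name):
    """Return csv_text with old_name swapped for new_name in the first column only."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    for row in csv.reader(io.StringIO(csv_text)):
        # Only touch the entity-name column; values and synonyms stay as they are.
        if row and row[0] == old_name:
            row[0] = new_name
        writer.writerow(row)
    return out.getvalue()


# Hand-written sample rows (assumed layout: entity, value, synonyms...).
sample = "toppings,arugula,rocket\ntoppings,prosciutto\n"
renamed = rename_entity(sample, "toppings", "notoppings")
print(renamed)
```

Matching on the first column only means a topping value that happens to be called "toppings" would not be renamed by accident.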

entities_2407.png

Next we create our intent #Order_Pizza and annotate what is and isn’t a topping. The reason for this is to prevent it trying to guess a topping that isn’t annotated.

intent_2407

So let’s test our question from earlier. You will notice that I did not add the mentioned ingredients. Nor did I have an example matching how the request is structured.

test_2407a

Pretty cool! 🙂

Although this worked quite well, you are likely to need a couple of similar negation examples so that the contextual entities can train better. I wouldn’t say it is much work, but it is probably something you need to test a bit more to ensure you don’t have edge cases.

 

Annotate it

So this was an interesting problem that was posed to me. Take the intent below.

intentlist_2307.png

This intent will try to detect where someone is asking to select results by criteria. Next up let’s create the entities based on the intents. I will be using the original method of creating entities. You end up with this.

entities_2307.png

So let’s test this out…

test1_2307.png

Oh dear! It is seeing “it” as “IT Department”. This is not good.

Thankfully Watson Assistant recently got contextual entities. The new engine is able to understand what the entity really means, as long as you annotate it.

So going into the intent again, I have selected each word and marked it up like so:

annotate_2307

Now let’s test it again.

test2_2307.png

Now it understands that it is not the IT department. Let’s try again.

test3_2307.png

Woah!

It not only worked, but it created a new entity on the fly.

So once you teach it the patterns, it will capture the entities for you. This is currently on by default, but you should be able to toggle it soon.

You still have to train it on the different patterns you see. For example, with the work I have done so far, “Filter sales by marketing” will pick up both marketing and sales. You would have to add an annotation to show which is the important term in that sentence.

Finally proper intelligence on your entities to augment your intents.

… Edit …
Someone asked: what about “IT” as a department? That works too.

samplerun

Visualising Intents

I’ve always used Pandas for getting an overview of intents, but when you are dealing at the enterprise level ( > 300 intents ), it can be a case of not being able to see the wood for the trees.

Recently I saw a nice mind map visualising intent structures (shout out to Rahul! 🙂 ). It was a manual process, with a lot of work put into it.

So I looked to see if we can automate this. XMind to the rescue! There is a Python library that allows you to create mind maps through code.

First I start by setting up. You can get the ctx and workspace details from your assistant.

import xmind
from xmind.core import workbook, saver
from xmind.core.markerref import MarkerId
from xmind.core.topic import TopicElement
from watson_developer_cloud import ConversationV1
from urllib.parse import urlparse, parse_qs
import pandas as pd
import os

ctx = {
    "url": "https://gateway-fra.watsonplatform.net/assistant/api",
    "username": "USERNAME",
    "password": "PASSWORD"
}

version = '2018-07-10'
workspace = 'WORKSPACE'

xmind_file = 'intents.xmind'

The XMind library will create a file if it doesn’t exist. But if the file already exists, then it adds to it. So we need to delete it before we continue.

if os.path.exists(xmind_file): os.remove(xmind_file)

This next piece of code allows you to capture all the intents directly from the workspace. In a large scale workspace, you will generally have pages of intents, so this handles that.

wa = ConversationV1(
    username=ctx.get('username'),
    password=ctx.get('password'),
    version=version,
    url=ctx.get('url'))

# Collect every page of intents. A response only carries a
# 'next_cursor' while there are more pages left to fetch.
j = []
cursor = None
while True:
    x = wa.list_intents(workspace_id=workspace, export=True, cursor=cursor)
    j.append(x['intents'])
    cursor = x.get('pagination', {}).get('next_cursor')
    if not cursor:
        break

recs = []
for i in j: 
    for k in i: 
        record = { 
            'intent': k['intent'],
            'total': len(k['examples'])
        }
        recs.append(record)

df = pd.DataFrame(recs,columns=['intent','total'])
df = df.sort_values(by=['intent'])

This last piece of code takes the dataframe of intent names and example counts and turns it into a mind map. Each node displays the intent name and the number of examples in that intent. Intents with more than 20 examples get a green star, while those with fewer than 10 get a red star.

I am also using the first word before the underscore as the category.

x = xmind.load(xmind_file)

sheet = x.getPrimarySheet()
sheet.setTitle('Intents Summary')

root = sheet.getRootTopic()
root.setTitle('Intents')

current_id = None
for index, row in df.iterrows():
    id = row['intent'].split('_')[0]
    intent = '{} ({})'.format(row['intent'].replace('{}_'.format(id),''),row['total'])

    if id != current_id:
        topic = root.addSubTopic()
        current_id = id
        topic.setTitle(id)

    item = topic.addSubTopic()
    item.setTitle(intent)

    if row['total'] > 20:
        item.addMarker(MarkerId.starGreen)
    elif row['total'] < 10:
        item.addMarker(MarkerId.starRed)

xmind.save(x, xmind_file)
print('All done!')
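
If you want to sanity check the naming and star rules without XMind installed, they can be pulled out into a small helper. This is only a sketch: describe_intent and its high/low parameters are names I made up, and it returns a plain string for the marker rather than a MarkerId. It also uses partition instead of replace, so only the leading category prefix is stripped if the same word turns up again later in the intent name.

```python
def describe_intent(intent, total, high=20, low=10):
    """Return (category, node label, star colour or None) for one intent."""
    # Category is the text before the first underscore; intents without
    # an underscore act as their own category, as in the loop above.
    category, sep, rest = intent.partition('_')
    name = rest if sep else intent
    label = '{} ({})'.format(name, total)

    if total > high:
        marker = 'green'
    elif total < low:
        marker = 'red'
    else:
        marker = None
    return category, label, marker


for intent, total in [('Order_Pizza', 25), ('Order_Drink', 5), ('Help', 15)]:
    print(describe_intent(intent, total))
```

Keeping the thresholds as parameters makes it easy to try different cut-offs before regenerating the map.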

Using the catalog intents as an example (and intentionally modifying/removing some) you end up with something like this:

Screen Shot 2018-07-22 at 22.51.46

You can build a more complex one with the examples as well, but when you are dealing with thousands of questions, it gets a little unwieldy.

What is your name revisited.

As I mentioned in my previous post, Watson Assistant has a system entity called @sys-person, which allows you to capture a person’s name. One issue with this is that it is not available for every language.

In the original post I mentioned using entity extraction. You can still do this, but the cloud functions feature makes this so much easier.

The instructions for doing this are very well documented, so I intentionally skip over bits. Please use this as a reference.

First I created a Cloud function Action with the following code:

import sys
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 import Features, EntitiesOptions, KeywordsOptions

nlu = NaturalLanguageUnderstandingV1(
    version='2017-02-27',
    url='https://gateway-fra.watsonplatform.net/natural-language-understanding/api',
    username='USERNAME',
    password='PASSWORD')


def main(params):
    # params holds the action parameters; 'input' is the user's text
    rsp = nlu.analyze(text=params['input'], features=Features(entities=EntitiesOptions()))

    username = ''
    company = ''

    for entity in rsp['entities']:
        if entity['type'] == 'Person':
            username = entity['text']
        elif entity['type'] == 'Company':
            company = entity['text']

    response = { 
        'name': username, 
        'company': company 
    }

    return response
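
The entity-picking part of that action can be tried out offline, without credentials. The sketch below assumes only what the code above already relies on: each entity is a dict with 'type' and 'text' keys. The function name pick_name_and_company and the sample list are hand-written for illustration, not real NLU output.

```python
def pick_name_and_company(entities):
    """Return the last Person and Company mentions found in an entity list."""
    result = {'name': '', 'company': ''}
    for entity in entities:
        if entity.get('type') == 'Person':
            result['name'] = entity.get('text', '')
        elif entity.get('type') == 'Company':
            result['company'] = entity.get('text', '')
    return result


# Hand-written sample mimicking the shape of rsp['entities'].
sample_entities = [
    {'type': 'Person', 'text': 'Simon'},
    {'type': 'Company', 'text': 'IBM'},
    {'type': 'Location', 'text': 'Dublin'},
]
picked = pick_name_and_company(sample_entities)
print(picked)
```

As with the original loop, if several people or companies are mentioned, the last one wins.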

On the parameters page I set up two parameters, “input” and “language”. The language parameter is there so that other languages can be used where @sys-person may not exist.

On the endpoint page, you need to copy the API key and split it into its name:password parts, as per the instructions link. Keep a note of it.
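
Splitting the key is a one-liner, but here is a small sketch that also catches a badly copied key. It assumes the key really is in name:password form as described above. split_api_key is a made-up helper name, and the 'user'/'password' keys simply mirror the credentials object used in the node JSON.

```python
def split_api_key(api_key):
    """Split a 'name:password' API key into a credentials dict."""
    # partition splits on the first colon only, so a password that
    # itself contains a colon is kept whole.
    user, sep, password = api_key.partition(':')
    if not sep or not user or not password:
        raise ValueError('expected an API key of the form name:password')
    return {'user': user, 'password': password}


mycreds = split_api_key('USERNAME:PASSWORD')
print(mycreds)
```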

Now in Watson Assistant, create a node to trigger the call and give it the following JSON. Replace USERNAME/PASSWORD with the ones from the cloud function. Alternatively, use the proper credentials formatting.

{
    "context": {
        "mycreds": {
            "user": "USERNAME",
            "password": "PASSWORD"
        },
        "nlu_response": ""
    },
    "output": {
        "text": {
            "values": [],
            "selection_policy": "sequential"
        }
    },
    "actions": [
        {
            "name": "simon_test_area/nlu_lookup",
            "type": "server",
            "parameters": {
                "input": "<? input.text ?>",
                "language": "en"
            },
            "credentials": "$mycreds",
            "result_variable": "$nlu_response"
        }
    ]
}

This will execute the cloud function and return the name and company (if they exist). Have this node skip to a child node which will execute the response. For my sample I have:

Name: $nlu_response.name<br>Company: $nlu_response.company

This is what you get back.

Screen Shot 2018-07-22 at 22.25.17

Very simple and very powerful. Combine this with Watson Knowledge Studio and you can build intelligence for your domain.