Converting flowcharts to Actions

I’ve had to present this so often that I finally bit the bullet and created a generic how-to video.

In this video I show how you can quickly create an action from a flowchart, covering the following points:

  • A simple, easy-to-follow flowchart structure.
  • The process of chunking, which lets you easily see the steps.
  • When to use conditions versus falling through to steps.
  • How to present the correct information across multiple steps without having to use session variables.

As always, I have supplied the sample files for the video. The flowchart software is Draw.io.

Enjoy!

Building an Action handler

Revisiting the earlier post on non-blocking options: one issue that comes up is that it can be messy to have multiple actions for a single function. The other is that user input can accidentally trigger such survey events, so you want to prevent that.

So the first part is simple: you just move everything into one single action. You can even have the sub-action be the handler for the responses.

Within the JSON, the label is what is shown to the end user. The text piece is what gets sent silently to Assistant. Swapping out the thumbs for UUIDs allows you to create training data that can never be triggered by the end user. To make it easier, I put a word tag on the end so I remember which is which, like so:
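Something along these lines works (a minimal sketch only; the title text and UUIDs are made up, and the "-up"/"-down" word tags are just so you can tell the values apart):

import uuid

# Generate the hidden values once; paste them into the option JSON below and
# into the sub-action's training data.
thumbs_up_value = f"{uuid.uuid4()}-up"      # e.g. "9f1c2d3a-...-up"
thumbs_down_value = f"{uuid.uuid4()}-down"  # e.g. "2b7a8c1e-...-down"

options_response = {
    "response_type": "option",
    "title": "Was this helpful?",
    "options": [
        # "label" is what the user sees; the value's "text" is sent silently to Assistant
        {"label": "👍", "value": {"input": {"text": thumbs_up_value}}},
        {"label": "👎", "value": {"input": {"text": thumbs_down_value}}},
    ],
}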

Now in the training data of your sub-action you add the UUIDs you created.

Lastly, you want your first steps to look for these values. Those steps should have a condition statement, set using the expression editor, like so:

Once captured, ensure you finish on that step. Finally, switch off “Ask clarifying question” and “Change conversation topic”.

You now have a poor man’s callback. 🙂 I’ve attached the sample skill for you to review.

Time for an update

I keep meaning to get back to this blog, but various things continue to get in the way. So it’s 4am, I’ve nothing to do, and it’s time to put in some updates.

Blockers

The main reasons why I have been so quiet:

  • I am much closer to the magic of how watsonx Assistant is made, so it’s harder to blog about things I know will end up as a feature later (case in point, my earlier blog post).
  • My current work carries more potential commercial value, adding another layer of complexity to what I can publicly share.
  • The latest version of watsonx Assistant is designed for non-technical people to do much cooler stuff more easily than before. Anything useful I do find rarely warrants a whole blog post, so I’ve been posting to Stack Overflow as needed.

Highlights

Earlier in the year I received the IBM Tech 2023 award. It’s awarded to the top talent in IBM. I got a chance to go back to Dubai and meet many exceptional people in IBM from across the world. That event really put into perspective how many moving parts there are in IBM and how many people are excelling in those parts.

More recently I was humbled to receive the Culture Catalyst Award for my work on watsonx.ai and watsonx Assistant. Helping others to build solutions that matter. The award is given to culture leaders and role models who exemplify our purpose, values, and growth behaviours, working to activate our culture and drive growth.

I’ve also helped build a worldwide program focused on helping existing customers fix gaps in their Assistant and migrate more easily to the latest version. Helping to oversee numerous experts in different geos execute this for customers has been very rewarding.

On the personal side, I got to visit Japan for my son. A fantastic country, but I was more impressed with how fluent and relaxed my son was there. His self-study of Japanese and the culture paid off for his first visit.

What now?

I’ll try to add updates as I can. If there is something in particular you want, let me know (or just post on Stack Overflow or TechExchange). I can also be found on Bluesky.

Debugging your extension

When working with extensions in Watson Assistant, the standard UI alone can be cumbersome for deep-dive analysis of why your extension is not working as expected.

You can use the browser’s inspector to look at what is sent and received. You do this by going to the Network tab, selecting “Response”, then filtering by “callout”. Once you get to the line where callout is mentioned, remove the filter and you can see all the parts.

For the video demo below I created a sample extension that pulls jokes from “I Can Haz Dad Joke” via their API. The sample extension is attached.
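If you want to compare what you see in the inspector against a known-good request, the extension is really just calling the icanhazdadjoke.com API. A quick sketch of the same call from Python:

import requests

# icanhazdadjoke.com returns JSON only when asked for it via the Accept header
response = requests.get(
    "https://icanhazdadjoke.com/",
    headers={"Accept": "application/json", "User-Agent": "extension-debugging-example"},
)
print(response.status_code)
print(response.json()["joke"])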

Creating a Quantum Computer Chatbot.

I normally do these small, quick projects to practice technologies I work with, and to keep me a bit sane.

For this fun little project I thought about creating a chatbot that can translate a simple conversation into a format that can be understood by a quantum computer.

The plan is to build a Grover’s algorithm circuit that will determine the best combination of people who like/dislike each other.

The architecture is as follows:

Breaking down each component.

  • iPad App (Swift): Why? Because JavaScript annoys me. 🙂 Actually, creating apps is very easy and Swift is a lovely language. If you haven’t coded in it and want to, I recommend the App Brewery training.
  • Orchestration Layer (Python/Flask): My focus was on speed, and Python has all the modules to easily interact with everything else. Perfect for a backend demo (a minimal sketch of this layer follows the list).
  • Watson Assistant: This handles the human interaction. It also pulls out the logical components and actors mentioned in the conversation.
  • Equation Generator: When the user asks to solve the problem, this translates the Watson Assistant results into an equation that Qiskit can run.
  • Quantum Engine: This is just a helper class I created to build and run the quantum circuit, and then hand the results off to the reporting NLP. Of course, what comes back is all 1s and 0s.
  • Reporting NLP: This takes the result of the quantum computer and converts it into a meaningful report for the human. This is then handed back to the iPad App to render.
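To give an idea of how thin the orchestration layer is, here is a minimal sketch of what such a Flask relay might look like. The /chat route and the ask_assistant helper are hypothetical placeholders for illustration, not the actual project code:

from flask import Flask, jsonify, request

app = Flask(__name__)

def ask_assistant(text):
    # Hypothetical stand-in for the Watson Assistant call: in the real app this
    # calls the Assistant message API and returns the detected intent and reply.
    return {"intent": "solve", "text": "Working on it..."}

@app.route("/chat", methods=["POST"])
def chat():
    text = request.get_json(force=True).get("text", "")
    reply = ask_assistant(text)
    if reply["intent"] == "solve":
        # Hand off to the equation generator / quantum engine / reporting NLP here.
        return jsonify({"type": "report", "report": "placeholder report"})
    return jsonify({"type": "text", "text": reply["text"]})

if __name__ == "__main__":
    app.run(port=5000)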

All this was built and running in a day. It’s not because I’m awesome 😉 but because the technology has moved forward so much that most of the heavy lifting is handled for you.

I’m not going to release the code (if you want some code, why not try Pong, which I wrote over the weekend?). I will go over some of the annoyances that might help others. But first, a demo.

This is a live demo. Nothing is simulated.

Watson Assistant

This was the easiest and most trivial part to set up. Just three intents and one entity. The intents detect if two people should be considered friendly or unfriendly. The names of the two people are picked up by the entity. The last intent just triggers the solve process.

Equation Generator

This is a lot less exciting than it sounds. When sending a formula to Qiskit, it needs to be in a format like so:

((A ^ B) & (C & D) & ~(D & A))

In normal human speech, that is something like “Bob hates Jane, Mike likes Anna, Mike and Bob don’t get on”.

Each single letter has to map to a person mentioned in the conversation. So those mappings have to be tracked, along with the relationships, in order to build the expression.
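As a rough illustration (this is not the project’s code, just a sketch of the bookkeeping), something like the following keeps the name-to-letter legend and the clauses together:

import string

class EquationBuilder:
    def __init__(self):
        self.letters = {}   # person name -> single letter (A, B, C, ...)
        self.clauses = []

    def letter(self, name):
        if name not in self.letters:
            self.letters[name] = string.ascii_uppercase[len(self.letters)]
        return self.letters[name]

    def like(self, a, b):
        self.clauses.append(f"({self.letter(a)} & {self.letter(b)})")

    def dislike(self, a, b):
        self.clauses.append(f"({self.letter(a)} ^ {self.letter(b)})")

    def expression(self):
        return "(" + " & ".join(self.clauses) + ")"

builder = EquationBuilder()
builder.dislike("Bob", "Jane")
builder.like("Mike", "Anna")
builder.dislike("Mike", "Bob")
print(builder.expression())   # ((A ^ B) & (C & D) & (C ^ A))
print(builder.letters)        # the legend: which letter is which person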

Quantum Computing

So Qiskit literally holds your hand for most of this. It’s a fun API. If you want to start learning quantum computing, I strongly recommend “Learn Quantum Computing with Python and IBM Quantum Experience“. It approaches the topic from a developer perspective, making it easier to start working through the math later.

To show how simple it is, Qiskit has a helper class called an Oracle. This is literally all the code to build and run the circuit.

# Qiskit Aqua imports (this uses the older Aqua API, before it was deprecated)
from qiskit import BasicAer
from qiskit.aqua import QuantumInstance
from qiskit.aqua.algorithms import Grover
from qiskit.aqua.components.oracles import LogicalExpressionOracle

# example expression
expression = '((A ^ B) & (C & D) & ~(D & C))'

# build the oracle for the expression (the generated circuit can also be inspected)
oracle = LogicalExpressionOracle(expression)
quantum_circuit = oracle.construct_circuit()

# run Grover's algorithm against the local simulator backend
quantum_computer = 'qasm_simulator'
quantum_instance = QuantumInstance(BasicAer.get_backend(quantum_computer), shots=2048)

grover = Grover(oracle)
result = grover.run(quantum_instance)

What you get back is mostly 1’s and 0’s. You can also generate graphs from the helper class, but they tend to be more for the Quantum Engineer.

Reporting

I used the report generated by Qiskit. But as the results are all 0/1 and backwards, I translate them back out to A, B, C, D… and then add a legend to the report. That was all straightforward.
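The decoding itself is only a few lines. A sketch, assuming the same name-to-letter legend used when building the expression (Qiskit bitstrings are little-endian, hence the reverse):

def decode_measurement(bits, legend):
    # bits: a measured bitstring such as "1011"; legend: {"Bob": "A", ...}
    letters = sorted(legend.values())            # ['A', 'B', 'C', 'D']
    names = {v: k for k, v in legend.items()}    # letter -> person name
    assignment = {}
    for letter, bit in zip(letters, reversed(bits)):
        assignment[names[letter]] = (bit == "1")
    return assignment

legend = {"Bob": "A", "Jane": "B", "Mike": "C", "Anna": "D"}
print(decode_measurement("1011", legend))
# {'Bob': True, 'Jane': True, 'Mike': False, 'Anna': True}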

The tricky bit came in sending the image back to the iPad app. To do this I converted the image to base64 like so (using OpenCV):

import base64
import cv2

def imageToBase64(img):
    # PNG-encode the image, then base64-encode those bytes into a string
    b = base64.b64encode(cv2.imencode('.png', img)[1]).decode()
    return b

On the Swift side of things you can convert the base64 string back to an image like so.

import UIKit

func base64ToImage(_ base64Text: String) -> UIImage? {
    // decode the base64 string back to raw data, then build a UIImage from it
    guard let imageData = Data(base64Encoded: base64Text) else { return nil }
    return UIImage(data: imageData)
}

Getting it to render the image in a UITableView was messy. I ended up creating a custom UITableViewCell. This also allowed me to make it feel more chat-botty.

When I get around to cleaning up the code I’ll release it and link it here.

In closing…

While this was a fun distraction, it’s unlikely to be anything beyond a simple demo. There already exists complex decision optimization that can handle human interaction quite well.

But the field of quantum computing is changing rapidly. So it’s good to get on it early. 🙂

Rendering Intents in a 3D network graph

It’s been a while…

I know some people were asking how to build a network graph of intents/questions. Personally, I’ve never found it all that useful, but I am bored, so I have created some sample code to do this.

The code will convert a CSV intents file to a pandas dataframe, then convert that to networkx graph format. Of course, large graphs can be very messy, like so:

That’s just 10 intents!

So I then converted this to K3D format so that you can view the same network in 3D, like so:
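For anyone who wants the gist without the notebook, here is a minimal sketch of the pipeline, assuming a headerless two-column intents CSV (example text, intent name) as exported from Watson Assistant:

import pandas as pd
import networkx as nx

# column 0 = example question, column 1 = intent name
df = pd.read_csv("intents.csv", header=None, names=["question", "intent"])

# each intent becomes a hub node connected to its example questions
graph = nx.from_pandas_edgelist(df, source="intent", target="question")

# compute x/y/z positions for the 3D view; these positions (plus the edge
# list) are what then get handed to K3D to render the network
positions = nx.spring_layout(graph, dim=3, seed=42)
print(len(graph.nodes), "nodes,", len(graph.edges), "edges")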

Hopefully someone finds it useful. 🙂

Multi-Lingual chat bot with cloud functions.

Bonus post for being away for so long! 🙂 Let’s talk about how to do a multi-lingual Chatbot (MLC).

Each skill is trained on its own individual language. You can mix languages in a single skill; however, depending on the language selected, the other language is treated as either keywords or default-language words. This can be handy if certain words are commonly used across languages.

For languages like, say, Korean or Arabic, this gives a limited ability compared to English. For something like English with Spanish, it simply does not work.

There are a number of options to work around this.

Landing point solution

This is where the entry point into your chatbot defines the language to return. By far the easiest solution. In this case you would have for example an English + Spanish website. If the user enters the Spanish website, they get the Spanish skill selected. Likewise with English.

Slightly up from this is having the user select the language when the bot starts. This can help where an anonymous user is forced onto a one-language website but can’t easily find how to switch.

The downside of this solution is where end users mix languages, which is somewhat common in certain languages, for example Arabic. They may type in the other language only to get a confused response from the bot.

Preparation work

To show the demo, I first need to create two skills. One in English and one in Spanish. I select the same intents from the catalog to save time.

I also need to create dialog nodes… but that is so slow to do by hand! 🙁 No problem. I created a small Python script to read the intents and write my dialog nodes for me, like so:
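The script itself is tiny. A rough equivalent looks like the following (the node fields follow the v1 workspace format and the answers are placeholders; adjust to whatever your export looks like):

import csv
import json

# read a Watson Assistant intents CSV: column 0 = example, column 1 = intent
with open("intents.csv") as f:
    intents = sorted({row[1] for row in csv.reader(f) if row})

# one dialog node per intent, conditioned on that intent
nodes = []
for i, intent in enumerate(intents):
    nodes.append({
        "dialog_node": f"node_{i}",
        "conditions": f"#{intent}",
        "title": intent,
        "output": {"text": {"values": [f"Placeholder answer for {intent}"]}},
    })

print(json.dumps(nodes, indent=2))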

Here is the sample script for this demo. Totally unsupported, and unlikely to work with a large number of intents. But it should get you started if you want to make your own.

Cloud functions to the rescue!

With the two workspaces created for each language, we can now create a cloud function to handle the switching. This post won’t go into details on creating a cloud function; I recommend the built-in tutorials.

First, in the welcome node, we add the following fields:

  • $host: “us-south.functions.cloud.ibm.com” (set to where you created your cloud function)
  • $action: “workspaceLanguageSwitch” (the name of the action we will create)
  • $language: “es” (the language of the other skill)
  • $credentials: {“user”:”USERNAME”,”password”:”PASSWORD”} (the cloud function username + password)
  • $namespace: “ORG_SPACE/actions” (the name of your ORG and SPACE)
  • $workspace_id: “…” (the workspace ID of the other skill)

Next we create a node directly after the welcome one with a conditional of “!$language_call” (more on that later). We also add action code as follows.

The action code allows us to call out to the cloud function that we will create.

The child nodes of this node will either skip if no message comes back, or display the results of the cloud function.

On to the cloud function. We give it the name “workspaceLanguageSwitch”.

This cloud function does the following (a rough sketch of the code follows the list).

  • Checks that a question was sent in. If not, it sends an empty (“”) message back.
  • Checks that the language of the question is set to what was requested. For example: In the English skill we check for Spanish (es).
  • If the language is matched, then it sends the question as-is to the other workspace specified. It also sets “$language_call” to true to prevent a loop.
  • Returns the result with confidence and intent.
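Putting those steps together, the cloud function might look roughly like this. The parameter names, the langdetect library, and the URL/auth details are assumptions for illustration, not the exact code in the attached sample:

import requests
from langdetect import detect   # assumed here for simple language detection

def main(params):
    # IBM Cloud Functions entry point
    question = params.get("question", "")
    if not question:
        return {"message": ""}              # nothing sent in, send "" back

    if detect(question) != params.get("language"):
        return {"message": ""}              # not the other language, let this skill answer

    # forward the question as-is to the other workspace (v1 message API)
    url = (f"{params['assistant_url']}/v1/workspaces/"
           f"{params['workspace_id']}/message?version=2019-02-28")
    payload = {
        "input": {"text": question},
        "context": {"language_call": True},  # prevents the other skill looping back
    }
    response = requests.post(url, json=payload,
                             auth=("apikey", params["apikey"])).json()
    intents = response.get("intents", [])
    return {
        "message": " ".join(response["output"]["text"]),
        "intent": intents[0]["intent"] if intents else "",
        "confidence": intents[0]["confidence"] if intents else 0,
    }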

Once this is created, we can test in the “Try it out” screen for both Spanish and English.

Here is all the sample code so you can try to recreate it yourself.

It’s not all a bed of roses.

This is just a demo. There are a number of issues you need to be aware of if you take this route.

  • The demo is a one-shot call, so it will only work with Q&A responses. If you plan to use slots or process flows, then you need to add logic to store and maintain the context.
  • Your “Try it out” is free to use. Now that you have a cloud function, it will cost money to run tests (albeit very little).
  • There is no fault handling in the cloud function, so all of that would need to be added.
  • Cloud functions must complete execution within 5 seconds. So if your other skill calls out to an integration, it can potentially break the call.

What did you say?

A question recently asked was “How can I get Watson Conversation to repeat what I last asked it?”. There are a couple of approaches to solve this, and I thought I would blog about them. First, here is an example of what we are trying to achieve.

One thing to understand going forward: everything you build should be data-driven. So while there are valid use cases where this is needed, it doesn’t mean it is needed for every solution you build, unless evidence exists otherwise.

Approach 1. Context Variable.

In this example we create a context variable at every node where we want the system to respond, like so: 
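Roughly, the node’s context editor holds the text you want replayed, and the repeat node simply outputs it (shown here as the underlying values; $repeat is just an example variable name):

# in the answer node's context: store the text to replay later
answer_node_context = {"repeat": "Our store opens at 9am, Monday to Friday."}

# in the node that handles the repeat intent, the text response is simply:
repeat_node_response = "$repeat"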

This works, but it prevents easily creating variations of the response. On the plus side, you can give normal responses, but when the user asks to repeat, it can give a fixed custom response.

Approach 2. Context variable everything!

Similar to the last approach, except rather than creating the context variable in the context area, you build it on the fly. Something like so:

This allows you to have custom responses. A disadvantage (albeit minor) is that you are increasing the chance of a mistake in your code. Each response is also adding 4 bytes to your overall skill/workspace size. This means nothing for small workspaces, but when you are at enterprise level you need to be careful.

I’ve attached a sample of the above.

Approach 3. Application layer.

With this method your application layer keeps a previous snapshot of the answer. Then when a repeat intent is detected, you just return the saved answer. 
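A minimal sketch of that idea (the "repeat" intent name is just an example):

# keep the last real answer and replay it when the repeat intent comes back
last_answer = None

def handle_reply(intent, answer_text):
    global last_answer
    if intent == "repeat" and last_answer:
        return last_answer           # replay the saved answer
    last_answer = answer_text        # otherwise remember this answer
    return answer_text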

I’ve seen some crazy stuff like resending questions and modifying node maps, but really this is the simplest option.