Using LoRA with Granite 3.3


IBM just announced Granite 3.3 with lots of new features!

In this post I'm going to focus on the LoRA adapter feature and how to use it.

LoRA stands for “Low-Rank Adaptation”, a way to fine-tune LLMs quickly and cheaply while mitigating the “model collapse” that can happen with conventional transfer learning. It does this by freezing the model’s weights and training only a small number of added low-rank matrices.

Because of this you can actually train a model on your laptop and get decent results in a relatively short amount of time.
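The reason this is cheap is easy to see in the numbers. Here's a toy NumPy sketch of the low-rank idea (the dimensions and scaling factor are illustrative, not Granite's actual values): instead of updating a full `d_out × d_in` weight matrix, LoRA learns two small matrices whose product has rank `r`.

```python
import numpy as np

# Toy illustration of the LoRA idea (dimensions are assumptions, not
# the real Granite weights). Instead of updating a full d_out x d_in
# matrix W, LoRA trains B (d_out x r) and A (r x d_in), with r << d.
d_out, d_in, r = 4096, 4096, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight
A = rng.standard_normal((r, d_in))      # trainable
B = np.zeros((d_out, r))                # trainable, initialised to zero
alpha = 16                              # scaling factor

# Effective weight at inference time. Because B starts at zero, the
# adapted model initially behaves exactly like the base model.
W_adapted = W + (alpha / r) * (B @ A)

full_params = d_out * d_in
lora_params = r * (d_out + d_in)
print(f"full: {full_params:,}  lora: {lora_params:,}  "
      f"ratio: {lora_params / full_params:.4%}")
```

For this layer the adapter trains well under 1% of the full parameter count, which is why a laptop can handle the job.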

For testing this I used the granite3.3-8b-instruct model. My laptop is an M4 Max with 128 GB of memory. If your Apple silicon Mac has less than 40 GB of memory, you would be better off using the granite3.3-2b-instruct model instead. On a PC you will need an NVIDIA GPU with CUDA.

I used the following steps.

Notebook 1:

  1. For my dataset file I took the watsonx Assistant manual. [888 Pages]
  2. Using docling I converted this to markdown.
  3. I converted the markdown into tokens, then chunked those tokens with a size of 512 and an overlap of 50. The chunks were saved to a JSON list file.

Total Time: 17 minutes.
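The chunking step in Notebook 1 can be sketched like this. This is a minimal stand-in, not the notebook's exact code: in the real notebook the token ids come from running the Granite tokenizer over the docling markdown, whereas here plain integers stand in for token ids.

```python
import json

def chunk_tokens(tokens, size=512, overlap=50):
    """Split a token list into fixed-size chunks where each chunk
    overlaps the previous one by `overlap` tokens."""
    chunks, step = [], size - overlap
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

# Stand-in token ids; the real notebook tokenises the docling markdown.
tokens = list(range(1200))
chunks = chunk_tokens(tokens)

# Save as a JSON list file for the training notebook.
with open("chunks.json", "w") as f:
    json.dump(chunks, f)
```

The overlap matters: each chunk repeats the last 50 tokens of the previous one, so sentences that straddle a chunk boundary still appear whole in at least one chunk.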

Notebook 2:

Using that chunked list I trained a LoRA adapter and saved it.

Total Time: 1 hour 42 minutes.
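The core of Notebook 2 looks roughly like the following, using Hugging Face PEFT. This is a hedged sketch, not the notebook's exact code: the model id, rank, alpha, and target modules are assumptions you would tune for your own run, and the actual training loop (omitted here) feeds the chunked JSON through `transformers.Trainer`.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

# Model id and hyperparameters are illustrative assumptions.
model = AutoModelForCausalLM.from_pretrained(
    "ibm-granite/granite-3.3-8b-instruct"
)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)

# Freeze the base weights and attach the trainable low-rank adapters.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# ... train with transformers.Trainer on the chunked dataset, then:
model.save_pretrained("granite-3.3-lora-adapter")
```

The saved adapter directory is tiny compared to the base model, which is what makes it practical to train and share.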

Notebook 3:

At this point it was ready to go. I created a third notebook to test the results using the base model and the model with the LoRA adapter.
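The comparison in Notebook 3 can be sketched as follows. Again a hedged sketch under assumptions: the model id and adapter path are placeholders, and generation settings are defaults, not the notebook's exact configuration.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "ibm-granite/granite-3.3-8b-instruct"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

def ask(model, question):
    """Run one chat turn and return only the newly generated text."""
    messages = [{"role": "user", "content": question}]
    inputs = tokenizer.apply_chat_template(
        messages, return_tensors="pt", add_generation_prompt=True
    )
    out = model.generate(inputs, max_new_tokens=200)
    return tokenizer.decode(out[0][inputs.shape[-1]:],
                            skip_special_tokens=True)

question = "What is the Trigger Words feature in watsonx Assistant?"
print("Base:", ask(base, question))

# Attach the trained adapter to the same base model and ask again.
lora = PeftModel.from_pretrained(base, "granite-3.3-lora-adapter")
print("LoRA:", ask(lora, question))
```

Asking the same question of both models side by side is what produced the results table below.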


You can download the full project from GitHub if you want to reproduce the results.

The results below are as-is from the LLM (no RAG).

Results

Question

What is the “Trigger Words” feature in watsonx Assistant? [ref]

LoRA Answer

✅ The “Trigger Words” feature in watsonx Assistant is designed to detect and prevent user inputs from triggering sensitive actions or responses that might be harmful or inappropriate.

Base Model Answer

❌ The “Trigger Words” feature in watsonx Assistant allows users to specify certain words or phrases that, when present in a user’s input, will activate a predefined response or action. This enables the assistant to perform specific tasks or provide particular information in response to cues from the user’s input.

How do I stop my action being used as a clarifying question? [ref]

✅ You can stop an action from being used as a clarifying question in the following ways:

  1. If you want to stop all actions from being used as clarifying questions, go to Home -> Actions -> Settings -> Clarifying questions and switch the toggle to Off.
  2. If you want to stop a specific action from being used as a clarifying question, open the action and go to Action settings > Clarifying question and switch the toggle to Off.

❌ To prevent your action from being interpreted as a clarifying question, be clear and direct in your communication. State your intentions explicitly and avoid ambiguous language. If you’re providing information, present it as a statement rather than a query. For instance, instead of saying “Isn’t it true that…?”, say “The fact is that…”.

What are the response types available to use? [ref]

✅ The response types available to use are text, image, video, audio, iframe, and options. (note: correct in relation to UI)

❌ The response types available to use are:

  • Plain text
  • Numbered or bulleted lists
  • Yes/No responses
  • Short phrases or words

Note!

The results are somewhat cherry-picked. The model can still fail to give a correct answer without more context. But in conjunction with RAG, the model is less likely to hallucinate, as it understands your domain language and topics.

The Conversation Changed

I’ve been building conversational systems for the better part of a decade. Watson Conversation, Watson Assistant, watsonx — I watched the naming conventions change almost as often as the underlying capabilities did. Through all of that, the core problem stayed the same: get a user from a question to an answer with as little friction as possible.

I was good at it. I understood intent classification inside and out. I could debug confidence scores in my sleep. I knew how to structure dialog trees that didn’t make users want to throw their laptop out a window. I’d built tooling, written about edge cases like compound questions and hex conversion tricks, and spent real time thinking about how to make these systems work for the people actually using them.

But somewhere in the last year or so, I started noticing that the problems I was most interested in weren’t really about conversation anymore.

The shift didn’t happen overnight. It started with the retrieval-augmented generation wave — suddenly the “knowledge” part of the system mattered as much as the conversational flow. Then tool use started getting serious. Models that could not just respond but act. Call an API. Query a database. Make a decision about what to do next based on context, not just what slot needed filling.

That’s when I realised I wasn’t thinking about chatbots anymore. I was thinking about agents.

The architecture problems are genuinely different. Orchestration, memory, planning, guardrails, human-in-the-loop design — these aren’t extensions of conversational AI. They’re a different discipline. One that borrows from it, sure, but the mental model is closer to distributed systems than dialog management.

I’ve been working in this space for a while now, quietly. Designing agentic architectures, thinking about how enterprises actually deploy these things without everything falling over. Solutions architecture for systems where the LLM isn’t the product — it’s a component in something larger. The interesting problems are in the wiring: how agents hand off to each other, how you maintain state across long-running workflows, how you build trust in systems that make autonomous decisions.

This blog has always been called “Talk to me,” and I’m not changing that. But the conversation has changed. The things I’ll be writing about going forward reflect where I actually spend my time — agentic design patterns, orchestration strategies, the real-world messiness of putting autonomous systems into production.

The Watson years gave me a foundation I still lean on every day. Understanding user intent, designing for failure, thinking about the human on the other end. That doesn’t go away just because the systems got more capable. If anything, it matters more now.

So consider this the pivot point. Everything before this was conversational AI. Everything after is what happens when the conversation starts doing things on its own.

The Acceleration

This is the ninth in a series of posts about how I ended up where I am today.

Coming back from Dubai in the middle of a pandemic and stepping into a worldwide role as a Senior Solutions AI Architect felt like starting again in some ways. The energy of the Lab, the face-to-face work with government teams, the physical presence in a place that was trying to build something ambitious. All of that was gone. Now it was video calls and global time zones.

The focus of the role was building first-of-a-kind solutions for customers, and later enterprise-scale deployments using the watsonx portfolio. But the thing that defined those years more than any single project was the speed at which everything underneath kept changing.

Classic ML to Generative AI to Agentic

When I started in the worldwide role, the work was still grounded in what you’d call classic machine learning. Models trained for specific tasks, carefully tuned, deployed with guardrails that were well understood. The patterns were established. You knew what worked and what didn’t.

Then generative AI arrived and rewrote the playbook. Suddenly the models weren’t just classifying or predicting. They were creating. The conversations with customers shifted. The architecture patterns shifted. The expectations shifted. Things that had been theoretical became possible almost overnight, and the challenge moved from “can we do this?” to “should we do this, and if so, how do we do it responsibly at scale?”

And then the agentic wave started building. Models that don’t just generate but act. Systems that plan, use tools, make decisions, hand off to other systems. The architecture problems got more interesting and more consequential at the same time.

Through all of this I had to keep up. Not just with the technology itself, but with how it changed what customers needed, what solutions looked like, and what “good” meant in a world where the goalposts moved every few months.

What Took a Team Weeks Became Days for Two

The acceleration wasn’t abstract. You could feel it in the work. Something that would have taken a team weeks to build and deploy a couple of years ago could now be done in days by one or two people. The tooling got better, the models got more capable, the patterns got more reusable. Every cycle compressed the one before it.

That compression changed what a Solutions Architect actually does. The job stopped being about knowing all the answers and became about knowing which questions to ask, how to evaluate what’s possible now versus what will be possible in six months, and how to design systems that won’t collapse when the technology underneath them takes another leap.

Looking at the Thread

Across all of this I’ve worked with hundreds of people from different countries, with wildly varying skills, but all with a passion for what they do. Some of them remind me of where I was at the start of my journey. I try to help those people become better than me. Others are people I wish I could emulate even half of what they do. They push me to be more than I am. That exchange, that lifting each other up, has been the constant through every role and every country.

The continual learning is something I love. The technology never stands still and neither can I. Even if AI eventually does everything for us, I hope I never stop wanting to understand how it works and what it means.

Looking back across everything, from the pixel map of Ireland to the German laser printer to automating Lotus localisation to routing support tickets with NLP to building chatbots in Dubai to designing worldwide AI solutions, there’s a thread. I’ve always been most interested in the space between what technology can do and what people actually need it to do. The gap. The wiring. The part where you take something powerful and make it useful.

That’s what drew me to agentic systems. The technology is more powerful than anything I’ve worked with. But the problems are the same ones I’ve been solving my whole career. How do you build something that works for the person on the other end? How do you design for the things that will go wrong? How do you make sure the human stays in the picture?

The tools have changed. The question hasn’t.


This is the ninth in a series of posts about how I ended up where I am today. Next: the pivot, and what happens when the conversation starts doing things on its own.

Enjoy This Time While You Can

This is the eighth in a series of posts about how I ended up where I am today.

The learning system I’d built caught the attention of the Watson team. That, combined with not wanting to spend the rest of my career supporting an email product, got me a delivery role in the Watson group. My job was to determine what customers needed and deliver an AI solution to meet it.

Watson in the early days felt like a startup inside IBM. The technology was changing day to day. Approaches would pivot without warning. The UIs for figuring out why the AI was misbehaving were hostile at best. Nothing was settled, nothing was polished, and the whole team was high-performing in a way that made you want to keep up. There was something about the energy of it that made me want to know more.

The learning curve was brutal. Months of it. It got to the point where I genuinely thought it would never end. Every time I felt like I had a handle on something, the ground shifted again. I raised this with my manager.

His response was, “Enjoy this time while you can.”

He wasn’t wrong.

Every Couple of Days, a Different Country

Once the projects started coming in, they came from everywhere. Different geographies, different industries, all at once. It was intense. Every couple of days I was in a different country. Europe, Australia, the Middle East. You’d land, get context on the customer’s problem, figure out what Watson could actually do for them, and work with the local teams to deliver it. The local teams were brilliant in their own right. My job was to help them become self-sufficient so the solution didn’t leave when I did.

The pace was relentless but I loved it. Every project taught me something new about how AI met the real world. The gap between what the technology could do in a demo and what it could do in a customer’s environment was where all the interesting problems lived.

Dubai

My second-to-last project on that team changed everything. I was assigned as Technical Lead to build an AI chatbot for the Department of Economic Development in Dubai. Its purpose was to help people start a business. Walk them through the process, answer their questions, point them to the right services.

The technology was still in its infancy. I was given a team with little to no knowledge of AI, and newer still was the customer’s team who would eventually take over and run it.

We made it a success. But the part I’m proudest of is what happened to the people. Everyone who worked on my team grew their careers from it. They learned valuable skills, built networks, formed relationships with people that mattered in and out of IBM. Some of them went on to do things I couldn’t have predicted when we started.

The project impressed the government and the local department enough that I got asked to help lead an AI Lab in Dubai. A joint venture with the Dubai government to accelerate the use of AI across different departments.

That was not a role I could say no to.

Building an AI City

I moved to Dubai. A small footnote that turned out to matter: if I hadn’t sat the City and Guilds in C and Unix all those years ago at the printing company, I wouldn’t have been allowed to work in the country. Dubai is strict on having university-level or equivalent qualifications. That two-year course I took out of frustration ended up being the piece of paper that got me through the door.

The education question followed me there too. In one meeting, a presenter asked everyone with a PhD to raise their hand, then said only those people would understand what came next. What followed was a walkthrough of machine learning principles I already knew. There was another “why are you here?” moment when I explained my background. But as before, the work spoke for itself.

I don’t have a hatred for certifications or titles. I’ve helped people get their PhDs. I’ve read people’s published papers to understand what they’re capable of. I judge people on their merit, the same way I’d want them to judge me. But someone hiding behind a title, using it as a wall instead of just explaining their viewpoint and evidence, that does more damage than they realise.

The AI Lab’s mission was to work with government departments to identify where AI could make a difference, score those use cases to see if the technology could actually deliver, and then build the ones that had legs.

We met with departments across the government. Over 300 use cases came out of those conversations, spanning more than ten departments. Around thirty of those were built out as proof of concepts. A handful made it to production. On top of the build work, we were enabling government teams in AI technologies so they could carry things forward themselves.

It was an incredible time. Building an AI city wasn’t just a government directive. It was something everyone was working towards. I got to work with brilliant people and technology I’d never touched before. Robots, drones, IoT, things that made the conversational AI work feel like one small piece of something much larger.

During my time there I helped three people work towards their PhDs and helped another build a package to prove they qualified as an experienced data scientist. That last one carries some irony. The same qualification I helped them put together was one I couldn’t get myself, because my education level didn’t meet the entry requirements.

We’d started expanding to Abu Dhabi and across the UAE when Covid hit and changed the world. I moved back to Ireland and into a new role.

One thing I noticed before I left. In 2017, some of the use cases we’d scoped required the most powerful technology and models available just to get close to working. By 2021, many of those same use cases could be done by a student on a laptop. Technology was accelerating in a way that was hard to fully appreciate while you were inside it.


This is the eighth in a series of posts about how I ended up where I am today. Next: the worldwide role, the shift from classic ML to generative AI, and watching the acceleration up close.


Building a Learning System

This is the seventh in a series of posts about how I ended up where I am today.

One of the most hated jobs in Level 2 support is managing the queues. The department is made up of teams who specialise in certain products or areas. I worked in the Application Development support team, but tickets came in for everything.

While on the queue your job is to read every incoming support ticket and figure out where it should go. Which department should handle it. Whether Level 1 (if that’s where it came from) did their due diligence, and if not, follow up. Whether the customer is premium, because only certain people could handle those. Whether the customer has an ongoing relationship with a specific engineer. And then, out of whoever is left, who in that department is the best fit based on workload and specialities.

It was time-consuming, thankless work. When tickets got routed to the wrong group or the wrong person, there were arguments. One person on the team was exceptional at it, could read a ticket and know exactly where it needed to go almost instantly. But for the most part nobody wanted queue duty. People wanted to solve customer problems, not do paperwork.

“You Can’t Automate That”

During a lunchtime conversation about all of this with one of my colleagues, I said why can’t we just automate it. He said it wasn’t possible. You’d need to understand the technology behind each ticket, and simple keyword matching would never be good enough. Too much nuance.

So I wrote out a specification to prove him wrong.

I detailed how we could use NLP trained on existing support tickets to find the best route for incoming ones, combined with an algorithm that could pick the best engineer for the job based on workload, speciality, and past performance.

What I didn’t know when I handed it to him is that he sent it to senior management. They saw merit in what I’d designed and gave me time and a couple of people to help build it out.

93%

Using LanguageWare and the backlog of previous tickets, I created an application to build the NLP dictionary. The team helped test and curate the results. We ran it against real routing decisions to see how it compared.

Your average engineer doing queue duty was about 68% accurate in routing tickets to the right place. The one expert we had, the person everyone wished was on queue permanently, ran at 94%. The routing engine came in at 93%.

Couldn’t beat the expert. But it was close enough to show real promise, and it was consistent. It didn’t have bad days, didn’t get distracted, didn’t resent being on queue duty.

The Side Effects

One of the more interesting things that came out of the engine wasn’t in the original plan. When we took a support ticket and cross-referenced it against the support knowledge base, the system could surface the most relevant help documents. And when we pointed it at source code, it could identify the files most likely to contain the area where a fix should happen.

That was a genuine surprise. We’d built it to route tickets, and it turned out the underlying technology could do something much broader. It could connect a customer’s problem description to the actual knowledge and code that would solve it.

What Came Next

The project didn’t go further. What I didn’t know at the time was that IBM Research were already working on something called Watson, which would be considerably more powerful than anything I could build with LanguageWare and custom algorithms.

But the work wasn’t wasted. Building that system taught me how to think about the gap between what humans do well and what machines can take over. The expert running at 94% wasn’t doing keyword matching. They were drawing on years of context, relationships, and instinct. Getting a machine to 93% meant understanding what they were actually doing and finding a way to approximate it. That’s a problem I’d spend the next decade working on in different forms.

It also caught the attention of the people building Watson. And that’s how I ended up in the conversation that would define the next chapter of my career.


This is the seventh in a series of posts about how I ended up where I am today. Next: joining the Watson group, and why my manager told me to enjoy this time while I can.


How I Became a Master Inventor

This is the sixth in a series of posts about how I ended up where I am today.

After a presentation on patents at IBM, I started thinking about what might be patentable. I had this fantastic idea. Spent a couple of days writing up how it would work, built a proof of concept, the whole thing. Went to one of the patent experts with real pride in what I’d done.

They turned around, tapped a couple of words into a search tool, and found a near-identical invention filed a year earlier.

I was crestfallen.

Over the next year and a half I made seven more submissions. Every single one failed to get anywhere close to being valid. Seven ideas I thought were original, seven dead ends.

Rather than keep going solo, I got a mentor. And that changed everything.

What the Mentor Taught Me

The first thing they fixed was my process. I’d been doing it backwards. Building out the whole thing, proof of concept and all, before checking whether the idea had legs. The mentor taught me to flip that around.

Write up the core pieces you believe are novel. Then search. You’re looking for two things: has someone already created this, and does it have value?

A direct hit on an existing patent doesn’t have to be the end. You look at how their idea is implemented and ask how yours improves on it.

One of my earlier rejections was around canary traps. Someone had already filed. So I looked at what they’d done and realised it would never work at enterprise scale. I resubmitted with a method that could handle hundreds of thousands of emails without performance issues.

The important thing is to not do the full work until you know you have something worth building.

The Disclosure Is a Sales Pitch

This was the second shift in thinking. When you write a disclosure, you’re not writing a patent. The lawyer writes the patent. You’re writing a sales pitch to show novelty and value.

Sure, you need an implementation. But if you can’t demonstrate novelty and value, it doesn’t matter how clever the technology is. You need to write in a way that, at minimum, someone in the field can clearly understand what sets your idea apart from everything else. If you can’t do that, you’ve lost your audience before they’ve finished reading.

How to Present

One method I saw from another inventor got their disclosures rated search-1 (the highest value rating) nearly every time. The whole thing took ten to fifteen minutes.

Start by talking about the industry or technology around your disclosure in plain language. Get the panel to understand the narrow area you’re focusing on. Then talk about the limitations of that area, or what’s missing. Then present what’s novel in your proposal.

By the time you reach the third part, they already understand the problem and why it matters. Your solution lands in context rather than in a vacuum. I found that technique useful well beyond patents.

Scaling It Up

Our department had an average of two or three patents a year with not many submissions. I’d gotten a couple by that point, and management asked me to help improve the department’s output.

It became a team effort. The people I worked with brought different strengths to the table. One could speak to upper management and get us the resources we needed, including mentors for the teams. Others were able to organise teams and inspire people to participate, even people who felt they had nothing to contribute. My core role was helping teams flesh out their ideas and present them in a way that showed novelty and value.

Within the first couple of months we had over forty disclosures. Well over half went to search, meaning IBM felt they were worth investing money to investigate further. Many of those went on to publish or file. Even after the initial push, a lot of the people on those teams continued submitting. Some went on to become Master Inventors themselves.

The Title

From that work I got invited to serve on IDT boards, the panels that help people deliver disclosures of value. That combined with meeting the minimum requirements for what a Master Inventor needs, and I finally got awarded the title.

It takes three to five years to achieve. I did it in just under four.

Even with the title, the education gap followed me. Without a PhD, I was often asked to prove my inventions by building them out, or to go beyond the requirements that others with academic credentials didn’t have to meet. It felt unfair at times. But in a strange way, it meant that everything I submitted had been tested. I wasn’t just making stuff up on paper. I’d built the thing.

After the title it turns out the reward is more work. The title is a three-year term that you have to renew with the same level of commitment. I got it renewed for my second term, but after I moved to my new role in Dubai I let it lapse to commit time fully to what I was doing there.

I don’t regret that. The patent work taught me how to think about problems clearly, how to communicate ideas to people who don’t share your context, and how to help other people see the value in what they already know. Those skills followed me into everything I did afterwards.


This is the sixth in a series of posts about how I ended up where I am today. Next: how I built a learning system that changed the way our team worked.


How You Use Your Language

This is the fifth in a series of posts about how I ended up where I am today.

My visa was running out. I’d already planned to move back to Ireland before 9/11 happened, and when I returned I moved into a development role. On paper it was a step forward. In practice it was the lowest point of my time at IBM.

Up until then I’d always had some level of autonomy. I could see who used our products and why. I had relationships with developers across the world. I could solution things my own way.

Now I was building software to specification. Given instructions, told to follow them exactly. When I asked about the customers who’d be using what we were building, I was told not to worry about that. On top of that, I was competing with people who’d been working in that architecture for years before I arrived. The whole thing felt like going backwards.

Two things stand out from that period.

My educational background was being used against me again. A senior manager told me directly that I would never be promoted without a university qualification. At one point I spent time coaching a new starter on how to build a presentation. Techniques, structure, how to land the key points. They’d never done one before. After they presented, another manager turned to me and said, “Wow, that’s excellent. Why can’t you do that, Simon?”

I didn’t say anything.

One manager during that time did help me. He took the time to understand where I was and helped me see a way forward. I needed that more than I probably showed.

The Other Side of the Phone

In 2006 I moved to Level 2 support. There’s a misconception about customer support, even inside IBM, that it’s people reading from scripts. At no level is that true, but Level 2 is something else entirely. You’re working with high-value customers or customers who are furious. You regularly have to tell someone they can’t have what they want, because the fix or the feature is just too complex, and they have to come away from that conversation feeling heard and respected.

The people who worked there blew my mind.

In most places I’d been, asking for help was treated as a sign of weakness. Something to be avoided or held against you. In this team, the opposite was true. It was in everyone’s interest to make sure nobody stayed stuck. If your teammate could function without help, it meant your own work got done faster. The whole system only worked if people shared what they knew.

So you’d ask a question. They wouldn’t just give you the answer. They’d coach you to build habits so you could find it yourself. But if the problem went deep enough, they’d give you the answer and then hand you the whole theory behind it. The field of study. The context you’d need to never have to ask again.

There was one rule. You couldn’t ask the same question twice. That was the trade-off. And it was a fair one. It taught you to think about a problem before asking for help, to value someone’s time when they gave it to you. That expectation has stayed with me. To this day it frustrates me when people treat others as their own personal search engine rather than sitting with a problem for a while first.

Knowledge was never used to segregate or to make someone feel small. If you weren’t an expert in something, they made sure you became one. I’d never experienced anything like it. In every other place I’d worked, knowledge was hoarded. People held onto what they knew because it gave them power, made them harder to replace. Here, the opposite was true. Sharing what you knew made the whole team stronger, and that made you stronger too.

Pure Magic

Early on in that role I took a call from an extremely upset customer. Screaming. Threatening. I was in pure panic, completely rattled. I called my manager over.

I can’t do justice to what happened next by describing it in detail, because the skill was in the subtlety. But the customer went from screaming to talking normally, got their fix, and then apologised to me. All from how my manager spoke to them. The words he chose, the pace, the way he acknowledged what they were feeling without giving ground on what was possible.

It was pure magic.

I wanted to understand how he did it. They teach you some of the basics, but I wanted to go deeper. How do you take someone who is at their worst and bring them back? How do you tell someone “no” and have them thank you for it? This wasn’t a trick or a technique. It was a way of thinking about people and what they actually need in a moment of frustration.

Language as a Tool

Those methods became ingrained in how people on the team dealt with each other too. Performance reviews weren’t “here’s how you screwed up last year.” They were “here’s how we’re going to make you better in the new year.” The difference sounds small when you write it down. In practice, it changed everything about how people showed up.

It’s about how you use your language. And how you see people’s potential.

Those skills didn’t stay at work. The way you speak to a frustrated customer is the same way you speak to anyone who feels unheard. It saved relationships with customers, yes. But also with family. With friends. I just wish more people understood how much the words you choose and the intent behind them actually matter.

I moved into a team lead role eventually, and two major projects during this time helped shape the next stage of my career. How I became a Master Inventor, and how I built a learning system. Both of those are stories for another time.

The work itself was support for application development, deeply technical stuff that I loved. Towards the end, the focus shifted to supporting an email product. I didn’t want to do that. But my work on the learning system had caught the attention of the Watson team, and that opened a door I hadn’t expected.

Which is where this story picks up in the next post.


This is the fifth in a series of posts about how I ended up where I am today. Next: the pivot from conversational AI to agentic systems, and why everything before this was preparation for what came after.

Next Post | Previous Post

Why Are You Here?

This is the fourth in a series of posts about how I ended up where I am today.

Returning to Ireland, unemployed, I did what anyone without connections does. I mass-mailed my CV to every company in the Yellow Pages. I walked to industrial estates and handed it in at reception desks. Did that for about a month before I got the call from IBM.

Lotus had just been taken over. The interview was unlike anything I’d experienced before. Nobody asked about qualifications. They wanted to know about the technology I’d worked on, the approaches I’d taken, why I’d chosen one way over another. The whole thing felt like a conversation between people who were genuinely curious. It helped that I already knew their products from my Abbey Computer Training days. Lotus 1-2-3 finally paying off.

IBM offered me a part-time contract. I had another offer from a different company at the same time (I honestly can’t remember who), but I went with IBM. Partly because of the technology, but also because they had a wealth of training resources that you could access anytime you wanted. For someone who’d been self-teaching from books and manuals for years, that was like being sent back to the library.

Automating the Boring Parts

My first project was localising Lotus ScreenCam into all the major European languages. Smallest product on the shelf. Myself and one other person were responsible for the lot. The process was semi-manual and slow.

I found it boring, so I automated it. Built tooling to do my work for me, which freed up time to study and tinker with whatever else I could get my hands on. We finished well ahead of schedule.

They moved me onto bigger products after that. SmartSuite, cc:Mail. More of the same work at a larger scale. I got a trip to California to work briefly with the cc:Mail team, which was brilliant.

“How Could You Know That?”

At some point I went for an interview for a full-time permanent role within IBM. They knew my educational background. What they didn’t know was how much of a nerd I actually was.

The main question was to explain, in detail, the entire event chain of interacting with a UI element in Windows 3.1. At the code level. Every message, every callback, every step in the sequence.

I answered it perfectly.

Their response was, “How could you know that?”

I told them I read a book, pulled the system apart, ran experiments. The same way I’d learned everything else. Little did I know that reaction would follow me around for years.

They didn’t offer me the permanent role straight away. My contract was running out and I had a full-time job lined up outside IBM. Then the person they’d actually wanted for the role left for another company. I was second in line. They offered me a contract extension.

I told them I had a permanent job waiting for me elsewhere. They changed the offer to a full-time permanent contract. I accepted.

Boston

Some time after this I got put forward for a project that would have meant moving to Singapore. Came second. But the managers involved had something else in mind and offered me a role on a project in Boston instead.

So I went to the US. Again.

The work was localising first-of-a-kind products that IBM was developing. Primarily eSuite, which was a product sadly ahead of its time. Browser-based office applications in the late ’90s. The world and the technology weren’t ready for it, but the engineering was genuinely impressive.

Within the first year I had the localisation work fully automated across multiple systems and operating systems. That freed me up to focus on the first-of-a-kind solutions the management team wanted to build out as products. This was closer to what I actually wanted to be doing.

The World Gets Bigger

My team in Boston was from everywhere. Japan, China, Korea, the US, Ireland. Working that closely with people from different cultures changed how I thought about almost everything. The nuances of communication, the different assumptions people carry, the things that matter in one culture and mean nothing in another. It opened me up in ways I hadn’t expected and was partly why I later visited China, Japan and Korea. I met my wife in Korea.

Of course, the education question followed me there too.

At one team meeting, everyone was asked to share their educational background. When I explained that I’d left school early and never went to university, one of the Japanese team members looked at me and said, “Why are you here?”

It didn’t stop at the meeting. It carried on afterwards until a Japanese exec stepped in and told them to give me the same respect as everyone else on the team.

I still look back on Boston as one of my fondest periods. It was where I started to realise that the world is a much bigger place than I’d understood, and that what seemed so important in Ireland was trivial compared to the work we were doing that reached people everywhere.

The question “why are you here?” was meant to diminish. But over time I started hearing it differently. Less as a challenge and more as something worth actually thinking about. I was there because I kept showing up, kept building, kept learning. The answer was in the work.


This is the fourth in a series of posts about how I ended up where I am today. Next: the low point, the unexpected education of customer support, and why how you use your language matters more than what you know.

Next Post | Previous Post

The Software I Created

This is the third in a series of posts about how I ended up where I am today.

The way I got the printing company job tells you something about how things worked back then. After Punch and Programming shut down, some of the customers they’d left stranded reached out to me directly. One of them was a printing company who needed software fixed that I’d originally built in my previous role. I went in, fixed it, and they showed me around.

The place was fascinating. They had everything from state-of-the-art equipment to machines older than I was. And then they showed me a book. Written by a priest, sometime in the 1920s if I remember right, it described in detail how a particular printing machine worked and how it could be constructed. They asked if the machine could be recreated as software.

I said it could. They loaned me the book to do just that.

The Priest’s Machine

My friend Ambrose Clarke and I built it. In fairness, Ambrose did the heavy lifting of recreating the machine’s logic in code. He was a brilliant programmer and that was a serious piece of work. My area was taking his output and converting it into PostScript, the language that professional printing machines understood. The two halves had to talk to each other perfectly or the output would be garbage.

We got paid for the work and at the time we thought it was an insane amount of money. Looking back, it was a pittance for what we delivered. But we didn’t know that yet.

They offered me a full-time job afterwards. I named a salary that felt ambitious to me. They accepted immediately, which should have been a clue. They were getting me for almost nothing and they knew it.

Hochgeschwindigkeitslaserdrucker

The printing company had a lot of machinery, but two high-speed laser printers stand out.

The first was from the 1970s. You had to flip switches in sequence to boot it up. It stored printing data on magnetic tape. The manuals were in German, and every other sentence seemed to contain the word “Hochgeschwindigkeitslaserdrucker” repeated endlessly. Building anything printable on it was a slow, painful process of switch-flipping and tape-writing.

I figured out a way to get a PC to write the magnetic tapes to the exact specification the hardware expected. You could take a simple text file, run it through my software, and get WYSIWYG mail merge output on this 1970s beast. That one change made the machine genuinely useful to the business.

The second machine was more modern and could handle graphics. Handwriting-style output instead of just print. I built another application for that one too, same idea. Text and images in, professional print-ready output out.

I had a healthy respect for how two of the CEOs ran the business, Pierce and Jean. They were smart operators. But I was building core tools for the company and being paid as though I was an intern.

“You’re Not Qualified for the Job You Have”

When I eventually asked for a pay rise, they refused. Their reasoning was that I wasn’t qualified for the role I was already doing. That I was already doing it didn’t seem to factor in.

My younger self, instead of pushing back, enrolled in a two-year City and Guilds course in C programming and Unix. I passed with distinction. They doubled my salary.

I can’t stress enough that “doubled” still meant very little. The factory workers assumed I was rolling in money because “computer programmer” was a hot job title in the early ’90s. The truth is they were earning three or four times what I was.

But I still didn’t fully understand that. I was learning constantly, working on interesting problems, and assumed that was the deal. You do good work, you get to keep doing it. Nobody sat me down and explained what the market actually paid.

“Show Them the Software I Created”

The moment I trace the turning point to came during a customer visit. One of the CEOs was showing some visitors around and said, “Simon, can you please show these gentlemen the software I created.”

I was annoyed enough to say it out loud. “You created it, you can show them.” And I started to walk off.

The customers laughed. The CEO corrected himself and said I’d written it.

After they left he came back to talk to me. He wasn’t angry, and I don’t remember the full conversation. What stuck was one line: “Simon, those guys are in your area and get paid three times what you do, but don’t even know half of what you do.”

I think he meant it as a compliment. I didn’t take it that way.

California, Almost

Some time later, a US company that built software for printing hardware came to visit. They saw what I’d built and were blown away. A few weeks after they left, they offered me a job in California.

I said yes.

Getting a work visa for the US as a non-citizen is time-consuming and expensive for the sponsoring company. They went through the whole process anyway. Months of paperwork proving I had specialised knowledge they couldn’t source locally. Eventually everything was approved. Flights booked. Ready to go.

Then the US company went out of business.

One of the CEOs at my current company asked if I wanted to stay. But we weren’t leaving on good terms by that point. I’d already trained two replacements for my work and had no appetite to go backwards.

Since I had the flights booked and a good relationship with the people I’d met in the States, I went over for a month as a holiday instead. While I was there, another company expressed interest in hiring me. I went for the interview. It was surreal. They asked if I’d ever been to Mexico, then started explaining where I could hop the border to sort out my visa situation. I walked out of that one fairly quickly.

I came home to Ireland, unemployed again. The CEO’s comment should have been a wake-up call. The US offer should have told me something about what I was worth. But honestly, I didn’t connect those dots for a long time. I just knew I needed a job.


This is the third in a series of posts about how I ended up where I am today. Next: mass-mailing CVs to every company in the Yellow Pages, and the IBM interview that changed everything.

Next Post | Previous Post

Pixel by Pixel

This is the second in a series of posts about how I ended up where I am today.

After the Motorola interview fell through, I got a second work experience placement at a place called Abbey Computer Training. PCs were just kicking off and everyone wanted to learn word processors, spreadsheets, CP/M, DOS. The spreadsheet of choice was Lotus 1-2-3, which would come back to help me later in ways I couldn’t have predicted.

Within a week I was fully up to speed on all the software they taught. By the second week I was pulling things apart to understand how they worked underneath. I was there a few months and genuinely enjoyed it, but it was a tiny operation. The owner and a receptionist. There was no job to offer me, and he was honest about that.

What it did give me was a foothold. Enough experience on a CV to get me through the next door.

The Science of Getting Data Into a Computer

That next door was a company called Punch and Programming. The name tells you the history. They’d started out translating punch cards to computers, and by the time I arrived they’d moved on to keying in all kinds of documents. Pre-OCR, all manual.

Don’t let that fool you though. There is a genuine science behind getting data into a computer fast and accurately. Validation rules, error rates, throughput targets, interface design that minimises mistakes. It sounds mundane until you realise how much of the world ran on people doing this work well.

It was a family-run business and the staff were brilliant. Functional in a way that a lot of workplaces aren’t. There was a lot of manual and technical work, but I got time to tinker and could use the PC in my free time. They trained me on the internals of the IBM PC, from repair work down to how the chips themselves operated. Plus whatever software packages we were selling at the time.

Eventually I got a full-time job out of it. They were paying me next to nothing, but I had no idea what people in my kind of role were actually making. I was just happy to be doing the work.

A Map of Ireland, Taped to a Monitor

Two projects from that period stand out.

The first was for ENFO, the Department of the Environment’s public information service. They wanted an application where users could type in requests for water and air quality reports across Ireland, and it would generate graphs for them to review. A visualisation tool, basically.

The software I built it in has been lost to time. It was a DOS application that used VGA graphics and a kind of no-code step builder for creating applications. Not exactly the tools you’d choose today, but it was what I had.

The tricky part was the map. They wanted a detailed map of Ireland as the main interface, and graphics on PCs were still uncommon enough that nobody had a ready-made solution for this.

So I improvised. I got a sheet of clear plastic, traced over a printed map of Ireland capturing the key regional boundaries, taped the plastic to the monitor, and then sat there recreating it pixel by pixel on screen. It took hours.

When I demo’d the finished application, the room went quiet and then people started asking how I’d managed such a detailed map on a PC. The application got presented to the public for a year or so.

The Leaving Cert Results

The second project carried a bit more weight, in more ways than one.

In 1992 the Department of Education was in the process of computerising the Leaving Cert exam results. OCR was still a young science and their initial tests were far from reliable enough to trust with something this important. If results got corrupted or delayed, every student in the country would feel it.

To make sure everything would be processed on time, they needed a data entry system set up as a fallback. I was the support person for “Key Entry III” at the time, so I spent most of that period in Athlone. Setting up systems, building interfaces, training the UK data entry staff (Irish people weren’t allowed to do the actual keying, for obvious confidentiality reasons), and then validating those results before they went into the Department’s processing systems.

The public only ever see the results landing on a page. Behind that was an incredible amount of work and orchestration to get everything running. The Department of Education staff had mostly all been hired around the same year, so everyone knew everyone. It felt more like a family that happened to be under serious deadline pressure. People worked hard and genuinely enjoyed doing it together.

There was some irony in the fact that I was helping get Leaving Cert results out to the country, having never sat my own. My boss at the time actually told me to never mention this to anyone. Fair enough.

Endings

During all of this, Punch and Programming went through a company change. New name, Data Entry Ireland. New management. The culture shifted. I was made redundant, one of the first to go before the whole place eventually shut down.

I was unemployed again. But some of the customers who’d been left in the lurch by the closure reached out for help, and one of them ended up hiring me.

That next chapter would teach me a very different lesson. The previous years had shown me I could build things that impressed people. What I still hadn’t learned was what those things were worth.


This is the second in a series of posts about how I ended up where I am today. Next: a 1920s book from a priest, a German laser printer, and the moment a CEO tried to take credit for my work.

Next Post | Previous Post