IBM
A low-code debugging experience that helps users reduce investigation time by 14%.
Problem
Users lack the insight and tools needed to diagnose and resolve errors when setting up and deploying custom extensions.
- Deb, a developer, often receives ambiguous error messages and must rely on external tools and trial-and-error to identify the root cause.
- Tanya, a non-technical conversation designer, only encounters problems during live testing. She lacks visibility into the source of the issue and depends on Deb to troubleshoot, leading to inefficient back-and-forth exchanges through screenshots and chat threads.
Neither user has access to a clear debugging workflow or feedback loop to evaluate extension performance post-launch.
🛤️ As-is journey

Current debugging workflow: slow, manual, and error-prone
👤 User Consequences
Without better error visibility & handoff, our users experience:
Consequence | Description |
---|---|
Frustration and lost time | Troubleshooting becomes slow and stressful. |
Broken workflow | Users get blocked and can't move forward without help. |
Poor collaboration | Designers and developers rely on back-and-forth messaging to fix problems. |
Lack of confidence | Users can't trust that their extensions will work reliably. |
Abandoned work | Users may give up on custom extensions altogether. |
Limited growth | Users avoid experimenting or expanding their solutions due to complexity. |
💼 Business Consequences
The complexity discourages adoption. When debugging is opaque:
Consequence | Description |
---|---|
Increased support costs | More tickets and time spent by internal teams to troubleshoot user issues. |
Slower time-to-value | Delays in launching assistants reduce the speed at which customers see ROI. |
Lower feature adoption | Custom extension capabilities remain underused due to perceived complexity. |
Decreased customer satisfaction | Frustrated users are less likely to expand their use of the platform. |
Churn risk increases | Teams may abandon the product for simpler or more transparent alternatives. |
Hindered scalability | Customers hesitate to build or expand their chatbot using custom extensions, limiting client growth opportunities. |
Role
Lead UX Designer
Year
2024
Team
Technical PM, Dev Team
Project type
Design-driven requirements (Exploratory)
Deliverables
Problem Definition
3inB Workshopping
Usability testing
Prototyping
Executive Presentations
Final Specs
Solution
We introduced a built-in debugging view directly in the watsonx Assistant chat preview, enabling both Tanya and Deb to understand and resolve issues faster—together or independently.
What we designed:
For Tanya (non-technical):
- Clear status indicators and step-level error flags
- Human-readable messages to guide decision-making
- One-click sharing of errors with Deb
For Deb (technical):
- Structured cURL commands and JSON payloads to replicate issues (sketched below)
- Validation flags for missing or incorrect parameters within the builder
- Dashboard to flag, track and resolve errors
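To make the handoff concrete, here is a minimal sketch of the kind of structured record a debug view like this could attach to each conversation step, pairing Tanya's human-readable status with the technical snapshot Deb needs. Field names, types, and the example values are illustrative assumptions, not the shipped watsonx Assistant data model.

```typescript
// Illustrative sketch only: field names, types, and values are assumptions,
// not the shipped watsonx Assistant data model.
interface ExtensionDebugRecord {
  stepId: string;                           // conversation step that made the call
  status: "success" | "warning" | "error";  // drives Tanya's step-level flag
  summary: string;                          // human-readable message for Tanya
  httpStatus?: number;                      // raw status code, surfaced for Deb
  curlCommand?: string;                     // copyable request snapshot for Deb
  requestPayload?: unknown;                 // JSON body sent to the extension
  responsePayload?: unknown;                // JSON body (or error) returned
}

// Example of what a single failing step might carry in a one-click handoff.
const failedStep: ExtensionDebugRecord = {
  stepId: "step_3_check_order_status",
  status: "error",
  summary: "The order lookup extension returned an authentication error (401).",
  httpStatus: 401,
  curlCommand:
    'curl -X GET "https://api.example.com/orders/12345" -H "Authorization: Bearer <token>"',
  requestPayload: { orderId: "12345" },
  responsePayload: { error: "invalid_token" },
};

console.log(failedStep.summary);
```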
Impact
- 🛠️ Debugging time reduced
- 🗂️ Fewer extension-related support tickets
- 🚀 High adoption during beta phase
✨ Key improvements
Area | Before | After |
---|---|---|
Error handoff | Long threads with screenshots | One clean, structured message |
Triage time | Manual diagnosis with external tools | Errors prioritized and replicated within 5 minutes |
User autonomy | Heavy reliance on support | Tanya can fix light issues herself; Deb gets what she needs |
💡 Why it matters
By enabling seamless, role-appropriate debugging, we turned a high-friction workflow into a more empowering, efficient experience. Tanya and Deb now have a shared language and set of tools to build, test, and improve custom extensions—reducing support dependency and increasing system reliability.
📚 Why the solution worked
- Reduced internal support tickets related to extension failures
- Improved troubleshooting speed and accuracy, especially for non-technical users
- Enabled clean handoff to developers, with logs that made it easy to replicate and fix issues
- Increased confidence and adoption of custom extensions in production flows
🔬 Workshop-driven research
Due to time constraints, we conducted a focused co-design workshop instead of formal interviews. We kept the group tight-knit to optimize for time: our technical PM (who acted as our SME and user proxy), the design lead (me), and two developers.
Workshop process
We started by mapping out the current (as-is) experience and identifying pain points across three key scenarios where Tanya's extension doesn't work as expected:
- When Tanya tests her extension in the chat preview, it returns an error.
- When the extension runs, but the output isn't what Tanya expected—without showing any error.
- Tanya tries to share the issue with her developers through emails and screenshots, but they can't reproduce the problem on their end.
From there:
- We translated pain points into prioritized need statements
- Converted needs into measurable "hills" (north star goals)
- Addressed most of our open assumptions and questions
- Created a "To-Be" journey map outlining how debugging should ideally work
- Benchmarked debugging tools from market leaders like Twilio, DataDog, Postman, and other platforms
- Held a sketching session for the north star and voted on feasible, usable directions
To-Be Goals:
- Tanya can debug light errors solo
- Tanya can escalate to Deb in a single, understandable message
- Deb can reproduce and resolve errors in under 5 minutes
- Tanya can check whether Cade (the end user) dropped off due to an extension failure
- Deb can compare error severity and prioritize fixes at scale [out of scope for phase 1]
✏️ Sketching session
The sketching session revealed much of how we would implement the solution. It centered on a few key questions:
What information is required for Tanya and Deb to debug effectively?
Deb: We decided to surface the cURL command and JSON payload directly in the debugging interface, enabling Deb to replicate and debug issues independently. This approach gives her an exact snapshot of the failing request, allowing her to reproduce the issue locally, test variations, and isolate the root cause—without relying on Tanya for additional context. It reduces back-and-forth, speeds up issue resolution, and empowers developers to work more efficiently.
For Tanya, we prioritized clear status indicators to help her quickly understand what went wrong—without needing to dig into technical logs or error codes. As a non-technical user, Tanya shouldn't need to interpret stack traces or HTTP status codes to know there's a problem. Clear, human-readable messages and visual cues (like warning icons or step-level error flags) help her confidently identify issues, decide when to involve Deb, and contribute meaningfully to the debugging process. This reduces frustration and empowers Tanya to stay in flow while building conversations.
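As a rough illustration of the "exact snapshot" idea, a debug view could rebuild a copyable cURL command from the failing request it already holds. This is a hypothetical helper written for this case study, not the product's actual implementation; the request shape and endpoint are made up.

```typescript
// Hypothetical helper: rebuilds a copyable cURL command from a failed
// extension call so a developer can replay it locally. The request shape
// and endpoint below are illustrative, not the product's implementation.
interface FailedRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: unknown;
}

function buildCurlCommand(req: FailedRequest): string {
  const parts = [`curl -X ${req.method} "${req.url}"`];
  for (const [name, value] of Object.entries(req.headers)) {
    parts.push(`-H "${name}: ${value}"`);
  }
  if (req.body !== undefined) {
    // Single quotes keep the JSON body intact in most shells.
    parts.push(`-d '${JSON.stringify(req.body)}'`);
  }
  return parts.join(" \\\n  ");
}

// Example: the snapshot Deb would copy out of the debug panel.
console.log(
  buildCurlCommand({
    method: "POST",
    url: "https://api.example.com/orders/lookup",
    headers: { "Content-Type": "application/json" },
    body: { orderId: "12345" },
  })
);
```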
How do we not overwhelm Tanya with information overload?
We created a dedicated debug section and relied on progressive disclosure: a summary first, with technical details available on demand.
How will handoff work?
Tanya can escalate an error to Deb in a single, structured message. The handoff carries the cURL command and JSON payload Deb needs to replicate and debug the issue independently, while Tanya sees only clear status indicators that explain what went wrong without technical context.
How do we give users value in a small space?
Our biggest constraint was real estate: how could we use UX patterns to give users contextual support while they investigate problems? Tanya and Deb conduct most of their debugging in a small webchat preview.
The north star:
Iteration (More Designs)
The MVP started as a standalone debug tab, but testing showed users preferred to see the data inline within the conversation steps. We pivoted:
- Moved debug info next to each relevant step
- Created a progressive disclosure flow (summary first, then details)
- Incorporated color-coded indicators to show fail/pass status instantly (sketched below)
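A rough sketch of how the inline, summary-first pattern could be modeled: a collapsed row exposes only a color-coded indicator and a plain-language summary, and technical details attach only when the step is expanded. The names and color tokens are assumptions for illustration, not the shipped UI code.

```typescript
// Sketch of the progressive-disclosure view model used for inline debug info.
// Names and color tokens are illustrative, not the shipped UI code.
type StepStatus = "pass" | "warning" | "fail";

interface StepDebugInfo {
  status: StepStatus;
  summary: string;   // what Tanya reads at a glance
  details: string[]; // cURL command, payloads, status codes for Deb
}

interface StepRow {
  indicatorColor: string; // drives the inline pass/fail indicator
  summary: string;
  details?: string[];     // omitted until the row is expanded
}

const STATUS_COLORS: Record<StepStatus, string> = {
  pass: "green",
  warning: "yellow",
  fail: "red",
};

function toStepRow(info: StepDebugInfo, expanded: boolean): StepRow {
  return {
    indicatorColor: STATUS_COLORS[info.status],
    summary: info.summary,
    // Progressive disclosure: technical details stay hidden until requested.
    ...(expanded ? { details: info.details } : {}),
  };
}

// Collapsed by default; expanding reveals the replication details.
const row = toStepRow(
  { status: "fail", summary: "Extension call failed at this step.", details: ["HTTP 401", "curl ..."] },
  false
);
console.log(row);
```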