Back when I started at Relic Entertainment, one of the senior programmers had a printed list in his cubicle of “The 6 Stages of Debugging”. I'm not sure who the original author is, but the sentiments seem to be timeless.
If you're just learning to program, this is a good introduction to the experience of debugging. If you've been writing software for a while, it's a good refresher, or at the very least may provide some catharsis.
Let's go through each stage. The first one makes me laugh.
Stage 1: That can't happen
Denial. We probably know someone who has reacted that way. (I know I have.)
When writing software, it’s tempting to think that we can hold everything in our heads. It’s tempting to believe we are certain of all possibilities.
But our memories are fallible. We’re building complex software, with complex frameworks, languages, toolchains, running on operating systems, in virtualized environments, etc. Writing software systems is complicated!
What’s worked for me is to keep an open and curious mind. Sure, sometimes a bug report is a misunderstanding (or maybe it's actually a UX bug). But as I’ve built more and more software and systems over the years, I continue to come across strange and unexpected issues that turn out to be real bugs. Prepare to be surprised!
Stage 2: That doesn't happen on my machine
You’ve accepted that the bug report really happened. You read the reproduction steps and followed them.
It didn’t happen for you. Case closed, right? Maybe it was a one-off? Maybe someone else has fixed it since then? Maybe. Or maybe not.
Did you follow all the steps? Did you try them several times? Sometimes we forget a step, or a piece of key configuration. Be diligent so the bug doesn't come right back with a note that it still happens with the exact same steps previously described.
If it still isn’t happening, you have something new to do: figure out what is different about your machine.
Developer machines or environments are often quite different from customer ones. You might have special tools, plugins, extensions, and development settings. You’ll want to compare and contrast environments, accounts, software versions, and more. You might end up with a second check-out of the code, or doing a rebuild from scratch.
It will be a process of elimination to narrow down what the key difference is. Once you find it, it’s often related to the bug in question. So you’ll have some strong clues when it comes to investigating why it’s happening.
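As a minimal sketch of that compare-and-contrast step, you could capture each environment as a set of key/value settings and list only the ones that differ. Everything here is an assumption for illustration: the `diff_environments` helper, the setting names, and the values are hypothetical, and in practice you would collect them from wherever your product records versions and configuration.

```python
def diff_environments(mine, theirs):
    """Return {setting: (my_value, their_value)} for every setting that differs."""
    differences = {}
    for key in sorted(set(mine) | set(theirs)):
        my_value = mine.get(key, "<missing>")
        their_value = theirs.get(key, "<missing>")
        if my_value != their_value:
            differences[key] = (my_value, their_value)
    return differences


# Hypothetical snapshots, made up for illustration only.
developer = {
    "os": "Windows 11",
    "app_version": "2.4.1-dev",
    "locale": "en_US",
    "debug_tools": "on",
}
customer = {
    "os": "Windows 11",
    "app_version": "2.4.0",
    "locale": "de_DE",
}

for setting, (mine, theirs) in diff_environments(developer, customer).items():
    print(f"{setting}: mine={mine!r} theirs={theirs!r}")
```

Each line it prints is a candidate to eliminate: change one setting on your machine to match the customer's, retry the steps, and repeat until the bug shows up.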
Stage 3: That shouldn't happen
You now have steps that reproduce the bug.
Looking at the code, it’s unclear how the bug is happening at all. Everything looks right!
It might be a tricky scenario where nothing has changed recently, yet the bug just started happening. Or there is test coverage that should expose the bug, but it doesn’t.
This stage can be emotional. It’s easier when you start investigating and bam! The bug is right there. It’s frustrating when you analyze logic and it doesn’t match observed behaviour.
The key is to not get stuck here, and to move on to the next step: figuring out why it’s happening, even though it seems like it shouldn’t be possible.
Stage 4: Why does that happen?
You started by denying that the bug was even possible. Then you followed the steps and couldn’t reproduce it on your machine. You figured that out, yet you were confounded: it shouldn’t happen!
You have arrived at: Why does that happen?
This is where the bulk of useful debugging work happens. This is the most effective question to ask.
Maybe tuning some data or changing the order of some code resolves the issue. Or reverting a recent commit. But why did it? What happened in the first place?
Use your debugging tools, skills, and techniques to find the why. Check log messages. Add more log messages. Attach debuggers. Review crash dumps. Make sure you're running the right version. Check your assumptions. Read over commit diffs and messages. Turn off parts of the codebase and see what changes. Set up a minimal reproducible example. Describe the bug to someone else (rubber duck technique).
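One way to narrow down which change introduced a bug is to bisect: test the midpoint of a range of commits, then keep the half that still contains the switch from good to bad. Here is a rough sketch of the idea, assuming the oldest commit is known good, the newest is known bad, and the bug stays present once introduced. The commit names and the `reproduces_bug` check are hypothetical stand-ins for checking out a commit and running your reproduction steps. The `git bisect` command automates this same search on a real repository.

```python
def first_bad_index(items, is_bad):
    """Binary search for the index of the first item where is_bad(item) is True.

    Assumes items are ordered oldest to newest, items[0] is good,
    items[-1] is bad, and every item after the first bad one is also bad.
    """
    low, high = 0, len(items) - 1  # low is known good, high is known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(items[mid]):
            high = mid  # the bug was introduced at mid or earlier
        else:
            low = mid  # the bug was introduced after mid
    return high


# Hypothetical commit history, oldest first.
commits = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8"]


def reproduces_bug(commit):
    # Stand-in for "check out this commit and run the repro steps".
    # For this example we pretend the bug was introduced in "f6".
    return commits.index(commit) >= commits.index("f6")


culprit = commits[first_bad_index(commits, reproduces_bug)]
print(culprit)  # f6
```

With eight commits this takes three checks instead of up to seven, and the saving grows with longer histories. Finding the commit is only the start, though: the diff tells you where to look, and you still need to work out why it causes the bug.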
Stage 5: Oh, I see
The ah-ha moment. Something clicked into place.
Now you know why the bug occurred, and sometimes you know right away how to fix it.
Aside from the realization and fixing the bug, consider:
- How can you prevent this type of bug in the future?
- Will your fix introduce the risk of more bugs?
- How can you share what you learned with your team?
- What can you do to speed up future investigations?
Hopefully you got the satisfaction of solving a puzzle, some new knowledge or insight, and built upon your debugging intuition. This is the happiest stage of debugging.
Just one more stage to go!
Stage 6: How did that ever work?
At first, you were sure there was no bug. Then you realized there actually is one. You found it. You now understand it, and inspect the code once more. You look at it differently now.
Your next thought is, “How did that ever work?”
It was broken from the start! Or was it? Context is important here, and perhaps the code worked fine in the past. Maybe it worked with older dependencies. Maybe it worked with previous versions of APIs. Or maybe it only worked before a particular year, such as the year 2000.
Maybe it worked fine on only your hardware or with your configuration. Sometimes the code was simply never executed before. It happens!
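The year 2000 case is a good illustration of code that really did work until its context changed. Here is a small sketch, assuming a program that stores only the last two digits of the year. The function names are made up for this example and don't come from any particular system.

```python
def years_elapsed_two_digit(start_yy, end_yy):
    """Naive elapsed years using only the last two digits of each year."""
    return end_yy - start_yy


def years_elapsed(start_year, end_year):
    """Elapsed years using full four-digit years."""
    return end_year - start_year


# Works for every pair of years within the 1900s...
print(years_elapsed_two_digit(85, 99))  # 14

# ...and breaks as soon as the end year rolls over to "00".
print(years_elapsed_two_digit(85, 0))  # -85

# Storing the full year avoids the rollover.
print(years_elapsed(1985, 2000))  # 15
```

Nothing in the two-digit version changed between 1999 and 2000. Only the input did, which is why "How did that ever work?" sometimes has the answer "It worked fine, until now."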
“How did that ever work?” is the next layer of understanding and learning. Finding the answer will help you with future debugging.
Do these stages match up with your debugging experiences? Do any of them resonate strongly, perhaps even hitting a little close to home?
I hope you know you're not alone when it comes to debugging. It's creative problem-solving, and help is out there. Check out DebugBetter.com for debugging techniques and tips. You got this!