How To Fix A Software Bug

If you’re going to develop software of any kind beyond the most trivial, you will one day run into a bug. What’s potentially worse is that on a team of sufficient size, there’s a good chance that you won’t have had anything to do with creating the bug; you’re just the unlucky shmuck who gets to go in after it and fix it. This is especially true if you’re a newer developer, since software teams love to set new guys on the bug pile. After all, nobody really _likes_ fixing bugs, but they’re an excellent way to learn a system without messing with things that you probably shouldn’t touch if you’re not intimately familiar with them.

With that said, it really helps to have a consistent strategy and system for tackling bugs. Writing software is a craft. Just because the software you’re writing is an attempt to fix something broken rather than iterating on the next awesome greenfield feature doesn’t mean you shouldn’t apply the same level of planning and clear thinking to the task. In this post, I’m going to outline my own approach to bug fixes, the one that I teach to everyone who comes across my desk looking for advice (even if they didn’t specifically ask about bug fixing; I can be incredibly pedantic sometimes…).

Always Reproduce The Bug

You can’t fix what you can’t see, and guessing at the source of a problem is a fantastic way to spend hours messing around with code that has nothing to do with the bug you’re trying to fix. Even if you’re intimately familiar with a code-base, even if you’re dead-certain that you know where the problem is and how to fix it, reproduce the problem before you write a single line of code. This is such an important rule that you should always insist that whoever brings you bug reports include reproduction steps in their bug description.
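One way to make reproduction steps concrete is to capture them as a tiny script that anyone can run. The following is only a sketch: `parse_price` and the input `"$1,250.00"` are hypothetical, invented here to stand in for whatever code and steps your real bug report describes.

```python
def parse_price(text):
    """Hypothetical buggy function: converts a price string to cents."""
    # Bug (for illustration): thousands separators are not handled,
    # so float() raises ValueError on input like "1,250.00".
    return round(float(text.lstrip("$")) * 100)


def reproduce():
    """Run the reproduction steps from the (hypothetical) bug report.

    Returns the exception that was raised, or None if the bug
    did not reproduce.
    """
    try:
        parse_price("$1,250.00")
    except ValueError as exc:
        return exc
    return None


if __name__ == "__main__":
    error = reproduce()
    if error is None:
        print("could not reproduce")
    else:
        print("reproduced:", error)
```

If the script reports that it could not reproduce the problem, that is useful information too: the report may be missing a step, or the bug may depend on something in the reporter's environment.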

Logs and Stack Traces and Debugging Tools Are Your Friend

Unless you believe your tester is lying to you just to make your day interesting, you can probably trust that what they say is happening is happening. The point of reproducing the bug yourself isn’t purely to confirm that the bug exists; it’s also to gather information about it. Once you see a bug in action, you can probably start making some guesses about where the problem might lie, but you shouldn’t be digging into the code just yet.

Regardless of the type of programming you’re doing, you will have tools at your disposal for ‘diagnosing the system’ when things go wrong. For web application developers there are typically activity logs that monitor the behavior of the application in real time, including displaying stack traces when things go really wrong. For compiled languages (and even some dynamic ones) there is a plethora of debuggers that will allow you to step through your code line-by-line and examine the state of your application. Both of these tools let you pop the hood open while the engine is running, so to speak, and you should make liberal use of them to uncover the ‘truth’ behind the bug you’re attempting to fix. If you’re not comfortable reading the stack traces and logs for the language and platform of your choice, get comfortable.
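As a rough illustration of what this looks like in practice, here is a minimal Python sketch that logs the input to a failing call and records the full stack trace when it blows up. The `average` function and the logger name `bug_hunt` are assumptions made up for this example, not part of any real system.

```python
import logging
import sys


def average(values):
    """Hypothetical function with a bug: crashes on an empty list."""
    return sum(values) / len(values)


def run_with_logging(values, stream):
    """Call average() and log the stack trace to `stream` if it fails."""
    logger = logging.getLogger("bug_hunt")
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        # Record the input first so the log shows what led to the failure.
        logger.debug("average() called with %r", values)
        return average(values)
    except ZeroDivisionError:
        # logger.exception logs the message plus the current traceback.
        logger.exception("average() failed")
        return None
    finally:
        logger.removeHandler(handler)


if __name__ == "__main__":
    run_with_logging([], sys.stderr)
```

Reading the resulting traceback from the bottom up tells you which line failed and how the program got there. If you would rather step through the code interactively, Python's built-in `breakpoint()` call drops you into the `pdb` debugger at the line where you place it.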

The Smallest Change That Works

To understand this principle more thoroughly, check this out. If you don’t feel like reading a longish (but very informative!) article, the main takeaway is this:

The best design is the one that allows for the most change in the environment with the least change in the software.

The reality is that this principle holds for all software development, not just fixing bugs.  However, I’d argue that it’s especially important for bug fixing, because the worst possible thing you could do would be to introduce more bugs in the process of trying to fix one that already exists.

Of course, bear in mind that “the smallest change that works” doesn’t mean throw in a sloppy hack that works to alleviate your problem, but fails to alleviate any underlying causes and/or adds fragility to the system.  Remember, software development is a craft, and just like slapping a bit of unpainted plaster on a wall with a hole in it is ugly and unacceptable, quick and dirty fixes in software are unacceptable too.

Don’t Just Throw It Against The Wall

One common ‘debugging’ technique I’ve seen is to throw in a random change, fire up the program, and see if it worked.  Do not do this.  For one thing, doing this is just sloppy.  You are a craftsman, and your goal in fixing a bug is to craft a solution to a problem. If you’re so unsure of your proposed change that you need to iteratively ‘see it in action’ as you build it, you’re not thinking it through carefully enough, and not thinking it through is a sure way of producing something ugly and hacky.  The other problem with this technique is that by the time you finally do get something against the wall that sticks, you’ll have expended so much energy that you’re likely to leave whatever it is up there, regardless of how ugly or fragile or uncraftsman-like it is.

Think your fixes through. Make an effort to make them simple and elegant. You’re not just fixing a bug, you’re crafting the system. Evaluate what you’re doing and be confident that you’re taking a reasonable approach to the problem that maintains the integrity of the whole before you run your solution through its paces.

Test Your Change

You’ve reproduced the bug, you’ve narrowed down its cause through stack traces, logs, debuggers or whatever other investigatory tool you have, and you’ve applied an elegant fix that you believe will handle the problem without introducing further issues into your software. Bravo. Now, it’s time to make sure that what you’ve attempted actually works.

The simplest way to test your change is to run through the repro steps again and see if the bug re-asserts itself. If it does, it’s back to the drawing board. If it doesn’t, undo your fix and make sure it fails again. If it doesn’t _fail_ even after you’ve removed the code that you thought was responsible for _fixing_ it, then it was fixed unintentionally, and that’s _bad_. To understand why it’s bad, I recommend reading about Programming By Coincidence. If you aren’t 100% perfectly clear as to why your bug is suddenly “fixed”, it’s back to the drawing board until you _do_ understand.
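The "with the fix, without the fix" check can be sketched in a few lines. Everything here is hypothetical: `parse_price_buggy` and `parse_price_fixed` stand in for your code before and after the change, and `bug_reproduces` stands in for your repro steps.

```python
def parse_price_buggy(text):
    """Hypothetical code before the fix: chokes on thousands separators."""
    return round(float(text.lstrip("$")) * 100)


def parse_price_fixed(text):
    """Hypothetical code after the fix: strips separators before parsing."""
    return round(float(text.lstrip("$").replace(",", "")) * 100)


def bug_reproduces(parse):
    """Run the repro steps against one version of the code.

    Returns True if the bug shows up, False if the behavior is correct.
    """
    try:
        return parse("$1,250.00") != 125000
    except ValueError:
        return True


if __name__ == "__main__":
    # With the fix applied, the bug should be gone...
    print("fixed version still broken?", bug_reproduces(parse_price_fixed))
    # ...and with the fix removed, it should come back.
    print("original version broken?", bug_reproduces(parse_price_buggy))
```

If both checks report that the bug is gone, your change is not what fixed it, and you do not yet understand the problem.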

Test Your Code

I don’t want to make this a post about testing code. That’s a subject for another post and one that I could spend a lot of time on, so I’ll keep this section short. Well-written automated tests for your code prevent bugs. It’s that simple. The best bug is the one you don’t have to fix because it never existed, or because the problem was caught before the code was ever rolled out to production to percolate in the code base and wreak havoc.

However, sometimes the only way to learn that the stove is hot is to burn yourself. Just make sure you learn the lesson! When you catch a bug, write an automated test for the fix! Make sure that the bug you just found can never rear its ugly head again without getting seen before it becomes a real problem.
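A regression test for a fix can be very small. Here is a sketch using Python's standard `unittest` module; the `parse_price` function and the failing input are hypothetical stand-ins for your own fixed code and the input from your own bug report.

```python
import unittest


def parse_price(text):
    """Hypothetical fixed function: converts a price string to cents."""
    return round(float(text.lstrip("$").replace(",", "")) * 100)


class ParsePriceRegressionTest(unittest.TestCase):
    def test_thousands_separator(self):
        # The exact input from the (hypothetical) bug report, which used
        # to raise ValueError before the fix.
        self.assertEqual(parse_price("$1,250.00"), 125000)

    def test_plain_price_still_works(self):
        # Guard against the fix breaking the case that already worked.
        self.assertEqual(parse_price("$12.50"), 1250)


if __name__ == "__main__":
    unittest.main()
```

Using the input from the bug report in the test means that if the same mistake ever creeps back in, the test suite fails long before a tester or a customer sees it.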

Wrapping Up — TL;DR

Fixing bugs is something every software developer will have to do at some point (probably many points, repeatedly, for their entire career). Being good at fixing bugs is really no different than being good at developing software. It’s all the same thing, and regardless of whether you’re fixing bugs or building new features, you are crafting a program. Just like there is a process to writing good code, there is a process to fixing bugs:

  1. Always reproduce the bug.
  2. Gather as much information as possible before attempting a fix.
  3. Fix the bug with the smallest possible change.
  4. Ensure that your fix actually fixed the bug.
  5. Add unit tests to prevent regression.
  6. Add unit tests to all code to prevent bugs in the first place.

As always, comments, thoughts, disagreements and additions are welcome!