Debugging for Mid Coders

The author shares the challenges they encountered while learning to debug code, and how they improved by studying rationality techniques and picking up a set of practical skills. The piece emphasizes patience, systematically forming and falsifying hypotheses, and extending working memory. It also digs into handling "downstream symptom" errors, recognizing abstraction boundaries, using binary search to locate problems, and inspecting the data rather than just the code. The author is candid about remaining confusions around inconsistently-reproducing bugs, reading documentation, and understanding the whole stack, and hopes for better ways of conveying tacit knowledge.

🎯 **Patience and rational thinking are the foundation of debugging**: Debugging demands real patience, and sits at a difficulty between "toy puzzle" and "open-ended real-world problem." The author stresses that rationality training, such as systematically forming and testing hypotheses and extending working memory (taking notes, using a bigger monitor), can markedly improve debugging ability and avoid aimless flailing.

⚠️ **Distinguish "red herrings" from "downstream symptoms"**: Many error messages (such as "X does not exist") are not the root cause but downstream symptoms. When hitting such errors, look first for the place where X failed to be created or retrieved, rather than the spot where the error is raised, especially when the error surfaces inside a dependency.

🧱 **Recognize abstraction boundaries and scope the problem**: Understanding the code's abstraction boundaries is crucial. When debugging stalls, judge whether you have wandered beyond the modules directly relevant to the problem (such as the code for one specific feature), to avoid wasting effort on unrelated underlying infrastructure and to narrow the search more effectively.

🔍 **Master binary search and version bisection**: Binary search is an effective strategy for locating elusive bugs, whether by stepping forwards or backwards through a code path, or by using Git's `bisect` command to find the commit that introduced the problem. The author also notes the added complexity of binary searching through undo history.

📊 **Look at the data itself and check assumptions**: Debugging shouldn't be confined to the code; sometimes the problem is in the data the code operates on. The author illustrates this from personal experience with the importance of inspecting input data (such as a corrupted Markdown file). When confused, step back and list every possibly relevant assumption, checking for anything not yet fully understood or examined.

Published on August 16, 2025 10:32 PM GMT

I struggled with learning to debug code for a long time. Exercises for learning debugging tended to focus on small, toy examples that didn't grapple with the complexity of real codebases. I would read advice on the internet like:

I'd often be starting from a situation where the bug only happened sometimes and I had no idea when, and the error log was about some unrelated library I was using that had nothing to do with what the bug would turn out to be. Getting to a point where it was even clear what exactly the symptoms were was a giant opaque question mark for me.

I didn't improve much at debugging until I got generally serious about rationality training. Debugging is a nice in-between difficulty between "toy puzzle" and "solve a complex, open-ended real world problem." (Code debugging is "real world problem-solving", but it's a part of the world where you reliably know there's a solution, working in an environment you have near-complete control over and visibility into.)

I attribute a lot of my improvement to fairly general problem-solving tools like:

I've written other essays on general rationality/problem-solving. But, here I wanted to write the essay I wish past me had gotten, about some tacit knowledge of debugging. (Partly because it seemed useful, and partly because I'm interested in the general art of "articulating tacit knowledge").

Note that I'm still a kinda mid-level programmer and senior people may have better advice than me, or think some of this is wrong. This is partly me just writing to help myself understand, and see how well I can currently explain things. 

Be willing to patiently walk up the stack trace (but, watch out for red-herrings and gotchas)

One core skill of debugging is the ability to patiently, thoroughly start from your first clue, and work your way through the codebase, developing a model of what exactly is happening. (Instead of reaching the edge of the current file and giving up)

Unfortunately, another core skill of debugging is knowing when you're about to follow the code in a pointless direction that won't really help you.

Gotcha #1: "X does not exist", "can't read X", "X is undefined", etc., are often "downstream symptoms", rather than the bug itself.

If you get an error like "X doesn't exist", there's a codepath that expects X to exist, but it doesn't. Whatever code caused it to not-exist isn't particularly likely to be located anywhere near the code that's trying to read X.

So, unless you look at the part of the codebase flagging "can't find X" and can clearly see that X was supposed to be created in the same file, or you have a good reason to think it was created nearby, what you should probably be looking for instead are the places that are supposed to create or retrieve X.

This goes double if the error is in some random library somewhere deep in your dependencies.
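To make this concrete, here's a minimal, hypothetical TypeScript sketch (every name in it is invented for illustration). The error fires in `render`, but the actual bug lives in `buildIndex`, which silently skipped creating the entry much earlier:

```typescript
// Hypothetical sketch of a "downstream symptom": all names are invented.
type Index = Map<string, string>;

function buildIndex(entries: { id: string; body?: string }[]): Index {
  const index: Index = new Map();
  for (const entry of entries) {
    // The real bug: entries with a missing body get silently skipped,
    // so "X" never gets created here...
    if (entry.body) {
      index.set(entry.id, entry.body);
    }
  }
  return index;
}

function render(index: Index, id: string): string {
  const body = index.get(id);
  if (body === undefined) {
    // ...but the error you actually see points here, far from the real bug.
    throw new Error(`can't find ${id}`);
  }
  return `<p>${body}</p>`;
}

const index = buildIndex([{ id: "X" }]); // body missing, so "X" gets dropped
render(index, "X");                      // throws: "can't find X"
```

The stack trace blames `render`, but the fix belongs in `buildIndex`, which in a real codebase might be a different file, module, or even service.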

(I originally called this pattern a "red-herring." A senior-dev colleague told me they draw a sharp distinction between "red-herrings" and "downstream symptoms.")

Gotcha #2: Notice abstraction boundaries, and irrelevant abstractions.

The motivating incident that prompted this post was when I was pairing with a junior dev on debugging the JargonBot Beta Test. We had recently built an API endpoint[1] for retrieving the jargon definitions for a post. It had been working. Suddenly it was not.

We started with where an error was getting thrown, worked our way backwards in the code path... and then followed the trail to another file... and then more backwards... 

...and then at some point both I and the junior dev said "it... probably isn't useful to look further backwards." (The junior dev said this hesitantly, and I agreed.)

We were right. How did we know that?

The answer (I think) is that if we stepped backwards any further, we were leaving the part of the codebase that was particularly about Jargon. Beyond that lay the underlying infrastructure for the LessWrong codebase, i.e. the code responsible for making API endpoints work at all. Last we checked, the underlying infrastructure of LessWrong worked fine, and we hadn't done anything to mess with that. 

So, even though we were still thoroughly confused about what the problem was, it seemed at least like it should be contained to within these few files.

There was a chunk of the codebase that you'd naturally describe as "about the JargonBot stuff." And there was a point you might naturally describe as its edge, before passing into another part of the codebase.

Now, that was sort of playing on easy mode – we knew we had just started building JargonBot, and it would be pretty crazy for the bug to not live somewhere in the files we had just written. But, a few weeks later we were debugging some other part of the codebase that other people had written, a while ago. (I believe this was integrating the jargon into the post sidebar, where side-comments, side-notes, and reacts live.)

It turned out the sidebar was more complicated than I'd have guessed. The jargon wasn't rendering correctly. We had to pass backwards through a few different abstraction layers – first checking the new code where we were trying to fetch and render the jargon. Then back into the code for the sidebar itself. Then backwards into where the sidebar was integrated into the post page. And then I was confused about how the sidebar was even integrated into the post page; it wasn't nearly as straightforward as I thought.

Then I sort of roughly got how the sidebar/post interaction worked, but was still confused about my bug. I could recurse further up the code path...

...but it would be pretty unlikely for whatever was going wrong to be happening outside of the post page itself.

This all amounts to: 

Binary Search

There is some intricate art to looking at a confusing output and generating hypotheses that might possibly make sense. A good developer has object-level knowledge, both about their own codebase and about codebases in general, that can help them narrow in quickly on where the problem is located.

I don't know how to articulate the details of that skill. BUT, the good news is there is a dumb, stupid algorithm you can follow to help you narrow it down even if you're tired, frustrated, and braindead:

1. Follow the code backwards towards the earliest codepath that could possibly be relevant, and the earliest point where it is working as expected.

2. Find the spot about halfway between the earliest place where things work as expected, and the final place where you observe the bug. Log all relevant variables (or add a breakpoint and use a debugger).

3. If things look normal there, then find a new spot about halfway between that midpoint, and the final-spot-where-the-bug-was-observed. If it doesn't look normal, find a new spot about halfway between the earliest working spot, and the midpoint.

4. Repeat steps 2-3 until you've found the moment where things break.

Now you have a much smaller surface area you need to look at and understand. And instead of stepping through 100 lines of code, you only had to check ~7 midpoints (log₂ of 100 ≈ 6.6).
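If it helps to see the loop written down, here's a minimal, hypothetical sketch. In real debugging you run this by hand: `looksCorrect(i)` stands in for you adding a log statement or breakpoint at checkpoint `i` and eyeballing the output.

```typescript
// Minimal sketch of the binary-search loop, assuming checkpoint 0 is
// known-good and the last checkpoint is known-bad.
function findFirstBrokenCheckpoint(
  numCheckpoints: number,
  looksCorrect: (i: number) => boolean,
): number {
  let good = 0;                  // earliest point known to work
  let bad = numCheckpoints - 1;  // latest point known to be broken
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2);
    if (looksCorrect(mid)) {
      good = mid;                // still fine here: the bug is later
    } else {
      bad = mid;                 // already broken: the bug is at or before mid
    }
  }
  return bad;                    // the first checkpoint where things break
}
```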

Binary searching through ~~time~~ spacetime

A variation on this: if you know the code used to work and now it doesn't, you can binary search through your history. If it worked a long time ago, on an older commit, you can binary search through your commit history. Start with the oldest commit where you know it worked, check a commit about halfway between that and the latest commit, and repeat. Git has a tool to streamline this process called git bisect.
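The workflow looks roughly like this (standard git bisect usage; the commit hash is a placeholder):

```bash
git bisect start
git bisect bad                # the commit you're on now is broken
git bisect good abc1234       # a commit you know was working (placeholder hash)
# git checks out a commit roughly halfway between the two;
# test it, then tell git what you found:
git bisect good               # or: git bisect bad
# ...repeat until git announces the first bad commit, then:
git bisect reset              # return to where you started
```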

If it worked earlier today (and you didn't make any commits in the meantime), you may need to do something more like "binary search through your undo history." This is complicated by:

Sometimes this neatly divides into "binary search through undo-space" followed by "binary search within one history-state of the codebase". But sometimes you need to kind of maintain a multidimensional mental map that includes both changes-to-the-codebase and places-within-the-codebase, and figure out what it even means to binary search that in a somewhat ad-hoc way I'm not sure how to articulate.

Notice confusion, and check assumptions.

Also, look at your data, not just your code.

Just yesterday, I was trying to build a text editor that took in markdown files and rendered them as html that I could edit in a WYSIWYG fashion. At some point, even though it was supposedly translating the markdown into html, it was still showing up with markdown formatting.

This happened after I had an LLM make some completely unrelated changes that really shouldn't have had anything to do with that.

I was pulling my hair out trying to figure out what was going wrong, stepping back and forth through undo history, looking at anything that changed.

Eventually, I stopped looking at the codepath, and looked at the file I was trying to load into the editor. 

At some point, I'd accidentally overwritten the file with corrupt markdown that was sort of... markdown nested inside html nested inside markdown. Or something.

Oh.

If I'd been a better rationalist, earlier in the hair-pulling-out process, I could have noticed "this is confusing and doesn't make any goddamn sense." A rationalist should be more confused by fiction than reality; if I'm confused, one of my assumptions is fiction. I could have made a list of everything that could possibly be relevant and checked whether there was anything I hadn't looked at yet, rather than looping back and forth in the undo history, vaguely flailing and hoping to notice something new.

But, it's kind of cognitively expensive to be a rationalist all the time. 

A simpler thing I could have done is be a better debugger, with some object-level debugging knowledge, and note that sometimes it's not the code that's wrong, it's the data the code is trying to operate on. "Check the data" probably should have been part of my initial pass at mapping out the problem.

(This ties back to Gotcha #1: if you're getting "X is not defined", where X was created a while ago and not by the obvious nearby parts of the codebase, sometimes try looking up X in your database and see if there's anything weird about it.)
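One cheap way to make "check the data" part of the initial pass is a sanity check at the boundary where the data enters the code. A minimal, hypothetical sketch (the `looksLikeCleanMarkdown` heuristic is invented for illustration, loosely matching my corrupted-file incident):

```typescript
import { readFileSync } from "node:fs";

// Crude smoke test: the corrupted file in my case was markdown tangled up
// with html. (Markdown is allowed to embed html, so this is a heuristic for
// "something looks off", not a real validator.)
function looksLikeCleanMarkdown(text: string): boolean {
  const htmlTags = (text.match(/<\/?[a-z][^>]*>/gi) ?? []).length;
  return htmlTags === 0;
}

function loadMarkdown(path: string): string {
  const text = readFileSync(path, "utf8");
  if (!looksLikeCleanMarkdown(text)) {
    // Fail loudly at the load boundary, instead of debugging the renderer later.
    throw new Error(`${path} doesn't look like clean markdown`);
  }
  return text;
}
```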

Things I don't know

So, that was a bunch of stuff I've painstakingly figured out. Some of it I got in pieces by pairing with other, more senior developers. The senior developers I've worked with are often thinking so quickly/intuitively that it's fairly hard for them to slow down and explain what's going on.

I'm hoping to live in a world where people around me get better at tacit knowledge explication. 

Here's some random stuff I still don't have a good handle on, that I'd like it if somebody explained:

How do you learn to replicate bugs when they happen inconsistently, with no discernible pattern? Especially when the bug comes up, like, once every couple of days or weeks, instead of once every 5 minutes.

How does one "read the docs"? Sometimes I ask how a senior dev figured something out, and they say "I read the documentation and it explained it." And I'm like "okay, duh. But... there's so much fucking documentation. I can't possibly be expected to read it all?"

Do people just read really fast? I think they have some heuristics for figuring out what parts to read and how to skim, which maybe involves something like binary search and tracking-abstraction-borders. But something about this still feels opaque to me.

Something something "how to build up a model of the entire stack." Sometimes, the source of a problem doesn't live in the codebase; it lives in the dev-ops of how the codebase is deployed. It could have to do with the database setup, or the deployment server, or our caching layer. I recently got a little better at sniffing these out, but this still feels like a muddy mire to me.

  1. ^

    This isn't exactly the right description but is accurate enough to be an example.



