Why does every debug start with a reproduction?

A reproduction gives you a binary signal that tells you whether each change moved you closer or further away. Without one, you are guessing. The first ten minutes nailing down the steps to make the bug appear pays for itself many times over.

Should I describe the error to Claude Code or paste it?

Paste it. Descriptions lose information. The raw error has the noun, the verb, the file, and the line, and that is usually enough to point at the fix without further investigation.

What does narrowing the surface mean in practice?

Cut the input down to the smallest case that still triggers the bug, cut the code path down to the smallest sequence that still fails, and delete imports, middleware, and conditions that are not involved. The smaller the surface, the cheaper each theory is to test.

When is git bisect the right move?

When the bug appeared between two known states. Tell git the last good commit and the first bad commit, run your reproduction at each step, and in about ten minutes you have the exact commit that introduced the regression.

Why write a failing test before the fix?

The test forces you to be precise about what wrong means, and it stays in the suite to prevent the same bug from coming back when the next refactor touches the same code.

All guides

Debugging with Claude Code: a steady method for hard bugs

Workflow 10 min read

TL;DR

Debugging with Claude Code is the same craft you already know with one more pair of eyes. Reproduce the bug, narrow the surface, read the real error, and write a test that fails before you write the fix. This guide walks through the method, the common traps, and the moves that turn a bug that has eaten your afternoon into one you ship a fix for before dinner.

Start with a reproduction

Every good debug starts with a reproduction. A way to make the bug happen on demand. Without that, you are guessing. With it, you have a binary signal that tells you whether each change you make moved you closer or further away. Spend the first ten minutes nailing down the minimum steps to make the bug appear. A failing test is the strongest version of this. A curl command that returns the wrong response is the next strongest. A click path in the UI is the weakest but still useful.

Once you have a reproduction, share it with Claude Code at the top of the session. Not just the symptom, the steps. Then ask it to help reproduce locally before either of you touches the code. Half of all debug time gets spent because someone tried to fix a bug they could not actually trigger, and the fix turned out to be for a different bug entirely.

Read the real error message

Read the error before you theorize. The full error, the stack trace, the line number, the surrounding lines if they are relevant. Paste it all into Claude Code rather than describing it in your own words. Descriptions lose information. The raw error has the noun, the verb, the file, and the line. Most of the time, that is enough to point at the fix without further investigation.

Copy the entire stack trace, not just the top line, since the cause is often three frames down
Include the request or input that triggered the error so the context is complete
Note the version of the dependency that threw the error, since the same error can have different causes across versions
Capture any preceding warnings, since they often point at the underlying state issue
If there are multiple errors, share them all and let Claude Code spot which one is the root and which are the cascade

Narrow the surface

Most bugs feel huge and turn out to live in a small surface area. Narrow before you dig. Cut the input down to the smallest case that still triggers the bug. Cut the code path down to the smallest sequence that still fails. Delete the imports that are not involved, the middleware that is not in the path, the conditions that always evaluate the same way. The smaller the surface, the cheaper each new theory is to test.

Bisect when nothing else fits

If the bug appeared between two known states, bisect. git bisect is the underused move that finds the offending commit in log n steps. Tell git the last commit you know was good, the first commit you know was bad, and let it walk you through. At each step, run your reproduction and tell git whether the bug is present. In ten minutes you have the exact commit that introduced the regression. From there, Claude Code can read the diff and propose the fix in one round.

Write the failing test first

When you have the reproduction tight enough to express in code, write the test first. The test should fail because of the bug. Run it, see it fail, then write the fix. The fix passes when the test passes. This pattern saves you twice. Once because the test forces you to be precise about what wrong means. And once again later, because the test stays in the suite and prevents the same bug from coming back when the next refactor touches the same code.

1Write a test that fails because of the bug, in the smallest possible scope
2Run the test and confirm it fails for the right reason, not for a setup mistake
3Ask Claude Code to propose a fix that makes the test pass without breaking neighbors
4Run the full test suite to confirm the fix does not break anything else
5Commit the test and the fix together so the history shows the bug and its proof of fix

Logs are stories, not noise

Treat logs as a story your system tells itself. When you tail logs during a reproduction, you are reading what the system thought was happening. The gap between what the system thought and what you wanted is where the bug lives. Add a temporary log line if the existing logs do not tell you enough. Remove it after. Claude Code is good at suggesting the right log lines because it has the full code in context and can spot which states have not been observed.

Structure matters for logs too. A log line that includes the request id, the user id where relevant, and a short event name reads ten times better than a raw printf. When you debug across a few hours, structured logs let you grep for the request that failed and see only its story instead of every concurrent story. The investment pays off the second time you debug something in the same system.

When the bug is in a dependency

Sometimes the bug is not in your code. It is in a library you use. The signs are familiar. The stack trace ends in node_modules, the behavior changed after an unrelated dependency bump, the same code fails on one machine and works on another. When this happens, pin the dependency version that works, open an issue upstream if it is not already filed, and add a comment in your code that records why the pin exists. Pinning is not surrender. It is preserving a known good state while the upstream fix lands.

The bug that is really three bugs

Some bugs are actually a stack of small issues that look like one big one. The race condition that only fires when the cache misses while the user is logged out on Safari. Untangle these one at a time. Fix the cache miss case, confirm the bug still partially repros, fix the logged out case, confirm again, then chase the Safari specific piece. Trying to solve all three at once produces a fix that works for none of them. Solving them one at a time produces three small commits and a clean PR.

Reading other people's code under pressure

Debugging often means reading code you did not write. A library function, a teammate's PR from last quarter, a vendor SDK. Claude Code is excellent at the read this and summarize what it actually does task. Drop the file in, ask for a plain language summary, then dig in. Two minutes of summary saves twenty minutes of guesswork. The summary also surfaces the assumptions baked into the code, which is often where the bug lives. Reading other code is a skill that compounds, and using Claude Code as a reading partner accelerates the compounding.

Postmortems for the bugs that mattered

Not every bug deserves a postmortem. The ones that reached production and affected users do. Write a short note that covers what happened, when it started, how you found it, what the fix was, and what change would have prevented it. Keep it under a page. The discipline of writing the note is the value. Half the time, the prevention idea you land on is a small lint rule or a single test you should have written months ago, and adding it is fifteen minutes of work that pays for itself the next time something similar would have shipped.

Asking Claude Code the right debugging questions

The quality of a debug session with Claude Code tracks the quality of the questions you ask. What changed since the last working state. Which assumption in this function is no longer true. Which input would make this code path execute. These questions point at causes. Generic questions like fix this bug point at symptoms and produce guesses. Train yourself to ask for the underlying state rather than for a patch. The patch follows once the state is clear, and the fix you get is better because the reasoning is exposed.

Heisenbugs and the timing class

Some bugs disappear when you look at them. Add a log line and the race condition does not happen anymore. Step through with a debugger and the order changes. These bugs are timing dependent and they punish slow methods. The way to catch them is to keep the production conditions and add tracing that does not change timing. Sampling, counters, structured logs to a sink rather than to stdout. Claude Code can help you instrument cleanly, but the discipline is yours. Do not chase a heisenbug with instrumentation that warps the timing you are trying to measure.

Concurrency bugs in particular reward writing a stress test rather than a single reproduction. Loop the trigger a thousand times, let the race fire under load, and you get a real signal instead of an occasional symptom. The stress test stays in your repo as a guard against regression. Future changes that reintroduce the race fail the stress test before they reach production.

Knowing when to walk away

Some bugs do not yield to a focused two hour session. The fix is to walk away. Close the laptop, do something else, come back fresh. The subconscious is good at problems that have been precisely defined and badly stuck. The half hour walk often produces the insight that an hour of staring did not. When you come back, restate the problem to Claude Code in one tight sentence, paste the smallest reproduction, and try again. The fresh framing alone usually shifts something loose.

Closing the loop

When the test passes and the fix is in, take five minutes to write what you learned. Not a long doc. A short note in the PR description or in a personal log. Future you, three months from now, will hit a similar bug and the note is the bridge. Members at claudecodeclub.ai trade these notes and the shared bug folklore is one of the unspoken benefits of the $9 a month, because most production bugs have been hit by someone else first. The /guides/git-workflow-with-claude-code guide pairs well with this method because it makes the test plus fix commit clean and easy to ship.

Common questions

Why does every debug start with a reproduction?
A reproduction gives you a binary signal that tells you whether each change moved you closer or further away. Without one, you are guessing. The first ten minutes nailing down the steps to make the bug appear pays for itself many times over.
Should I describe the error to Claude Code or paste it?
Paste it. Descriptions lose information. The raw error has the noun, the verb, the file, and the line, and that is usually enough to point at the fix without further investigation.
What does narrowing the surface mean in practice?
Cut the input down to the smallest case that still triggers the bug, cut the code path down to the smallest sequence that still fails, and delete imports, middleware, and conditions that are not involved. The smaller the surface, the cheaper each theory is to test.
When is git bisect the right move?
When the bug appeared between two known states. Tell git the last good commit and the first bad commit, run your reproduction at each step, and in about ten minutes you have the exact commit that introduced the regression.
Why write a failing test before the fix?
The test forces you to be precise about what wrong means, and it stays in the suite to prevent the same bug from coming back when the next refactor touches the same code.
How do I handle a bug that turns out to be in a dependency?
Pin the dependency version that works, open an issue upstream if it is not already filed, and add a comment in your code that records why the pin exists. Pinning preserves a known good state while the upstream fix lands.

More guides

Go from reading to shipping

Guides get you oriented. The club gets you shipping. Join Claude Code Club for $9/month.

Join the club See the curriculum

Related: the library, use cases, and the learn hub.