LLM-Assisted Debugging: Workflows That Actually Work
GitHub's 2024 internal research on Copilot Chat found developers accept LLM-generated fixes in roughly 31% of debugging sessions, but in only 11% of sessions did the accepted fix actually close the underlying bug. In the other 20%, the fix patched a symptom, introduced a regression, or confidently pointed at the wrong subsystem. A 2024 ACM study by Shi et al. covering 2,500 LLM-assisted debugging sessions reported a similar pattern: LLMs speed up work on shallow bugs, but deep bugs often get worse when the developer outsources hypothesis generation.
The takeaway is not "don't use LLMs to debug." It's: use them where they're measurably better, skip them where they systematically lie, and build a workflow around the difference. This post walks through five workflows that actually save time, drawn from instrumenting our own team and five PanDev Metrics customer teams.
