Part 8 · Senior & Interview Prep · Intermediate

The Senior Debug Playbook

Hub — failure triage, waveform strategy, testbench self-debug, X/race hunting, and three complete debug case studies.

Overview

Junior engineers debug by staring and re-running. Senior engineers debug with a system: classify the failure, form a hypothesis, design the cheapest experiment that could falsify it, and never trust a conclusion that lacks evidence. The difference shows most under pressure: at 2 AM before tape-out, the engineer with a playbook converges in an hour, while the one without burns the night re-running the same seed with more print statements.

This topic builds the playbook in five layers: a triage method that decides where to even look, a waveform strategy that traces backward instead of scrolling forward, the discipline of suspecting your own testbench before the RTL, specialized hunts for X-propagation and races, and three complete worked case studies with timelines so you can see the method applied end to end.

Sub-topics

  1. Failure Triage Method — first-error discipline, error-signature classification, and the TB-vs-RTL-vs-spec verdict.

  2. Waveform Debug Strategy — backward-from-symptom tracing, first-divergence marking, and when waves mislead you.

  3. Debugging the Testbench Itself — all-pass-too-easily symptoms, null handles, mailbox deadlocks, instrumentation layers.

  4. Hunting X's and Races — back-propagation to the first X, reset audits, and deterministic-rerun techniques.

  5. Three Debug Case Studies — complete worked stories with timelines, each ending in a generalized lesson.
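As a taste of sub-topic 2, first-divergence marking can be sketched in a few lines of Python. This is a minimal illustration, not a tool from this course: it assumes each trace has already been exported as a list of (time, value) samples taken at the same clock edges, and the names `first_divergence`, `golden`, and `dut` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[int, int]  # (simulation time, sampled value); an assumed export format


def first_divergence(expected: Sequence[Sample],
                     actual: Sequence[Sample]) -> Optional[int]:
    """Return the earliest time at which two traces disagree, or None if they match."""
    for (t_exp, v_exp), (t_act, v_act) in zip(expected, actual):
        if t_exp != t_act or v_exp != v_act:
            # Report the earlier of the two timestamps so the marker never
            # lands after the real point of divergence.
            return min(t_exp, t_act)
    if len(expected) != len(actual):
        # One trace ended early: the divergence is the first unmatched sample.
        shorter = min(len(expected), len(actual))
        longer = expected if len(expected) > len(actual) else actual
        return longer[shorter][0]
    return None


# Hypothetical data: a reference model trace and a DUT trace of the same signal.
golden = [(0, 0), (10, 1), (20, 3), (30, 7)]
dut = [(0, 0), (10, 1), (20, 2), (30, 7)]
print(first_divergence(golden, dut))  # prints 20
```

The point of the marker is that everything before time 20 is known-good, so backward tracing from the symptom can stop there instead of scrolling from time zero.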

Legend: [TRIAGE] [WAVE] [TB] [X/RACE]

  THE DEBUG LOOP — every failure goes through this cycle

  failure report
       │
       ▼
  ┌─────────────────────┐
  │ 1. TRIAGE [TRIAGE]   │  first error only · classify signature
  │    where to look?    │  TB bug? RTL bug? spec ambiguity?
  └─────────┬───────────┘
            ▼
  ┌─────────────────────┐
  │ 2. REPRODUCE         │  same seed · same build · minimal test
  │    shrink the case   │  shorter sim · fewer agents · one txn
  └─────────┬───────────┘
            ▼
  ┌─────────────────────┐
  │ 3. HYPOTHESIZE       │  one suspect at a time
  │    cheapest test?    │  log? assertion? wave? code read?
  └─────────┬───────────┘
            ▼
  ┌─────────────────────┐
  │ 4. EXPERIMENT        │  waves [WAVE] · TB instrumentation [TB]
  │    falsify it        │  X-trace [X/RACE] · directed re-run
  └─────────┬───────────┘
            ▼
     hypothesis survives? ──no──► back to 3 with new evidence
            │ yes
            ▼
     fix + add a regression test that would have caught it
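Step 2 of the loop ("shrink the case") can be sketched as a greedy reduction: drop one transaction at a time and keep the smaller case whenever the failure still reproduces. This is a sketch under stated assumptions rather than a prescribed flow. The `still_fails` callback is hypothetical; in a real environment it would re-run the simulation with the same seed and build and report whether the original failure signature appears. The failure predicate below is invented purely for illustration.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def shrink(case: List[T], still_fails: Callable[[List[T]], bool]) -> List[T]:
    """Greedily drop items from a failing case while the failure persists."""
    if not still_fails(case):
        raise ValueError("the starting case does not fail; nothing to shrink")
    current = list(case)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                # The removed item was not needed to reproduce the failure.
                current = candidate
                changed = True
                break
    return current


def hypothetical_failure(txns):
    """Invented stand-in for a sim re-run: fails if 0x10 is read after being written."""
    seen_write = False
    for kind, addr in txns:
        if kind == "WR" and addr == 0x10:
            seen_write = True
        elif kind == "RD" and addr == 0x10 and seen_write:
            return True
    return False


txns = [("WR", 0x00), ("WR", 0x10), ("RD", 0x04),
        ("WR", 0x08), ("RD", 0x10), ("RD", 0x00)]
minimal = shrink(txns, hypothetical_failure)
print(minimal)  # prints [('WR', 16), ('RD', 16)]
```

Each probe costs one re-run, so this greedy form only pays off when the reduced simulation is short. The result is minimal in the sense that removing any single remaining item makes the failure disappear.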

Key takeaways

  • Debug is a loop — triage, reproduce/minimize, hypothesize, falsify — not a vibe.

  • The first error in the log is the one to debug; errors after it are usually downstream noise.

  • Suspect the testbench before the RTL; TB bugs often outnumber RTL bugs, especially once the RTL is mature.

  • Every fixed bug should leave behind a test or assertion that would have caught it.
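The first-error takeaway is easy to automate. The sketch below scans a simulation log, stops at the first `UVM_ERROR` or `UVM_FATAL` line, and attaches a coarse signature. The severity tags are standard UVM report prefixes, but everything else is an assumption for illustration: the signature table, the function name `first_error`, and the sample log are hypothetical, and real log layouts vary by simulator and project.

```python
import re
from typing import Optional, Tuple

# Standard UVM severity tags for failures.
ERROR_RE = re.compile(r"\b(UVM_ERROR|UVM_FATAL)\b")
# The end-of-test report summary also contains these tags (e.g. "UVM_ERROR :    2");
# those count lines are skipped so that a clean run is not reported as an error.
SUMMARY_RE = re.compile(r"^\s*UVM_(ERROR|FATAL)\s*:\s*\d+\s*$")

# Hypothetical signature table; a real project would tune these patterns.
SIGNATURES = [
    ("timeout", re.compile(r"timeout|timed out", re.IGNORECASE)),
    ("data mismatch", re.compile(r"mismatch", re.IGNORECASE)),
    ("null handle", re.compile(r"null (object|handle|pointer)", re.IGNORECASE)),
]


def first_error(log_text: str) -> Optional[Tuple[int, str, str]]:
    """Return (line number, signature, text) of the first error line, or None."""
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if SUMMARY_RE.match(line) or not ERROR_RE.search(line):
            continue
        for name, pattern in SIGNATURES:
            if pattern.search(line):
                return lineno, name, line.strip()
        return lineno, "unclassified", line.strip()
    return None


# Invented sample log: two mismatches followed by a timeout they likely caused.
sample_log = """\
UVM_INFO tb.sv(10) @ 0: reporter [RNTST] Running test smoke_test
UVM_ERROR sb.sv(88) @ 1500: uvm_test_top.env.sb [SB] data mismatch: expected 0xA5 got 0x00
UVM_ERROR sb.sv(88) @ 1600: uvm_test_top.env.sb [SB] data mismatch: expected 0x3C got 0x00
UVM_FATAL drv.sv(42) @ 9000: uvm_test_top.env.drv [DRV] timeout waiting for ready
UVM_ERROR :    2
"""
print(first_error(sample_log)[:2])  # prints (2, 'data mismatch')
```

In this invented log the fatal timeout is the loudest message, but the triage target is the mismatch at line 2; the later errors are treated as consequences until proven otherwise.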