Part 7 · Advanced & Integration · Intermediate

A Speed-vs-Debug Strategy

Regression vs debug run profiles, plusarg/define switching, the two-pass failure workflow, and what senior engineers do differently.

Two profiles, one testbench

Everything in this topic converges into one operational idea: define exactly two run profiles — a fast regression profile and a full debug profile — and make every visibility feature switchable between them at runtime. One compiled testbench, two behaviors, zero ad-hoc tinkering per run.

diagram
THE TWO PROFILES

  Feature              REGRESSION (fast)         DEBUG (visible)
  ───────────────────  ────────────────────────  ──────────────────────
  Waves                OFF                       ON (scoped + windowed)
  Log verbosity        errors + summary          full per-txn detail
  Functional coverage  ON (it's why we run)      OFF (not needed to debug)
  Code coverage        merge runs only*          OFF
  Assertions           ON  ← non-negotiable      ON
  Txn debug logs/pools OFF                       ON
  Seed                 random, LOGGED            forced to failing seed

  * where legal per project sign-off rules — coverage-off runs
    must never be the runs used for closure metrics.

  Assertions stay ON in both profiles: a fast run that
  cannot detect failure is a waste of a fast run.

Switching: plusargs vs defines

  • Plusargs ($test$plusargs / $value$plusargs) — runtime switch, no recompile; right for waves, verbosity, debug pools.

  • Compile defines (`ifdef DEBUG_TB) — eliminate code entirely; right for heavyweight instrumentation that costs even when gated.

  • Prefer plusargs by default: one binary serves both profiles, and regression infrastructure stays simple.

  • Reserve defines for cost that survives a runtime guard (e.g. bind-in debug monitors, huge static debug arrays).

systemverilog
// One testbench, profile chosen at run time
module tb_top;
  bit dbg_waves;
  int unsigned verbosity = 100;

  initial begin
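    // $test$plusargs returns a nonzero integer on a match, so compare with 0
    // rather than truncating to 1 bit. Matching is by prefix ("+WAVES_X" hits).
    dbg_waves = ($test$plusargs("WAVES") != 0);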
    void'($value$plusargs("VERBOSITY=%d", verbosity));

    if (dbg_waves) begin
      $dumpfile("waves.vcd");
      $dumpvars(2, tb_top.dut);          // scoped even in debug
    end
  end

  // Heavy instrumentation that costs even when idle: compile it out
`ifdef DEBUG_TB
  txn_history_pool #(.DEPTH(100000)) hist();   // big static debug array
`endif
endmodule

// VCS-style seed options shown; other simulators use different switches.
// regression:  ./simv +ntb_random_seed_automatic         (seed logged)
// debug:       ./simv +ntb_random_seed=7731 +WAVES +VERBOSITY=300
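
The profile table lists debug waves as "scoped + windowed". A minimal sketch of the windowing half follows, assuming two hypothetical plusargs, +WAVE_START and +WAVE_STOP, expressed in the module's time unit. Neither name comes from the original testbench. It uses only the standard VCD tasks $dumpoff and $dumpon; vendor wave formats offer their own equivalents.

```systemverilog
// Sketch: windowed VCD dumping controlled by hypothetical plusargs.
module wave_window_demo;
  timeunit 1ns; timeprecision 1ps;

  // Hypothetical knobs (not in the original testbench), in ns here.
  longint unsigned wave_start = 0;   // when to start recording
  longint unsigned wave_stop  = 0;   // 0 means "record until the end"

  // Stand-in activity so there is something to dump.
  logic        clk   = 0;
  int unsigned count = 0;
  always #5 clk = ~clk;
  always @(posedge clk) count++;

  initial begin
    if ($test$plusargs("WAVES") != 0) begin
      void'($value$plusargs("WAVE_START=%d", wave_start));
      void'($value$plusargs("WAVE_STOP=%d",  wave_stop));

      $dumpfile("waves.vcd");
      $dumpvars(1, wave_window_demo);   // scope: this module only
      $dumpoff;                         // stay silent until the window opens

      #(wave_start) $dumpon;            // open the window
      if (wave_stop > wave_start) begin
        #(wave_stop - wave_start) $dumpoff;   // close it again
      end
    end
  end

  initial #2000 $finish;
endmodule
```

A debug rerun might then look like `./simv +WAVES +WAVE_START=900 +WAVE_STOP=1500`, with the start set somewhat before the failure time harvested from the regression log.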

The two-pass failure workflow

  1. Pass 1 — regression profile: thousands of tests, no waves, quiet logs, random seeds logged per run.

  2. Harvest each failure: test name, seed, plusargs, and approximate failure time from the error message.

  3. Pass 2 — debug profile: rerun only the failures with the exact seed, waves scoped to the suspect block, window opened before the failure time, verbosity raised.

  4. Debug with full visibility; fix; rerun the single test in regression profile to confirm.

  5. Return the test to the nightly pool — never leave it running in debug profile.
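
Step 2 is much easier when every failure prints one greppable line that already carries what Pass 2 needs. The sketch below is one possible shape for that record, not a standard format. The package name, the TB_FAIL / TB_SUMMARY tags, and the +TESTNAME and +SEED plusargs are all hypothetical, and the regression script is assumed to pass +SEED with the same value it gives the simulator's own seed option.

```systemverilog
// Sketch: one greppable line per failure, carrying what Pass 2 needs.
package fail_report_pkg;
  string       test_name = "unknown";  // hypothetical, set from +TESTNAME
  int unsigned seed      = 0;          // hypothetical, set from +SEED
  int unsigned err_count = 0;

  function automatic void report_fail(string msg);
    err_count++;
    $display("TB_FAIL test=%s seed=%0d time=%0t msg=\"%s\"",
             test_name, seed, $time, msg);
  endfunction
endpackage

module fail_report_demo;
  import fail_report_pkg::*;

  initial begin
    void'($value$plusargs("TESTNAME=%s", test_name));
    void'($value$plusargs("SEED=%d", seed));

    #120;
    report_fail("scoreboard mismatch on read data");  // example failure
  end

  // One summary line per run, printed even when nothing failed.
  final begin
    $display("TB_SUMMARY test=%s seed=%0d errors=%0d",
             test_name, seed, err_count);
  end
endmodule
```

A harvest script then only has to search the logs for TB_FAIL and read the test, seed, and time fields back out.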

The workflow depends on one non-negotiable property: reproducibility. The same seed plus the same command line must produce the same behavior. Anything that breaks seed stability (unseeded $urandom in a stray module, wall-clock-dependent code, ordering dependence between processes) breaks the entire strategy.
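
As an illustration of explicit seeding, the sketch below reads a hypothetical +SEED plusarg, logs it, and seeds the current process through the standard process::srandom() method. Most projects rely on the simulator's own seed option instead. This only shows the language-level mechanism, and it seeds one process rather than the whole testbench.

```systemverilog
// Sketch: take the seed from a hypothetical +SEED plusarg, log it, and
// seed the current process explicitly so the run can be replayed.
module seed_demo;
  int unsigned seed = 1;   // default when no +SEED is given
  process      p;

  initial begin
    void'($value$plusargs("SEED=%d", seed));
    $display("TB_SEED seed=%0d", seed);   // always log the seed in use

    p = process::self();
    p.srandom(seed);                      // seeds this process's RNG only

    // Values drawn here repeat for a given seed on the same simulator.
    repeat (3) $display("stimulus=%0d", $urandom_range(99));
  end
endmodule
```

Running it twice with the same +SEED value should print the same three stimulus values; changing the seed should change them.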


What senior engineers do differently

Junior vs senior, in practice

  • Junior: runs every sim with waves on 'just in case'. Senior: runs blind, reruns failures with waves scoped and windowed.

  • Junior: guesses why the sim is slow and starts editing. Senior: profiles or runs differential A/B configs first.

  • Junior: adds $display everywhere when debugging. Senior: raises verbosity on an existing filtered logging system.

  • Junior: discovers the memory leak when the farm kills the job. Senior: has container watermarks reporting from day one.

  • Junior: disables assertions to make the regression faster. Senior: treats assertion cost as the price of meaning.
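
The "existing filtered logging system" can be very small. The following is a minimal sketch, assuming a hypothetical tb_log_pkg package and TB_LOG macro whose levels line up with the +VERBOSITY values used earlier (100 quiet, 300 full detail). Because the macro pastes the message expression inside the if, the $sformatf call only runs when the level check passes.

```systemverilog
// Sketch: verbosity-filtered logging with the guard ahead of the formatting.
package tb_log_pkg;
  // Hypothetical levels, chosen to match the +VERBOSITY values above.
  localparam int unsigned LOG_LOW  = 100;
  localparam int unsigned LOG_MED  = 200;
  localparam int unsigned LOG_HIGH = 300;

  int unsigned verbosity = LOG_LOW;
endpackage

// MSG is substituted textually, so it is evaluated only if the check passes.
`define TB_LOG(LEVEL, MSG) \
  begin \
    if (tb_log_pkg::verbosity >= (LEVEL)) \
      $display("[%0t] %s", $time, MSG); \
  end

module tb_log_demo;
  int unsigned addr = 'h40;

  initial begin
    void'($value$plusargs("VERBOSITY=%d", tb_log_pkg::verbosity));

    `TB_LOG(tb_log_pkg::LOG_LOW,  "test started")
    `TB_LOG(tb_log_pkg::LOG_HIGH, $sformatf("txn addr=0x%0h", addr))
  end
endmodule
```

With no plusarg only the first message prints; with +VERBOSITY=300 both do, and no source edit is needed to get the extra detail.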

Consolidated checklist

  1. Profile before optimizing — vendor profiler or differential runs; fix the top consumer only.

  2. No polling loops — every wait is @(edge) or wait(expr).

  3. No string formatting or deep copies in hot paths — guard first, copy once.

  4. Covergroups sample once per transaction, not on every clock; inside tight loops, randomize() only the fields that must change.

  5. Every scoreboard queue is bounded with a loud error on overflow.

  6. Sparse storage uses associative arrays; entries deleted on retire; watermarks reported.

  7. Regression: waves off, logs quiet, coverage on, assertions on, seeds logged.

  8. Debug: rerun failing seed with scoped + windowed waves and raised verbosity.

  9. Everything switchable by plusarg; defines only for cost that survives runtime guards.
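
Items 5 and 6 can be sketched together as a small wrapper class. The class name, the default limit, and the TB_WATERMARK tag are all hypothetical. The point is only that the bound is checked explicitly, the overflow is reported loudly, and the high-water mark is available for an end-of-test report.

```systemverilog
// Sketch: a scoreboard queue with a hard bound and a high-water mark.
class bounded_queue #(type T = int, int unsigned MAX_DEPTH = 1024);
  protected T  q[$];
  int unsigned high_water = 0;

  // Returns 0 and reports an error instead of growing without limit.
  function bit push(T item);
    if (q.size() >= MAX_DEPTH) begin
      $error("bounded_queue overflow: depth limit %0d reached at %0t",
             MAX_DEPTH, $time);
      return 0;
    end
    q.push_back(item);
    if (q.size() > high_water) high_water = q.size();
    return 1;
  endfunction

  // Returns 0 when empty so the caller can flag an unexpected response.
  function bit try_pop(output T item);
    if (q.size() == 0) return 0;
    item = q.pop_front();
    return 1;
  endfunction

  function void report(string name);
    $display("TB_WATERMARK %s high_water=%0d limit=%0d",
             name, high_water, MAX_DEPTH);
  endfunction
endclass

module bounded_queue_demo;
  bounded_queue #(int, 4) exp_q = new();
  int got;

  initial begin
    for (int i = 0; i < 6; i++) void'(exp_q.push(i));  // last two overflow
    void'(exp_q.try_pop(got));
    exp_q.report("exp_q");
  end
endmodule
```

An explicit check is used here rather than a bounded queue declaration because the goal is an error the regression will count, not a silently dropped entry.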

Key takeaways

  • Define exactly two profiles — fast regression and full debug — and switch with plusargs.

  • Assertions stay on in both profiles; coverage stays on wherever closure is measured.

  • The two-pass workflow only works if seeds are logged and runs are reproducible.

  • Senior behavior is measurement plus discipline — not tool tricks.

Common pitfalls

  • A 'temporary' debug profile run left in the nightly list, taxing every regression after.

  • Turning coverage off in the runs that feed closure metrics — sign-off on missing data.

  • Debug instrumentation behind a runtime flag that still allocates gigabytes when disabled.

  • Unseeded randomness anywhere in the TB — the failing seed no longer reproduces the failure.
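
The third pitfall usually comes from a statically sized debug array that exists whether or not the flag is set. When a compile-time define is undesirable, one alternative is to allocate the storage only when the flag is present. The sketch below assumes a hypothetical +TXN_HISTORY plusarg and a simple circular history. It is an illustration of lazy allocation, not a replacement for the `ifdef approach shown earlier.

```systemverilog
// Sketch: debug history that allocates nothing until a plusarg asks for it.
module lazy_debug_pool;
  localparam int unsigned DEPTH = 100_000;

  logic [63:0] hist[];        // dynamic array: empty until new[] runs
  int unsigned wr_idx = 0;

  initial begin
    // +TXN_HISTORY is a hypothetical switch for this example.
    if ($test$plusargs("TXN_HISTORY") != 0) begin
      hist = new[DEPTH];
      $display("TB_DEBUG txn history enabled, depth=%0d", DEPTH);
    end
  end

  // Callers can record unconditionally; a disabled pool returns at once.
  function automatic void record(logic [63:0] txn);
    if (hist.size() == 0) return;
    hist[wr_idx] = txn;
    wr_idx = (wr_idx + 1) % DEPTH;   // wrap: keep only the newest DEPTH entries
  endfunction

  initial begin
    #1;                              // after the plusarg check above
    for (int i = 0; i < 8; i++) record(64'(i));
    $display("TB_DEBUG entries held=%0d", hist.size());
  end
endmodule
```

Without the plusarg the array stays at size zero, so the regression profile pays only for the size check in record().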