Part 3 · Constraint Randomization · Intermediate

Reproducing Random Failures

Seed capture and replay, random stability hazards, srandom isolation, regression seed management, and logging discipline.

No repro, no debug

A random failure you cannot reproduce is not debuggable; it is folklore. The entire value of constrained-random verification rests on a contract: given the same seed, the same simulator version and options, and the same source code, every run produces bit-identical random streams. Your job is to never break that contract accidentally, and to capture enough information from every failing run to invoke it. This lesson has two halves: the seed capture/replay workflow, and the random stability rules that determine when replay actually works.

diagram

SEED CAPTURE AND REPLAY FLOW

  REGRESSION (1000 seeds)
  +---------------------------------------------+
  | for each seed S in seed_list:               |
  |   simv +ntb_random_seed=S ... -l run_S.log  |
  |   log line 1: "SEED: S"   <- ALWAYS printed |
  +---------------------------------------------+
            |
            v
     seed 4711 FAILS
            |
            v
  REPLAY: same binary, same options, +ntb_random_seed=4711
            |
            +-- fails identically? --yes--> debug deterministically
            |                               (waves, prints, solver dump)
            no
            |
            v
  STABILITY BROKEN - check before debugging anything else:
    - same simulator version + compile options?
    - source changed since the regression? (new rand var,
      new class instance, reordered new() calls)
    - time-dependent or wall-clock input sneaking in?
    - multi-thread / save-restore nondeterminism?

The single highest-value habit is the boxed line in the diagram: print the seed at the top of every log, unconditionally. A failure report without a seed is a failure you may never see again. Most flows pass the seed through a simulator option (+ntb_random_seed for VCS, -sv_seed for Questa, -svseed for Xcelium) and echo it with $display or the UVM banner; some also stamp it into waveform and coverage file names so artifacts are traceable to the run.
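
As a minimal sketch of stamping the seed into an artifact name: the module below assumes the run script passes a duplicate plusarg, here called +SEED (a hypothetical name, not a simulator option), next to the real seed option, and uses it to name the VCD file. The module and file names are illustrative only.

systemverilog

// Sketch: echo the seed and embed it in the waveform file name.
// Assumption: the run script adds +SEED=<n> with the same value it
// gives to the simulator's own seed option.
module tb_seed_stamp;
  int unsigned seed;
  string       wave_file;

  initial begin
    if (!$value$plusargs("SEED=%d", seed)) seed = 1;  // hypothetical plusarg
    wave_file = $sformatf("waves_seed_%0d.vcd", seed);
    $display("SEED: %0d  WAVES: %s", seed, wave_file);
    $dumpfile(wave_file);           // waveform is now traceable to the run
    $dumpvars(0, tb_seed_stamp);
  end
endmodule

Passing the value twice is a deliberate choice in this sketch: it keeps the bench code independent of how any one simulator spells its seed option, at the cost of relying on the script to keep the two values in sync.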

Random stability: how streams are seeded

SystemVerilog defines hierarchical seeding. Each module, interface, and program instance (and each package) has an initialization RNG seeded from the default seed. Each static process (initial, always) is seeded from its instance's initialization RNG, and each dynamically forked thread is seeded from the next draw of its parent thread's random number generator (RNG). Every class object likewise gets its own RNG, seeded from the next draw of the thread that calls new(). randomize() draws from the object's own RNG; $urandom draws from the calling thread's RNG. This is what makes replay possible, and also what makes it fragile: the seed each RNG receives depends on the order of creations and draws within its parent.
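
The split between thread RNG and object RNG can be observed directly. The following is a small sketch, not part of the original lesson, that uses the standard process::get_randstate() method to compare the thread's RNG state before and after new() and randomize(). The class and module names are made up for illustration.

systemverilog

// Sketch: which operations advance the calling thread's RNG?
class txn;
  rand bit [7:0] data;
endclass

module seeding_demo;
  initial begin
    txn     t;
    process p;
    string  before_new, after_new, after_rand;

    p          = process::self();
    before_new = p.get_randstate();
    t          = new();               // seeds t's RNG from this thread's RNG
    after_new  = p.get_randstate();
    void'(t.randomize());             // draws from t's own RNG
    after_rand = p.get_randstate();

    $display("new() advanced thread RNG      : %0b", before_new != after_new);
    $display("randomize() advanced thread RNG: %0b", after_new != after_rand);
  end
endmodule

Under the seeding rules described above, one would expect the first line to report a change and the second to report none, but the state string format is simulator-defined, so treat the result as something to confirm on your own tool.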

systemverilog

// HAZARD: adding one $urandom call shifts everything after it
initial begin
  txn a, b;               // declarations must precede statements
  a = new();              // a's RNG seeded from thread RNG draw #1
  void'($urandom);        // <-- NEW debug line: consumes a thread draw
  b = new();              // b's RNG now seeded from a DIFFERENT draw
  // Every value b ever randomizes has changed. Same seed, same
  // test, "unrelated" one-line edit - failure moved or vanished.
end

// HAZARD: object creation order
initial begin
  txn p, q;
  if (cfg_new_feature) p = new();   // feature flag flips creation
  q = new();                        //   order -> q's seed changes
end

// SAFE: thread structure is stable, draws are stable
initial begin
  txn a, b;
  a = new();
  b = new();
  // new debug $display() calls are harmless - they draw nothing
end

The practical rule: random streams are stable under edits that do not add, remove, or reorder RNG draws (object construction, $urandom/$urandom_range calls, randomize calls, process spawns) upstream of the failure point. Adding a $display is safe; adding a debug object construction or a sampling $urandom is not. When you must add such code while chasing a seed-locked failure, isolate it with srandom, covered in the next section.
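
The rule can be checked with a small experiment. The sketch below (illustrative names, arbitrary seed value 100) pins the thread RNG to the same starting point twice, builds the same two objects, and inserts one extra $urandom call in the second pass only.

systemverilog

// Sketch: one extra thread draw between two new() calls shifts b's stream.
class txn;
  rand bit [31:0] data;
endclass

module shift_demo;
  // Build a, then b, from a pinned thread RNG; optionally add one draw.
  function automatic bit [31:0] b_value(bit extra_draw);
    txn     a, b;
    process p;
    p = process::self();
    p.srandom(100);                   // same starting point for both passes
    a = new();
    if (extra_draw) void'($urandom);  // the "harmless" debug line
    b = new();
    void'(b.randomize());
    return b.data;
  endfunction

  initial begin
    bit [31:0] clean, edited;
    clean  = b_value(0);
    edited = b_value(1);
    $display("b.data without extra draw: %h", clean);
    $display("b.data with extra draw   : %h", edited);
    $display("stream shifted           : %0b", clean != edited);
  end
endmodule

The two values should differ in all but a vanishingly unlikely case, because b is seeded from a different thread draw in the second pass. Note that the srandom call here exists only to make the two passes comparable; it also re-seeds whatever else runs later in that thread.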

srandom: manual seeding for isolation

systemverilog

// Re-seed one object's RNG to a fixed value, detaching it from
// the hierarchical seeding chain entirely.
txn t = new();
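t.srandom(32'hDEAD_BEEF);     // t's stream now fixed regardless of
                              // what happened earlier in the thread
if (!t.randomize())           // avoid assert(): with assertions disabled
  $error("randomize failed"); //   the call itself would be skipped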

// Pattern 1: pin the suspect, let everything else vary
//   Replay seed 4711, but srandom() the failing transaction object
//   so you can add debug code elsewhere without disturbing it.

// Pattern 2: per-component decoupling at construction
class my_driver;
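  function new(string name, int unsigned stable_seed);
    int unsigned h = stable_seed;
    foreach (name[i]) h = h * 31 + name[i];  // hash the whole name: using only
    this.srandom(h);                         //   its length would give equal-
  endfunction                                //   length names the same stream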
endclass
//   Components no longer inherit seeds positionally - adding a new
//   agent to the env does not shift every other agent's stream.

// Pattern 3: process-level isolation
initial begin
  process::self().srandom(32'h1234);  // pin this thread's RNG
  // $urandom calls in this thread are now independent of
  // upstream thread-spawn ordering.
end

srandom converts implicit positional seeding into explicit chosen seeding. UVM builds Pattern 2 into the methodology: with UVM seeding enabled (the default), each component reseeds its RNG from a hash of its type name and full hierarchical name combined with the global seed, so adding a component does not perturb siblings' streams. If you are on raw SystemVerilog, doing the same by hand (seed derived from a stable name, not from creation order) buys you most of that robustness.
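
A hand-rolled version of name-derived seeding might look like the sketch below. Everything in it is an assumption for illustration: the package, the name_seed helper, the agent class, the +SEED plusarg, and the multiply-by-31 hash, which is a generic string hash and not the algorithm UVM uses.

systemverilog

// Sketch: seed each component from its name plus a base seed.
package stable_seed_pkg;
  // Hypothetical helper: fold a component name into the base seed.
  function automatic int unsigned name_seed(string name,
                                            int unsigned base_seed);
    int unsigned h;
    h = base_seed;
    foreach (name[i]) h = (h * 31) + name[i];
    return h;
  endfunction

  class agent;
    string         name;
    rand bit [7:0] delay;
    function new(string name, int unsigned base_seed);
      this.name = name;
      this.srandom(name_seed(name, base_seed));  // explicit, not positional
    endfunction
  endclass
endpackage

module name_seed_demo;
  import stable_seed_pkg::*;
  initial begin
    agent        rx, tx;
    int unsigned base_seed;
    if (!$value$plusargs("SEED=%d", base_seed)) base_seed = 1;
    tx = new("env.tx_agent", base_seed);  // swapping these two lines should
    rx = new("env.rx_agent", base_seed);  //   leave both streams unchanged
    void'(rx.randomize());
    void'(tx.randomize());
    $display("rx.delay=%0d tx.delay=%0d", rx.delay, tx.delay);
  end
endmodule

One limitation to keep in mind: each new() still consumes a draw from the constructing thread's RNG before srandom overrides the object's seed, so this protects the agents' own streams but not later $urandom calls or un-reseeded objects in the same thread.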

Regression seed management and logging discipline

Nightly regressions: random seeds, but every seed recorded in the run database next to pass/fail status — a failure row is (test, seed, simulator version, commit hash).
Failure triage: rerun the exact (test, seed) pair before touching code; confirm the failure reproduces, then archive the log and waves under that seed's name.
Fixed smoke seeds: a small set of known-good seeds run on every commit for fast sanity; random seeds explore, fixed seeds gate.
Seed sweeps for soak: when a bug is suspected to be seed-sensitive, sweep a contiguous block (1..500) to estimate failure rate and harvest more failing seeds.
Never reuse 'seed 1 everywhere' for the whole regression — thousands of runs exploring the same stream is wasted compute and false confidence.
Print the seed in: the log header, the UVM report summary, the waveform filename, and any failure-ticket template. Redundancy here is free; a lost seed is not.

systemverilog

// Make the seed un-losable: bench snippet
module tb_top;
  initial begin : seed_banner
    int unsigned seed;
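    // Assumes the seed arrives as this VCS-style plusarg; adapt the name to
    // your flow. The fallback of 1 is only right if the run used the default.
    if (!$value$plusargs("ntb_random_seed=%d", seed)) seed = 1;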
    $display("=========================================");
    $display("  SEED: %0d  (replay: +ntb_random_seed=%0d)", seed, seed);
    $display("=========================================");
  end
endmodule
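
The failure-row idea from the list above can also be emitted by the bench itself as one machine-parsable line. This is a sketch under stated assumptions: the +SEED, +TEST, and +COMMIT plusargs and the RUN_RECORD tag are hypothetical and would have to be supplied by your run script. The simulator version is left to the script, since the bench has no portable way to query it.

systemverilog

// Sketch: print a greppable run record at start and end of simulation.
module run_record;
  int unsigned seed;
  string       test_name, commit;

  initial begin
    // All three plusargs are assumed to be set by the regression script.
    if (!$value$plusargs("SEED=%d",   seed))      seed      = 1;
    if (!$value$plusargs("TEST=%s",   test_name)) test_name = "unknown";
    if (!$value$plusargs("COMMIT=%s", commit))    commit    = "unknown";
    $display("RUN_RECORD test=%s seed=%0d commit=%s",
             test_name, seed, commit);
  end

  // Repeat the seed at the end so a truncated log header does not lose it.
  final begin
    $display("RUN_RECORD_END test=%s seed=%0d", test_name, seed);
  end
endmodule

A regression script can then grep for RUN_RECORD to fill the run database, and the closing line gives a second chance to recover the seed if the top of the log is cut off.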

Interview angle

“A test fails once in a 1000-seed regression — what do you do?” The strong answer: pull the seed from the run database, replay the exact (test, seed, build) triple, and confirm bit-identical failure before any debugging. Then explain random stability: per-object RNGs seeded by creation order within the parent thread, so debug edits that construct objects or call $urandom upstream of the failure will shift streams and lose the repro — which is why you add only non-drawing code ($display, waves) or pin the suspect object with srandom. Mentioning that UVM seeds components from hierarchical names (immunizing sibling components against each other) shows methodology depth.

Key takeaways

Same seed + same build + same source = identical streams; that contract is the foundation of random debug.
Print the seed in every log unconditionally — a failure without its seed may be gone forever.
Streams are positionally seeded: adding/removing/reordering RNG draws upstream shifts everything after.
$display is replay-safe; new object construction and $urandom calls are not.
srandom (object, or process::self()) detaches a stream from positional seeding for surgical debugging.

Common pitfalls

Debug code that calls new() or $urandom before the failure point — the failure mutates or disappears.
Replaying on a different simulator version or compile flags and concluding 'not reproducible'.
Logging pass/fail without the seed — the regression result is unactionable.
Conditional object construction (feature flags) silently reordering RNG seeding between configs.
Sweeping regressions with one fixed seed — coverage illusion from rerunning a single stream.
