Part 6 · Testbench Architecture · Intermediate

TB Race Conditions

Same-timestep event races, mailbox ordering assumptions, blocking vs NBA in TB code, clocking-block discipline, and a race-debug walkthrough.

Same-timestep races

Two TB threads active in the same timestep execute in an order the LRM leaves to the simulator. Any code whose correctness depends on that order is a race: it passes on one simulator (or seed, or version) and fails on another. The two classics are the trigger-vs-wait event race and read-vs-write on a shared variable.

diagram

TRIGGER vs WAIT — SAME TIMESTEP

  thread A:  -> ev;          thread B:  @(ev);

  Scheduler runs B first:   B waits, A triggers   → B wakes      ✓
  Scheduler runs A first:   A triggers (no one    → B waits
                            listening), then B     FOREVER        ✗

  The fix is a LEVEL, not an edge:

  thread A:  done = 1; -> ev;       // belt and braces
  thread B:  wait (done == 1);      // cannot miss a past assignment

  (event.triggered also works for same-timestep checks:
   wait (ev.triggered) is true for the whole timestep of the trigger)

@(event) is an edge: it only sees triggers that happen AFTER the @ executes. wait(flag) is a level: it sees any past assignment.
wait (ev.triggered) is the middle ground — true for the entire timestep in which the event fired, so same-timestep trigger/wait order stops mattering.
Shared-variable races (one thread writes txn_count, another reads it the same timestep) are resolved the same way clocked logic resolves them: separate the read and write into different regions or different timesteps.
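
The three waiting styles above can be compared side by side. The following is a minimal sketch, not taken from any real testbench: the module name, the block labels, and the done flag are all hypothetical, and the delays are chosen only to force the producer and two of the waiters into the same timestep.

```systemverilog
// Sketch only: module, event, and flag names are hypothetical.
module tb_event_race;
  event ev;          // edge-style notification
  bit   done;        // level-style flag; stays set once written

  initial begin : producer
    #10;
    done = 1;        // level: visible to any later wait()
    -> ev;           // edge: only seen by threads already blocked on @(ev)
  end

  initial begin : fragile_waiter
    #10;
    @(ev);           // hangs if the producer already triggered at t=10
    $display("[%0t] fragile_waiter woke", $time);
  end

  initial begin : triggered_waiter
    #10;
    wait (ev.triggered);   // holds for the rest of the trigger's timestep
    $display("[%0t] triggered_waiter woke", $time);
  end

  initial begin : level_waiter
    #20;                   // arrives a full timestep late
    wait (done == 1);      // still passes: the flag remembers
    $display("[%0t] level_waiter woke", $time);
  end

  initial #100 $finish;
endmodule
```

On a schedule that runs the producer first, fragile_waiter never prints, while the other two waiters print on every schedule. Note that wait (ev.triggered) would not rescue level_waiter: by t=20 the triggered state has expired, which is why a persistent flag is the safer choice when the waiter may be more than one timestep late.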

Mailbox ordering and NBA discipline

Mailbox assumptions that fail

A single mailbox is FIFO — items come out in put() order. Everything beyond that is an assumption: when two producers put() in the same timestep, their interleaving is scheduler-dependent; and ordering across two different mailboxes is never guaranteed, no matter the put order. A scoreboard requiring "the expected item always arrives before the matching actual" across two mailboxes is a latent race — handle the actual-arrives-first case explicitly (pend it) or key by id.

systemverilog

// FRAGILE: assumes exp always arrives before act
task scoreboard::run();
  bus_txn exp, act;
  forever begin
    mbx_exp.get(exp);          // waits on exp even when act is already queued;
    mbx_act.get(act);          // a bounded mbx_act then blocks the monitor's put()
    void'(check(exp, act));
  end
endtask

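// ROBUST: two independent threads + state, order-insensitive
// (pending_exp / pending_act are assumed to be class-member queues of bus_txn)
task scoreboard::run();
  fork
    forever begin                 // expected stream
      bus_txn t;
      mbx_exp.get(t);
      // Actual already waiting? Compare now; otherwise pend the expected item.
      if (pending_act.size()) void'(check(t, pending_act.pop_front()));
      else                    pending_exp.push_back(t);
    end
    forever begin                 // actual stream
      bus_txn t;
      mbx_act.get(t);
      // Mirror image: compare against a pended expected item, or pend the actual.
      if (pending_exp.size()) void'(check(pending_exp.pop_front(), t));
      else                    pending_act.push_back(t);
    end
  join_none                       // run() returns at once; both threads keep running
endtask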
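
The "key by id" alternative mentioned above could look like the sketch below. It is illustrative only: the bus_txn fields, the id tag, compare(), the id_scoreboard class, and the small test module are all hypothetical, and it assumes both streams carry a tag that is unique among in-flight transactions.

```systemverilog
// Sketch only: bus_txn fields, the id key, and compare() are hypothetical.
class bus_txn;
  int unsigned id;     // assumed: a unique tag carried by both streams
  bit [31:0]   data;
  function bit compare(bus_txn rhs);
    return (id == rhs.id) && (data == rhs.data);
  endfunction
endclass

class id_scoreboard;
  mailbox #(bus_txn) mbx_exp = new();
  mailbox #(bus_txn) mbx_act = new();
  bus_txn exp_by_id[int unsigned];   // expected items waiting for their actual
  bus_txn act_by_id[int unsigned];   // actual items that arrived early
  int unsigned n_match, n_mismatch;

  function void resolve(bus_txn e, bus_txn a);
    if (e.compare(a)) n_match++;
    else begin
      n_mismatch++;
      $error("id %0d: exp=%h act=%h", e.id, e.data, a.data);
    end
  endfunction

  task run();
    fork
      forever begin                       // expected stream
        bus_txn t;
        mbx_exp.get(t);
        if (act_by_id.exists(t.id)) begin
          resolve(t, act_by_id[t.id]);
          act_by_id.delete(t.id);
        end
        else exp_by_id[t.id] = t;
      end
      forever begin                       // actual stream
        bus_txn t;
        mbx_act.get(t);
        if (exp_by_id.exists(t.id)) begin
          resolve(exp_by_id[t.id], t);
          exp_by_id.delete(t.id);
        end
        else act_by_id[t.id] = t;
      end
    join_none
  endtask

  // End-of-test check: anything still pending is a genuine miss.
  function void report();
    if (exp_by_id.size() || act_by_id.size())
      $error("leftover: %0d expected, %0d actual",
             exp_by_id.size(), act_by_id.size());
  endfunction
endclass

module tb_id_scoreboard;
  id_scoreboard sb;
  bus_txn e, a;
  initial begin
    sb = new();
    e = new(); e.id = 7; e.data = 32'hCAFE;
    a = new(); a.id = 7; a.data = 32'hCAFE;
    sb.run();
    sb.mbx_act.put(a);        // actual arrives first on purpose
    #1 sb.mbx_exp.put(e);
    #1 sb.report();
    $display("match=%0d mismatch=%0d", sb.n_match, sb.n_mismatch);
  end
endmodule
```

Keying by id also tolerates out-of-order completion, which the queue-based version does not: pop_front() still pairs items strictly by arrival order within each stream.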

Blocking vs NBA when TB code touches signals

TB procedural code driving interface signals directly must use nonblocking (<=) at clock edges — a blocking write races every other process sampling that signal in the same timestep.
Inside class-only code (queues, counters, objects), blocking assignments are correct and normal — the NBA rule is about crossing into the signal world.
Best answer: do not drive raw signals from class code at all — go through a clocking block, which schedules the drive into the NBA region for you and applies output skew.

This is the same discipline the monitor lesson established for sampling: the clocking block is the race cure in both directions. With the default #1step input skew, inputs are sampled in the preponed region (values from just before the edge); outputs are driven in the NBA region, offset by the output skew. TB and DUT can no longer interleave within a timestep in an order-dependent way.
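
As a rough illustration of that discipline, the sketch below routes every TB access through a clocking block. The interface name bus_if, its signals, the 1 ns output skew, and the bus_driver / bus_monitor classes are all hypothetical; only the drv_cb and mon_cb names echo the ones used later in this lesson.

```systemverilog
// Sketch only: bus_if, its signals, and the skew values are hypothetical.
interface bus_if (input logic clk);
  timeunit 1ns; timeprecision 1ps;
  logic        valid;
  logic        ready;
  logic [31:0] data;

  // Driver view: outputs are scheduled relative to posedge clk.
  clocking drv_cb @(posedge clk);
    default input #1step output #1ns;
    output valid, data;
    input  ready;
  endclocking

  // Monitor view: everything is an input, sampled just before the edge.
  clocking mon_cb @(posedge clk);
    default input #1step;
    input valid, ready, data;
  endclocking
endinterface

class bus_driver;
  virtual bus_if vif;
  function new(virtual bus_if vif); this.vif = vif; endfunction

  task send(bit [31:0] d);
    @(vif.drv_cb);                 // synchronize to the clocking event
    vif.drv_cb.valid <= 1'b1;      // synchronous drive, applied with output skew
    vif.drv_cb.data  <= d;
    do @(vif.drv_cb); while (!vif.drv_cb.ready);   // uses the sampled ready
    vif.drv_cb.valid <= 1'b0;
  endtask
endclass

class bus_monitor;
  virtual bus_if vif;
  function new(virtual bus_if vif); this.vif = vif; endfunction

  task run();
    forever begin
      @(vif.mon_cb);
      // Both operands are pre-edge samples, so the result cannot depend on
      // whether the driver or the monitor thread is scheduled first.
      if (vif.mon_cb.valid && vif.mon_cb.ready)
        $display("[%0t] beat data=%h", $time, vif.mon_cb.data);
    end
  endtask
endclass
```

Neither class contains a raw vif.valid read or write, which is the property worth auditing for: a single bypass reintroduces the ordering dependence.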

A race-debug story

diagram

SYMPTOM
  Regression: test passes 97/100 seeds on Simulator A.
  Same 3 seeds fail with "scoreboard leftover: 1 expected txn".
  On Simulator B, DIFFERENT seeds fail. Classic race signature:
  failure set changes with scheduler, not with stimulus.

  HUNT
  1. Failing seed, verbosity up: last expected txn pushed at t=84,210ns;
     monitor saw the matching bus activity at t=84,210ns too.
  2. Same timestep — suspicious. Inspect the monitor:
        @(posedge clk) if (vif.valid && vif.ready) ...   // raw sample
     and the driver:
        vif.valid = 1'b1;                                 // blocking drive!
  3. Diagnosis: driver's blocking write lands in the SAME timestep
     evaluation as the monitor's raw read. Scheduler order decides
     whether the monitor sees the final beat → leftover on the
     orderings where it misses it.

  FIX
  Driver drives through drv_cb (clocking block, NBA + output skew).
  Monitor samples through mon_cb (preponed sampling).
  All 100 seeds pass on both simulators.

  LESSON
  "Different seeds fail on different simulators" ≈ scheduling race.
  Audit every raw vif.sig read/write that bypasses a clocking block.
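
The mechanism in step 3 can be reduced to a few lines. The module below is a hypothetical reduction written for this lesson, not the original testbench: it drives one signal with a blocking assignment and one with a nonblocking assignment at the same clock edge, and samples both raw.

```systemverilog
// Sketch only: a hypothetical reduction of the failure, not the original TB.
module tb_blocking_race;
  logic clk = 0;
  logic valid_blk = 0;   // driven with a blocking assignment (racy)
  logic valid_nba = 0;   // driven with a nonblocking assignment

  always #5 clk = ~clk;

  // "Driver": both writes are issued in the timestep of the first posedge.
  initial begin
    @(posedge clk);
    valid_blk  = 1'b1;   // Active region: lands before or after the sampler
    valid_nba <= 1'b1;   // NBA region: lands after every Active-region read
  end

  // "Monitor": raw sample at the same posedge, then one cycle later.
  initial begin
    @(posedge clk);
    // valid_blk may read 0 or 1 here depending on scheduler order;
    // valid_nba still reads 0 because its update has not been applied yet.
    $display("[%0t] valid_blk=%b valid_nba=%b", $time, valid_blk, valid_nba);
    @(posedge clk);
    $display("[%0t] valid_blk=%b valid_nba=%b", $time, valid_blk, valid_nba);
    $finish;
  end
endmodule
```

Running a reduction like this on two simulators is a cheap way to confirm the diagnosis before touching the real driver: only the first valid_blk value is allowed to differ between them.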

Interview angle

Race questions are a verification-interview staple: "A test fails on VCS but passes on Questa — first suspicion?" (same-timestep scheduling race; audit raw signal access and trigger/wait pairs). And the evergreen "Why do clocking blocks exist?" — answer in scheduling terms: preponed-region input sampling plus NBA-region skewed output driving removes timestep-internal ordering from the TB-DUT contract.

Key takeaways

@(event) misses a trigger that fired earlier in the same timestep: use wait(flag) or wait(ev.triggered) when arrival order is unknown.
FIFO order holds within one mailbox only — never assume ordering across two mailboxes.
Class-internal blocking assignments are fine; crossing into signals demands NBA — ideally via a clocking block.
Failures that move with the simulator, or whose failing seed set changes between simulators, usually point to scheduling races rather than stimulus bugs.

Common pitfalls

-> ev / @(ev) between threads with unknown ordering — the waiter sleeps forever on the wrong schedule.
Scoreboard get() on exp-then-act mailboxes assuming arrival order — pend the early side instead.
Blocking assignment to an interface signal at a clock edge — races every same-timestep sampler.
Verifying a race fix on one simulator only — the surviving orderings differ on the next one.
