Part 7 · Advanced & Integration · Intermediate
Anatomy of a TB/RTL Race
A same-edge drive/sample race walked cycle by cycle, why ordering is nondeterministic, and the classic works-in-one-tool symptom.
The setup: two processes, one edge
A race needs only two ingredients: a testbench process that drives or samples a signal with blocking semantics at a clock edge, and an RTL process triggered by the same edge. The LRM does not define which same-region process runs first. That ordering is the simulator's choice, and it is exactly where determinism dies.
// RTL: a registered pipeline stage
always_ff @(posedge clk)
  data_q <= data_in;       // nonblocking: reads data_in in Active,
                           // updates data_q in NBA
// TESTBENCH: naive driver, same edge, BLOCKING assignment
initial begin
  @(posedge clk);
  data_in = 8'hA5;         // blocking: updates data_in IMMEDIATELY
end
// TESTBENCH: naive monitor, same edge, immediate read
always @(posedge clk)
  captured = data_in;      // which value of data_in does this see?
Why blocking at the boundary is the trigger
RTL flops written with nonblocking assignments are race-free among themselves: every right-hand side is read in the Active region before any update lands in the NBA region.
A TB blocking assignment updates the signal mid-Active-region — while RTL processes at the same edge may or may not have read it yet.
A TB immediate read in the same region may execute before or after the TB drive — same ambiguity, sampling side.
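To see the ambiguity first-hand, the three processes above can be wrapped in a small self-contained testbench. This is a minimal sketch, not a reference implementation: the module name race_demo, the 8-bit widths, the 10-unit clock, and the initial value 8'h00 standing in for OLD are all assumptions added for illustration. Which values it prints depends on the simulator's scheduling choices, so no single output is the correct one.

```systemverilog
// Hypothetical demo module: names, widths and clock period are illustrative.
module race_demo;
  logic       clk     = 0;
  logic [7:0] data_in = 8'h00;   // stands in for the OLD value
  logic [7:0] data_q;
  logic [7:0] captured;

  always #5 clk = ~clk;          // first posedge at t=5, first negedge at t=10

  // RTL: registered pipeline stage
  always_ff @(posedge clk)
    data_q <= data_in;

  // Naive driver: blocking write at the same edge the RTL uses
  initial begin
    @(posedge clk);
    data_in = 8'hA5;
  end

  // Naive monitor: immediate read at the same edge
  always @(posedge clk)
    captured = data_in;

  // Report at the following negedge, when nothing else is changing
  initial begin
    @(negedge clk);
    // data_q is a5 if the driver ran before always_ff, 00 otherwise;
    // captured is a5 if the driver ran before the monitor, 00 otherwise.
    $display("data_q=%h captured=%h", data_q, captured);
    $finish;
  end
endmodule
```

Running the same file on two simulators, or on one simulator with different optimization settings, is a quick way to check whether the printed pair changes.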
The race, cycle by cycle
SAME-EDGE DRIVE/SAMPLE RACE: time step at posedge clk
ACTIVE REGION (process order is SIMULATOR'S CHOICE)
┌──────────────────────────────────────────────────────────┐
│ runnable: { always_ff (RTL), driver, monitor }           │
│                                                          │
│ ORDER A (driver first):                                  │
│   1. driver:    data_in = 8'hA5 (blocking, now)          │
│   2. always_ff: reads data_in = A5 → schedules NBA       │
│   3. monitor:   captured = A5                            │
│                                                          │
│ ORDER B (RTL first):                                     │
│   1. always_ff: reads data_in = OLD → schedules NBA      │
│   2. driver:    data_in = 8'hA5                          │
│   3. monitor:   captured = A5 (or OLD if monitor ran     │
│                 before the driver!)                      │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
NBA REGION
┌──────────────────────────────────────────────────────────┐
│ data_q <= (whatever always_ff READ above)                │
│ ORDER A: data_q becomes A5  (DUT saw the new value)      │
│ ORDER B: data_q becomes OLD (DUT saw the old value)      │
└──────────────────────────────────────────────────────────┘
Same source code. Two legal outcomes. THAT is the race.
Both orders are legal per the LRM. The simulator may pick based on declaration order, elaboration order, optimization level, or internal heuristics, all of which can change when you add a signal, change a flag, or upgrade the tool.
Why it is nondeterministic — and the classic symptom
Sources of order variation
Different simulators make different (equally legal) ordering choices — VCS vs Questa vs Xcelium.
The same simulator can reorder after an optimization flag change (-O levels, partition options).
Unrelated code edits shift elaboration order, silently flipping which process runs first.
Even a passing test is not evidence of correctness — it is evidence you won the coin flip today.
THE CLASSIC SYMPTOM
Monday:    test passes on Simulator X          ← order A
Tuesday:   same test fails on Simulator Y      ← order B
Wednesday: "fixed" by adding a #1 delay        ← hides the race
Thursday:  fails again after -O3 recompile     ← race is back
Friday:    vendor support ticket: "your tool is broken"
           vendor reply: "both behaviors are LRM-legal"
Root cause was never the tool. It was a blocking
drive/sample at the same edge the RTL uses.
How to recognize one in the wild
A value is off by exactly one cycle, intermittently or per-tool — the signature of edge-order ambiguity.
Behavior changes with simulator, optimization flags, or unrelated edits — code is constant, ordering is not.
There is a TB blocking assignment or immediate read synchronized to the same edge as RTL always_ff blocks.
A #0 or #1 'fix' is in the history — someone treated the symptom before.
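The off-by-one signature in the list above can be turned into an explicit check. The sketch below is one possible approach, not an established methodology: it assumes the testbench only changes data_in at the rising edge, so the value seen at the preceding falling edge is the pre-edge value the register should capture. The names race_detect_demo and pre_edge are hypothetical.

```systemverilog
// Hypothetical detector: flags a register that captured a value which only
// appeared during the same clock edge. All names are illustrative.
module race_detect_demo;
  logic       clk      = 0;
  logic [7:0] data_in  = 8'h00;
  logic [7:0] data_q;
  logic [7:0] pre_edge = 8'h00;  // data_in as it stood before the next posedge

  always #5 clk = ~clk;

  // RTL under observation
  always_ff @(posedge clk)
    data_q <= data_in;

  // Racy driver under test: blocking write at the RTL's edge
  initial begin
    @(posedge clk);
    data_in = 8'hA5;
  end

  // Detector runs at negedge, away from the contested edge
  always @(negedge clk) begin
    // After a race-free edge, data_q equals the pre-edge data_in
    if (data_q !== pre_edge)
      $display("[%0t] possible same-edge race: data_q=%h, pre-edge data_in=%h",
               $time, data_q, pre_edge);
    pre_edge = data_in;          // remember the value for the next edge
  end

  initial #50 $finish;
endmodule
```

The check only fires when the simulator happens to run the driver before the always_ff. A silent run means the RTL was scheduled first this time, not that the race is absent.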
Key takeaways
A race is two same-region processes whose LRM-legal orderings produce different results.
TB blocking drives/reads at the RTL's active clock edge are the canonical cause.
Works-in-one-tool, fails-in-another is the fingerprint of a race, not a tool bug.
Delays like #1 hide races; they never fix them — the cure is region separation (next lesson).
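As a preview of what region separation can look like, the sketch below swaps the driver's blocking assignment for a nonblocking one. This is an assumption about one common form of the cure, not necessarily the approach the next lesson takes. The module name nba_drive_demo and the self-checks are added for illustration.

```systemverilog
// Hypothetical demo: nonblocking drive moves the data_in update to the NBA
// region, after every Active-region read at the same edge.
module nba_drive_demo;
  logic       clk     = 0;
  logic [7:0] data_in = 8'h00;
  logic [7:0] data_q;
  logic [7:0] captured;

  always #5 clk = ~clk;          // posedges at t=5, 15; negedges at t=10, 20

  always_ff @(posedge clk)
    data_q <= data_in;

  // Driver: NONBLOCKING write, so data_in changes only in the NBA region
  initial begin
    @(posedge clk);
    data_in <= 8'hA5;
  end

  // Monitor: Active-region read, now always sees the pre-edge value
  always @(posedge clk)
    captured = data_in;

  initial begin
    @(negedge clk);              // after edge 1: both saw the old value
    assert (data_q === 8'h00 && captured === 8'h00)
      else $error("edge 1: data_q=%h captured=%h", data_q, captured);
    @(negedge clk);              // after edge 2: both saw the driven value
    assert (data_q === 8'hA5 && captured === 8'hA5)
      else $error("edge 2: data_q=%h captured=%h", data_q, captured);
    $finish;
  end
endmodule
```

Because the RTL read and the testbench write now sit in different regions, process order within the Active region no longer affects the result, and the value driven at one edge is consumed at the next.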
Common pitfalls
Driving DUT inputs with blocking assignments directly at @(posedge clk).
Sampling DUT outputs with an immediate read in the same Active region the RTL writes them.
Declaring victory because the test passes on your simulator — ordering luck is not correctness.
Patching with #0/#1 delays, which shift the ambiguity instead of removing it.
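The last pitfall is easy to reproduce. In the sketch below, both the driver and the monitor have been patched with #0, which defers them to the Inactive region. The module name zero_delay_patch_demo and the 8'h00 initial value are assumptions for illustration.

```systemverilog
// Hypothetical demo: #0 removes the TB-vs-RTL race but leaves the two
// patched testbench processes racing each other.
module zero_delay_patch_demo;
  logic       clk     = 0;
  logic [7:0] data_in = 8'h00;
  logic [7:0] data_q;
  logic [7:0] captured;

  always #5 clk = ~clk;

  always_ff @(posedge clk)
    data_q <= data_in;           // runs in Active, before any #0 process resumes

  // Patched driver: deferred to the Inactive region
  initial begin
    @(posedge clk);
    #0 data_in = 8'hA5;
  end

  // Patched monitor: also deferred to the Inactive region
  always @(posedge clk)
    #0 captured = data_in;

  initial begin
    @(negedge clk);
    // data_q: the RTL read data_in before the deferred drive happened
    $display("data_q=%h", data_q);
    // captured: order of the two #0 processes is still unspecified,
    // so this may show the old or the new value
    $display("captured=%h", captured);
    $finish;
  end
endmodule
```

The RTL now reliably sees the old value, which looks like a fix, but the driver and monitor resume in the same region in an unspecified order. The ambiguity has moved from testbench-versus-RTL to testbench-versus-testbench.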