Part 6 · Testbench Architecture · Intermediate

Compare Discipline & Mismatch Debug

Designing a transaction compare() with field masks and don't-cares, first-mismatch context dumps, error thresholds, and a triage flow.

Designing compare()

The compare method is where checking policy lives, so make it explicit. Not every field participates in every compare: timestamps differ by construction, debug ids are TB-internal, and some fields are don't-care in specific modes (a read transaction's write-data, an errored response's payload). Encode that as a field mask rather than scattering if-statements through the scoreboard.

systemverilog

class bus_txn;
  typedef enum int unsigned {
    F_ADDR = 'h1, F_DATA = 'h2, F_RESP = 'h4, F_LEN = 'h8
  } field_e;
  localparam int unsigned F_ALL = 'hF;

  logic [31:0] addr;
  logic [31:0] data;
  logic [1:0]  resp;
  int unsigned len;
  time         observed_t;   // never compared — metadata
  int unsigned dbg_id;       // never compared — TB bookkeeping

  // mask selects which fields participate; mismatch_desc explains failures
  function bit do_compare(bus_txn rhs,
                          int unsigned mask = F_ALL,
                          output string mismatch_desc);
    bit ok = 1;
    mismatch_desc = "";
    if ((mask & F_ADDR) && addr !== rhs.addr) begin
      ok = 0;
      mismatch_desc = {mismatch_desc,
        $sformatf(" addr(exp=0x%08h act=0x%08h)", addr, rhs.addr)};
    end
    if ((mask & F_DATA) && data !== rhs.data) begin
      ok = 0;
      mismatch_desc = {mismatch_desc,
        $sformatf(" data(exp=0x%08h act=0x%08h)", data, rhs.data)};
    end
    if ((mask & F_RESP) && resp !== rhs.resp) begin
      ok = 0;
      mismatch_desc = {mismatch_desc,
        $sformatf(" resp(exp=%0d act=%0d)", resp, rhs.resp)};
    end
    if ((mask & F_LEN) && len != rhs.len) begin
      ok = 0;
      mismatch_desc = {mismatch_desc,
        $sformatf(" len(exp=%0d act=%0d)", len, rhs.len)};
    end
    return ok;
  endfunction

  function string convert2string();
    return $sformatf("addr=0x%08h data=0x%08h resp=%0d len=%0d t=%0t",
                     addr, data, resp, len, observed_t);
  endfunction
endclass

// Scoreboard call site: payload is don't-care on an error response
// mask = (exp.resp != 0) ? (bus_txn::F_ALL & ~bus_txn::F_DATA)
//                        : bus_txn::F_ALL;

Use !== (4-state compare) on logic fields — an X in DUT output must fail, not silently match.
The mask is decided by the scoreboard per the spec ("data is undefined on SLVERR"), not hard-coded into the transaction.
mismatch_desc names every differing field with both values — the report writes itself.
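To see why the 4-state operator matters, the standalone sketch below compares a fully known expected value against an actual value whose low byte is X. The module name and the values are illustrative only and are not part of the testbench above.

```systemverilog
// Illustrative demo: how != and !== treat an X in the actual value.
module x_compare_demo;
  logic [31:0] exp_data = 32'hCAFE_F00D;
  logic [31:0] act_data = 32'hCAFE_F0xx;  // low byte unknown, e.g. an unreset register

  initial begin
    // All known bits match, so the != result is ambiguous and evaluates to X.
    // An if-condition treats X as false, so the else branch runs.
    if (exp_data != act_data)
      $display("!=  : mismatch reported");
    else
      $display("!=  : no mismatch reported (X result taken as false)");

    // !== compares X and Z literally, so X against a known bit is a mismatch.
    if (exp_data !== act_data)
      $display("!== : mismatch reported");
    else
      $display("!== : no mismatch reported");
  end
endmodule
```

With these values the != check falls through to its else branch while the !== check reports the mismatch, which is the behavior do_compare relies on.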

First-mismatch reporting: dump everything, once

The first mismatch is the one closest to the bug; the hundreds after it are usually the same failure cascading. So invest the report budget at the first failure: dump both complete transactions, the differing fields, the position in the stream, and the simulation time. That is enough for an engineer to start debugging without re-running the simulation.

systemverilog

class scoreboard;
  int unsigned match_count, mismatch_count;
  int unsigned max_mismatches = 10;     // stop the cascade

  function void check_pair(bus_txn exp, bus_txn act, int unsigned mask);
    string desc;
    if (exp.do_compare(act, mask, desc)) begin
      match_count++;
      return;
    end
    mismatch_count++;
    $error({"[SCB] MISMATCH #%0d at compare index %0d, time %0t\n",
            "  fields :%s\n",
            "  expect : %s\n",
            "  actual : %s"},
           mismatch_count, match_count + mismatch_count, $time,
           desc, exp.convert2string(), act.convert2string());
    if (mismatch_count >= max_mismatches)
      $fatal(1, "[SCB] mismatch threshold (%0d) reached — aborting",
             max_mismatches);
  endfunction
endclass

Why an error threshold

One real bug typically corrupts every subsequent compare (especially in-order queues that fall out of step) — 50,000 identical errors hide the one that matters.
$fatal after N mismatches keeps the log readable and the regression farm fast; the first report has everything needed.
Keep N configurable (plusarg) — set it higher when hunting an intermittent secondary failure.
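One possible way to make the threshold configurable is a small config object that reads a plusarg at start-up. This is a minimal sketch: the class name scb_threshold_cfg and the plusarg name SCB_MAX_MISMATCHES are hypothetical, and rejecting a value of zero is an assumed policy, not something the lesson prescribes.

```systemverilog
// Sketch: read the mismatch cap from a plusarg, keeping the default when absent.
// The plusarg name SCB_MAX_MISMATCHES is hypothetical.
class scb_threshold_cfg;
  int unsigned max_mismatches = 10;

  function void load_from_plusargs();
    int unsigned n;
    // $value$plusargs returns nonzero only if +SCB_MAX_MISMATCHES=<n> was given
    if ($value$plusargs("SCB_MAX_MISMATCHES=%d", n)) begin
      if (n == 0)
        $warning("[SCB] ignoring +SCB_MAX_MISMATCHES=0, keeping %0d",
                 max_mismatches);
      else
        max_mismatches = n;
    end
  endfunction
endclass

module threshold_demo;
  scb_threshold_cfg cfg;

  initial begin
    cfg = new();
    cfg.load_from_plusargs();
    $display("[SCB] mismatch threshold = %0d", cfg.max_mismatches);
  end
endmodule
```

Running the simulation with +SCB_MAX_MISMATCHES=50 on the command line would raise the cap for that run; without the plusarg the default of 10 stays in effect. The scoreboard would then copy cfg.max_mismatches into its own max_mismatches field.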

Mismatch triage flow

diagram

MISMATCH TRIAGE FLOW

  first MISMATCH report
        │
        ▼
  Which fields differ?  (from mismatch_desc)
        │
        ├─ ALL fields, queue out of step
        │      └─► alignment bug: dropped/duplicated txn upstream
        │          → check monitor txn counts on both sides first
        │
        ├─ data only, addr/len match
        │      └─► real datapath candidate OR model formula bug
        │          → recompute by hand from the spec
        │              ├─ spec agrees with model → suspect RTL  → waveform at reported time
        │              └─ spec agrees with DUT   → fix the model
        │
        └─ X/Z in actual fields
               └─► uninitialized RTL or monitor sampling race
                   → check reset coverage and clocking-block usage
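The branches of the flow above can be sketched as a small classifier that turns the facts from the first mismatch into a triage hint. Everything here is hypothetical: the package and enum names, the idea of collecting a diff_mask (an OR of the differing field bits, which do_compare as written does not return), and the ordering of the checks. It is a starting point under those assumptions, not a fixed procedure.

```systemverilog
// Hypothetical triage helper; the field bit encoding mirrors bus_txn::field_e.
package triage_pkg;
  typedef enum { TRIAGE_UNKNOWN_VALUE, TRIAGE_ALIGNMENT,
                 TRIAGE_DATA_ONLY, TRIAGE_OTHER } triage_e;

  localparam int unsigned F_DATA = 'h2;

  // diff_mask : OR of the field bits that differed in the first mismatch
  // cmp_mask  : the mask that was used for that compare
  // act_has_x : 1 if any compared actual field contains X or Z
  function automatic triage_e classify(int unsigned diff_mask,
                                       int unsigned cmp_mask,
                                       bit          act_has_x);
    if (act_has_x)             return TRIAGE_UNKNOWN_VALUE; // reset or sampling race
    if (diff_mask == cmp_mask) return TRIAGE_ALIGNMENT;     // every compared field differs
    if (diff_mask == F_DATA)   return TRIAGE_DATA_ONLY;     // datapath or model formula
    return TRIAGE_OTHER;
  endfunction
endpackage

module triage_demo;
  import triage_pkg::*;
  logic [31:0] act_data = 32'h0000_00xx;

  initial begin
    triage_e t;
    // $isunknown returns 1 when any bit of its argument is X or Z
    t = classify('h2, 'hF, $isunknown(act_data));
    $display("triage: %s", t.name());
  end
endmodule
```

The classifier only produces a hint. An alignment verdict still needs confirming against the monitor transaction counts on both sides, sampled at a point where no transactions are in flight.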

Interview angle

A favorite scenario question: "Your scoreboard reports 10,000 mismatches. What do you do?" The expected answer: look only at the first one; check whether the streams fell out of alignment (compare monitor counts) before suspecting data; recompute the first failing case against the spec by hand; and only then open waveforms at the recorded time. Mentioning the mismatch threshold and the don't-care mask shows production experience.

Key takeaways

Encode compare policy as an explicit field mask — don't-cares are spec decisions, not hacks.
Use !== so X/Z propagation fails compares instead of slipping through.
Make the first mismatch report self-sufficient: both txns, differing fields, index, time.
Cap mismatches with a threshold — the first failure is the signal, the cascade is noise.

Common pitfalls

Comparing TB metadata (timestamps, debug ids) — guaranteed false mismatches.
Using == or != on 4-state fields: when X or Z bits make the result ambiguous, the comparison returns X, an if-condition treats that as false, and the mismatch vanishes.
Printing only "expected X got Y" without the full transactions — forces a re-run to debug.
No mismatch cap — one alignment bug produces a gigabyte log and a wedged regression.
