Part 3 · Constraint Randomization · Intermediate

Wrong Distribution Debugging

Histograms in post_randomize, covergroups on rand fields, implication skew, missing solve-before, eaten soft constraints, and width truncation.

The hardest failure mode: legal but skewed

Distribution bugs are the most insidious randomization failures because nothing fails. Every randomize() returns 1, every value satisfies every constraint, and the test passes, yet a corner case is generated once per million transactions instead of once per fifty, so coverage never closes and a real RTL bug hides for months. The only way to see a distribution bug is to measure the distribution, which means instrumenting randomization with histograms or coverage before you ever suspect a problem.

The root cause is almost always one of four mechanisms: implication skew (the solver picks uniformly among joint solutions, so constraint structure shapes each variable's marginal distribution), a missing solve...before ordering, a soft constraint silently discarded by a conflicting hard constraint, or signed/width truncation distorting the value range before the solver even sees it.


Instrument first: histogram in post_randomize

systemverilog
class txn;
  rand bit        is_err;
  rand bit [3:0]  len;
  constraint err_c { is_err dist { 1 := 1, 0 := 9 }; }  // intend 10% errors

  // --- instrumentation ---
  static int unsigned hist_err[2];
  static int unsigned hist_len[16];

  function void post_randomize();
    hist_err[is_err]++;
    hist_len[len]++;
  endfunction

  static function void report();
    int unsigned total = hist_err[0] + hist_err[1];
    $display("err rate: %0d/%0d = %0.1f%%",
             hist_err[1], total, 100.0 * hist_err[1] / total);
    foreach (hist_len[i])
      $display("len=%0d : %0d", i, hist_len[i]);
  endfunction
endclass

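module t;
  initial begin
    // An explicit lifetime keyword is required when a block-local
    // declaration carries an initializer.
    automatic txn x = new();
    repeat (10000) begin
      // Keep randomize() outside assert(): with assertions disabled,
      // an assert-wrapped call may never execute.
      if (!x.randomize()) $fatal(1, "txn randomize failed");
    end
    txn::report();
  end
endmodule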

Static counters bumped in post_randomize cost nothing to write and answer the question definitively: run 10,000 randomizations in a tiny standalone loop and print the histogram. If the err rate shows 50% instead of 10%, the dist is being ignored or overridden — and you found it in seconds, not after a regression. For permanent monitoring, the same measurement belongs in a covergroup so the regression coverage report itself reveals skew.

systemverilog
// Permanent variant: covergroup sampled right after every randomize()
class txn_cov_wrap;
  txn t = new();
  covergroup cg;
    cp_err: coverpoint t.is_err { bins err = {1}; bins ok = {0}; }
    cp_len: coverpoint t.len {
      bins zero = {0}; bins mid = {[1:14]}; bins max = {15};
    }
    x_err_len: cross cp_err, cp_len;
  endgroup
  function new(); cg = new(); endfunction
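  task gen_one();
    // Check the return value explicitly; sample only after a successful solve.
    if (!t.randomize()) $fatal(1, "txn randomize failed");
    cg.sample();
  endtask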
endclass
// Skew shows up as bins with anomalously low hit ratios in the
// coverage report - visible in every regression, not just when
// someone remembers to look.
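The one-off histogram can also be turned into a pass/fail check, so a skewed rate stops the run instead of waiting for someone to read a log. The sketch below is only an illustration under stated assumptions: the names rate_check, check(), and txn_min, and the 8%..12% tolerance band, are hypothetical and not part of the original bench.

```systemverilog
// Hypothetical helper (not from the original bench): compares a measured
// hit count against an expected percentage band.
class rate_check;
  // Returns 1 when hits/total lies inside [lo_pct, hi_pct], else 0.
  static function bit check(string name,
                            int unsigned hits, int unsigned total,
                            real lo_pct, real hi_pct);
    real pct;
    if (total == 0) begin
      $error("%s: no samples", name);
      return 0;
    end
    pct = 100.0 * hits / total;
    if (pct < lo_pct || pct > hi_pct) begin
      $error("%s: %0.2f%% outside [%0.1f%%, %0.1f%%]",
             name, pct, lo_pct, hi_pct);
      return 0;
    end
    $display("%s: %0.2f%% ok", name, pct);
    return 1;
  endfunction
endclass

// Trimmed copy of txn so the sketch runs standalone.
class txn_min;
  rand bit is_err;
  constraint err_c { is_err dist { 1 := 1, 0 := 9 }; }
  static int unsigned hist_err[2];
  function void post_randomize();
    hist_err[is_err]++;
  endfunction
endclass

module rate_check_demo;
  initial begin
    automatic txn_min x = new();
    repeat (10000)
      if (!x.randomize()) $fatal(1, "randomize failed");
    // The band is an assumed tolerance; size it to your sample count.
    void'(rate_check::check("is_err", txn_min::hist_err[1], 10000, 8.0, 12.0));
  end
endmodule
```

The band should be wide enough to absorb sampling noise. At 10,000 samples and a 10% target, the standard deviation is about 0.3 percentage points, so 8%..12% is deliberately loose.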

Cause 1: implication skew

The solver picks uniformly among solutions of the whole constraint system — not uniformly per variable. Constraint structure therefore shapes marginal distributions in non-obvious ways. The classic case is an implication where one branch admits vastly more solutions than the other.

systemverilog
class skewed;
  rand bit        small;
  rand bit [9:0]  val;            // 0..1023
  constraint c { small -> val < 4; }
endclass
// Naive expectation: small is a bit, so ~50/50.
// Reality: count the solution space.
//   small=1 : val in {0,1,2,3}       ->    4 solutions
//   small=0 : val in {0..1023}       -> 1024 solutions
//   total 1028; P(small=1) = 4/1028 ~ 0.4%
// The implication did not "set" anything - it carved the solution
// space so that small=1 is 256x rarer than small=0.

class fixed;
  rand bit        small;
  rand bit [9:0]  val;
  constraint c { small -> val < 4; }
  constraint order { solve small before val; }
  // Now: small chosen first ~50/50 among ITS values,
  // then val solved within the chosen branch.
endclass

solve small before val changes probability, not legality: the solution set is identical, but the solver first picks small (roughly uniformly over its feasible values) and then solves val given that choice. This restores the intended ~50/50. The interview-grade explanation is the counting argument above: being able to compute 4/1028 on a whiteboard is exactly what separates “I know solve-before exists” from “I understand the solver.” Note that randc variables are always solved before all other variables and may not appear in a solve...before list, and that the LRM guarantees dist weights only in the absence of other constraints, so both interact with explicit orderings; keep orderings minimal and purposeful.
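The counting argument can also be checked mechanically by enumerating every (small, val) pair and keeping the legal ones. The sketch below does this with plain loops and no solver; the function name count_solutions and the module name are hypothetical, chosen only for this illustration.

```systemverilog
// Brute-force count of legal (small, val) pairs for: small -> val < 4
// Hypothetical helper; no constraint solver is involved.
function automatic int unsigned count_solutions(bit small);
  int unsigned n = 0;
  for (int v = 0; v < 1024; v++)       // val is 10 bits: 0..1023
    if (!small || v < 4)               // a -> b is equivalent to !a || b
      n++;
  return n;
endfunction

module count_demo;
  initial begin
    automatic int unsigned n1 = count_solutions(1);   // 4
    automatic int unsigned n0 = count_solutions(0);   // 1024
    $display("small=1: %0d solutions, small=0: %0d solutions", n1, n0);
    // Uniform choice over joint solutions gives 4/1028.
    $display("P(small=1) without ordering = %0.3f%%", 100.0 * n1 / (n0 + n1));
  end
endmodule
```

The same enumeration works for any small constraint set, which makes it a quick sanity check before reaching for solve...before.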


Causes 2-4: eaten soft, dist vs ==, width/sign truncation

systemverilog
// CAUSE: soft constraint silently discarded
class cfg;
  rand bit [7:0] qdepth;
  constraint typical_c { soft qdepth == 8; }      // intended default
endclass
// Elsewhere a test adds:  assert(c.randomize() with { qdepth > 4; });
// qdepth > 4 is hard and CONSISTENT with qdepth==8... soft survives, fine.
// But:                    assert(c.randomize() with { qdepth > 8; });
// now soft qdepth==8 CONFLICTS -> dropped ENTIRELY -> qdepth uniform
// in [9:255]. Nobody is told. Histogram shows mean ~132, not 8-ish.

// CAUSE: dist is ignored for variables that are also pinned
class d1;
  rand bit [3:0] x;
  constraint c1 { x dist { 0 := 8, [1:15] :/ 2 }; }
endclass
// elsewhere: d.randomize() with { x == 5; }  -> dist irrelevant (single
// solution). dist weights only shape choice when alternatives exist.

// CAUSE: width/sign truncation distorts the range
class w1;
  rand bit [3:0] len;                  // 4 bits: 0..15
  constraint c { len inside {[8:20]}; }
  // [16:20] is unrepresentable in 4 bits; effective range is [8:15].
  // Histogram shows nothing above 15 - "wrong distribution" that is
  // really a width bug. Signed/unsigned mixes cause similar surprises:
  rand byte signed off;
  constraint c2 { off < 10; }          // legal set is -128..9: 128 of 138 values (~93%) are negative
endclass

Each of these produces a healthy return value and legal-looking fields. The eaten-soft case is the most dangerous in layered testbenches: the base class author believes the default holds; the test author believes their override is narrow; the discard is total and silent. The histogram is the only witness.
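One way to give the eaten-soft case a witness is to count how often the soft default actually held. The sketch below is a minimal illustration, assuming a copy of cfg named cfg_mon with two hypothetical static counters (total, soft_hits) and a hypothetical report() function; none of these names come from the original bench.

```systemverilog
class cfg_mon;
  rand bit [7:0] qdepth;
  constraint typical_c { soft qdepth == 8; }      // intended default

  // Hypothetical instrumentation: how often did the soft default survive?
  static int unsigned total;
  static int unsigned soft_hits;

  function void post_randomize();
    total++;
    if (qdepth == 8) soft_hits++;
  endfunction

  // Print the hit count for one experiment, then reset the counters.
  static function void report(string tag);
    $display("%s: soft default held in %0d of %0d randomizations",
             tag, soft_hits, total);
    total     = 0;
    soft_hits = 0;
  endfunction
endclass

module soft_demo;
  initial begin
    automatic cfg_mon c = new();
    // Compatible override: the soft default can still be satisfied.
    repeat (1000)
      if (!c.randomize() with { qdepth > 4; }) $fatal(1, "randomize failed");
    cfg_mon::report("qdepth > 4");
    // Conflicting override: the soft default is dropped on every call.
    repeat (1000)
      if (!c.randomize() with { qdepth > 8; }) $fatal(1, "randomize failed");
    cfg_mon::report("qdepth > 8");
  end
endmodule
```

A hit count that falls from every call to zero between two tests is the signature of a discarded soft constraint, and it shows up without anyone inspecting individual values.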


Worked example: skewed dist plus the fix

systemverilog
// SYMPTOM: error-injection rate measured at ~0.1%, intended 10%.
class bus_txn;
  rand bit        err;
  rand bit [7:0]  addr;
  constraint err_rate_c { err dist { 1 := 1, 0 := 9 }; }
  constraint err_addr_c { err -> addr == 8'hFF; }   // errors hit one addr
endclass
// DIAGNOSIS by counting (the dist competes with implication structure):
//   err=1 admits 1 addr value; err=0 admits 256.
//   The LRM guarantees dist weights only in the absence of other
//   constraints, so the 9:1 weighting can end up fighting a 1-vs-256
//   solution-count imbalance. Tool behavior varies; on an affected
//   solver the measured rate collapses far below 10%.
// FIX: decouple the choice of err from addr's solution count.
class bus_txn_fixed;
  rand bit        err;
  rand bit [7:0]  addr;
  constraint err_rate_c { err dist { 1 := 1, 0 := 9 }; }
  constraint err_addr_c { err -> addr == 8'hFF; }
  constraint order_c    { solve err before addr; }
  // err picked first honoring its dist (10%), then addr solved
  // in the chosen branch. Histogram confirms ~10%.
endclass
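Because the unfixed behavior is tool-dependent, the safest move is to measure both variants side by side on your own simulator. The sketch below is a minimal harness under stated assumptions: bus_txn_m and bus_txn_fixed_m are trimmed copies of the classes above, and count_err() is a hypothetical helper, not part of the original example.

```systemverilog
// Trimmed copies of the worked-example classes so the sketch runs standalone.
class bus_txn_m;
  rand bit        err;
  rand bit [7:0]  addr;
  constraint err_rate_c { err dist { 1 := 1, 0 := 9 }; }
  constraint err_addr_c { err -> addr == 8'hFF; }
endclass

class bus_txn_fixed_m extends bus_txn_m;
  constraint order_c { solve err before addr; }
endclass

// Hypothetical helper: randomize t n times, return how many had err == 1.
function automatic int unsigned count_err(bus_txn_m t, int unsigned n);
  int unsigned hits = 0;
  repeat (n) begin
    if (!t.randomize()) $fatal(1, "randomize failed");
    hits += t.err;
  end
  return hits;
endfunction

module err_rate_demo;
  localparam int unsigned N = 10000;
  initial begin
    automatic bus_txn_m       plain   = new();
    automatic bus_txn_fixed_m ordered = new();
    int unsigned h_plain, h_ordered;
    h_plain   = count_err(plain,   N);
    h_ordered = count_err(ordered, N);
    // The unfixed rate depends on the simulator; compare it to the 10% intent.
    $display("without solve-before: %0.2f%% errors", 100.0 * h_plain   / N);
    $display("with solve-before:    %0.2f%% errors", 100.0 * h_ordered / N);
  end
endmodule
```

If the two rates already agree on your tool, the ordering constraint is harmless. If they differ, the harness has reproduced the symptom in seconds.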

Interview angle

Distribution questions are where interviewers separate tiers. Tier 1 knows dist syntax (:= vs :/). Tier 2 can explain why “small -> val < 4” makes small almost never 1, with the solution-counting argument. Tier 3 knows the operational discipline: never trust a distribution you have not histogrammed; put a covergroup on rand fields in every long-lived bench; know that soft constraints are dropped wholesale on conflict; and recognize width truncation masquerading as skew. Deliver the counting argument with actual numbers (4 versus 1024) and the solve-before fix, then add the measurement discipline, and you have answered at tier 3.

Key takeaways

  • Distribution bugs never fail a test — only a histogram or covergroup on rand fields exposes them.

  • The solver is uniform over joint solutions, not per variable — implications skew marginals by solution count.

  • solve...before changes probability (pick order), never legality (solution set).

  • A conflicting soft constraint is discarded entirely and silently — the histogram is the only witness.

  • Check declared widths and signedness before blaming the solver — unrepresentable ranges silently clip.

Common pitfalls

  • Trusting dist weights without measuring — implications and solution-count imbalance can swamp them.

  • Adding solve...before everywhere 'for safety' — it reshapes distributions and can hide the real bug.

  • Putting a hard == where a dist or soft default belongs — kills all variation downstream.

  • Constraint ranges exceeding the declared bit width — the clipped histogram looks like solver bias.

  • Histogramming in one-off debug code only — make it a covergroup so regressions watch it forever.
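The width pitfall can be caught before any histogram is taken by comparing the constraint's bounds with what the field can represent. The sketch below is one possible guard, not an established idiom: the class w1_guarded, the typedef len_t, and the LEN_LO/LEN_HI/LEN_MAX parameters are hypothetical names introduced for this illustration.

```systemverilog
class w1_guarded;
  typedef bit [3:0] len_t;               // declared width lives in one place
  localparam int unsigned LEN_LO  = 8;
  localparam int unsigned LEN_HI  = 20;  // intended upper bound from the example
  localparam int unsigned LEN_MAX = (1 << $bits(len_t)) - 1;  // 15 for 4 bits

  rand len_t len;
  constraint c { len inside {[LEN_LO:LEN_HI]}; }

  function new();
    // Hypothetical guard: flag bounds the field cannot represent.
    if (LEN_HI > LEN_MAX)
      $warning("len range [%0d:%0d] exceeds the %0d-bit maximum of %0d",
               LEN_LO, LEN_HI, $bits(len_t), LEN_MAX);
  endfunction
endclass

module width_demo;
  initial begin
    automatic w1_guarded w = new();      // the guard fires at construction
    if (!w.randomize()) $fatal(1, "randomize failed");
    $display("len=%0d (never above %0d)", w.len, w1_guarded::LEN_MAX);
  end
endmodule
```

Naming the bounds as parameters is the design choice that makes the check possible: a literal range buried in a constraint cannot be compared against the field width.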