Part 3 · Constraint Randomization · Intermediate
Solver Performance
What makes constraints expensive — multiplication, division, wide vectors, cross-object coupling — how slow solves present, and restructuring fixes that work.
Why some constraints are expensive
Constraint solving is NP-hard in general; simulators ship clever engines (interval propagation, SAT/BDD hybrids) that handle typical verification constraints in microseconds — until a constraint lands outside the engine's easy cases. The expensive shapes are predictable: nonlinear arithmetic (multiplication or division between two random variables ), modulo by a random variable , very wide vectors in arithmetic relations, large unique sets , and cross-object coupling that fuses hundreds of variables into one giant problem. Multiplication by a constant is cheap; a * b == c with all three random is factoring — genuinely hard.
class costly;
rand bit [31:0] a, b, c;
rand bit [63:0] wide;
constraint c_hard {
a * b == c; // EXPENSIVE: rand*rand (factoring)
wide % a == 3; // EXPENSIVE: modulo by a rand var
c / b inside {[2:9]}; // EXPENSIVE: rand/rand division
}
constraint c_cheap {
a * 4 == c; // cheap: constant multiply = shift
wide[1:0] == 2'b11; // cheap: bit-slice equality
b inside {[1:100]}; // cheap: interval
}
endclassCONSTRAINT COST LADDER (typical engines)
cheap | equality, inside ranges, bit-slices,
| add/sub, multiply/divide by CONSTANT
--------+---------------------------------------------
medium | implications, foreach over moderate arrays,
| sum() aggregates, small unique sets
--------+---------------------------------------------
costly | rand*rand multiply, rand modulo/divide,
| wide-vector (>64b) arithmetic relations,
| unique over large tightly-ranged sets
--------+---------------------------------------------
blowup | cross-object webs (100s of coupled vars),
| nested implications forming case explosions,
| aggregate equations over huge arrays
Cost is also MULTIPLICATIVE: a medium constraint
inside a foreach over 1000 elements is 1000 mediums
fused into one problem.Recognizing a slow solve
Symptoms: simulation wall-clock time grows while simulated time barely advances; profiles show time inside randomize() rather than evaluation or PLI; a specific test hangs at one sequence's generation loop; or memory climbs during randomization (BDD blowup). Measure before fixing — wrap suspicious calls with a realtime delta and log outliers:
class timed_gen;
rand my_txn t;
task gen_one();
realtime t0, t1;
t0 = $realtime;
if (!t.randomize()) $error("randomize failed");
t1 = $realtime;
// $realtime measures SIM time; for wall-clock use vendor
// profiling (e.g. simulator's solver-profile switches) or
// $system("date +%s%N") bracketing in extreme cases.
if (t1 - t0 > 0) $display("solver consumed sim-time?!");
endtask
endclass
// Practical measurement: every major simulator has a solver
// profile mode that reports per-constraint-block solve time and
// failure/retry counts. Run it FIRST — guessing the hot
// constraint by eye fails routinely. Typical switches live under
// names like "solver profile" / "constraint profile" in the
// vendor docs (kept generic here deliberately).The single most useful vendor-neutral signal is the retry/failure count : a constraint set that frequently fails and re-solves (e.g. randc interactions, soft-constraint backtracking) costs multiples of its nominal solve time.
Restructuring fixes that actually work
1. Precompute in pre_randomize / move work out of the solver
class fix_precompute;
rand bit [31:0] addr;
bit [31:0] base; // state var — frozen input
bit [31:0] region_tbl[8]; // lookup table, filled procedurally
function void pre_randomize();
base = select_base_region(); // arbitrary procedural code
endfunction
constraint c {
// solver sees base as a CONSTANT — cheap interval math
addr inside {[base : base + 32'h0FFF]};
}
endclassAnything computable without the solver — region selection, weight tables, derived parameters — belongs in pre_randomize() or plain test code, entering constraints as frozen state. Lookup tables replace arithmetic: instead of constraining size * stride == span (rand*rand), pick an index and read both legal values from a precomputed table of valid pairs.
2. Narrow fields and replace nonlinear forms
class fix_shapes;
rand bit [7:0] pages; // was bit [31:0] — narrow to need
rand bit [2:0] stride_sel;
bit [15:0] stride_tbl[8] = '{1,2,4,8,16,32,64,128};
constraint c {
// was: stride inside {1,2,4,...} && total == pages * stride
// now: stride is a table lookup; multiply is gone from
// the solver entirely
pages inside {[1:200]};
}
function bit [15:0] stride(); return stride_tbl[stride_sel]; endfunction
endclass3. Split one giant randomize into stages
When a container couples hundreds of variables but the real dependencies are layered, split the solve: randomize configuration first, then randomize payloads with the config frozen as state. You lose only constraints that genuinely span the cut — and those you re-express on the second call via with clauses referencing the now-frozen stage-1 results. Use randomize(subset_var) argument lists to scope a call to specific variables when stages share a class.
Interview angle
What interviewers probe
"Which constraints are expensive?" — expected: rand*rand multiply/divide/modulo, wide-vector arithmetic, big tight unique sets, cross-object webs; constant arithmetic is cheap.
"A test's generation loop got 100x slower after a constraint edit — first step?" — expected: run the simulator's constraint/solver profile, look at per-block time and retry counts; do not guess.
"How do you fix a slow randomize?" — expected: precompute to state vars in pre_randomize, replace nonlinear relations with lookup tables, narrow field widths, split staged randomize calls.
"Why is a*b==c hard but a*4==c easy?" — expected: constant multiply is a linear/shift relation propagated by intervals; rand*rand is factoring — search, not propagation.
Mentioning retry counts as a profiling signal — not just solve time — is a senior tell: it shows you know failed-and-retried solves (soft constraints, randc interplay) are a hidden multiplier.
Key takeaways
Expensive shapes are predictable: rand*rand nonlinear arithmetic, rand modulo, wide vectors, big tight unique sets, cross-object coupling.
Profile before fixing — vendor solver-profile modes report per-block time and retry counts; guessing fails.
Move computable work to pre_randomize state vars and lookup tables — the solver should see constants, not derivations.
Narrow fields to the width the spec needs; width inflates every arithmetic relation's cost.
Split layered problems into staged randomize calls with earlier results frozen as state.
Common pitfalls
Constraining products/quotients of two rand variables when a lookup table of legal pairs would do.
32- or 64-bit fields for quantities that fit in 8 — every relation pays the width tax.
One mega-randomize over a deep object tree when dependencies are actually layered — fuse only what truly couples.
Optimizing by intuition without the solver profile — the hot constraint is rarely the one you suspect.
Ignoring retry counts — heavy soft-constraint or randc backtracking multiplies cost invisibly.