Part 4 · Assertions (SVA) · Intermediate

Q&A: Debug & Formal

Triaging an assertion failure at time T, auditing assertions that never fail, sim vs formal semantic differences, assume in formal, and bounded proofs vs liveness.

Q: An assertion fails at time T. Walk me through your debug flow.

Direct answer: triage before diving. Step one, read the assertion as a spec sentence and identify which obligation failed — antecedent matched when? Consequent expected what, when? Step two, locate the attempt start time — for a multi-cycle property the failure time T is the end of the story, not the beginning; the attempt began at T minus the property depth. Step three, examine sampled values over that window in the assertion-debug view (not raw cursor reads). Step four — the fork: decide RTL bug vs assertion bug vs environment bug , in that order of hope but reverse order of prior probability for a new assertion: a freshly written property failing on mature RTL is usually the property (off-by-one, level antecedent, missing disable); a mature property failing on fresh RTL is usually the RTL.

diagram

FAILURE AT TIME T — TRIAGE TREE

  attempt started at T - depth ──► re-read trace from THERE
        │
        ├─ antecedent matched as intended?
        │     NO → retrigger storm / wrong edge / sampling slip
        │           → assertion bug, fix antecedent
        │
        ├─ consequent expectation correct per spec?
        │     NO → off-by-one, wrong operator ([=] vs [->])
        │           → assertion bug, fix consequent
        │
        ├─ reset/abort in window?
        │     YES → missing or wrong-domain disable iff
        │
        └─ all three sound → RTL or stimulus bug:
              walk DUT inputs at attempt start; is the
              VIOLATING stimulus legal? illegal → env bug
              (or missing assume); legal → RTL bug, file it.

Follow-up: 'The failure is intermittent across seeds — what changes?'

Intermittence means a stimulus-shape dependency: overlapping transactions (per-attempt aliasing, local-variable misuse), mid-sim resets hitting in-flight attempts, or boundary timing (latency exactly at the bound). Reproduce with the failing seed, then write a directed cover for the triggering shape so the scenario is pinned in regression forever.

Junior vs senior answer

Junior: opens the waveform at time T and stares at signals.
Senior: rewinds to attempt start, argues from sampled values, and runs the assertion-bug-vs-RTL-bug fork explicitly with priors based on which side is newer.

Q: An assertion has never failed in six months. Suspicious?

Direct answer: yes — silence is a claim that must be audited, because a never-failing assertion is either guarding well or checking nothing, and the dashboard cannot tell the difference. The audit: (1) non-vacuous activity — pull pass-vacuous vs pass-non-vacuous counts; zero non-vacuous = dead antecedent. (2) cover hits — its paired covers, especially boundary covers, must be hitting. (3) fault injection — break the RTL in the way the assertion guards (force a late ack, corrupt a state encoding) and confirm it fires; this is the only positive proof the checker works end to end. (4) history check — was it ever failing pre-silicon? An assertion that caught bugs in bring-up and is quiet now earns more trust than one that has never spoken.

diagram

SILENT ASSERTION AUDIT

  non-vacuous count > 0 ?  ── NO ──► dead antecedent:
        │ YES                        typo / impossible guard /
        ▼                            stimulus never reaches it
  boundary covers hit ?    ── NO ──► bound never stressed:
        │ YES                        add directed test at N
        ▼
  fires on injected bug ?  ── NO ──► checker broken end-to-end
        │ YES                        (wrong signals bound, wrong
        ▼                             clock, $assertoff leak...)
  TRUSTED — silence now means something.

Junior vs senior answer

Junior: 'no failures means it's working'.
Senior: silence is ambiguous between guarding and dead; runs the four-step audit ending in fault injection as ground truth.

Q: Same SVA file in simulation and formal — what behaves differently?

Direct answer: the property semantics are shared; the trace universe and verdict semantics differ. Simulation evaluates attempts along one finite stimulated trace : failures are real but coverage of behavior is only as good as the stimulus, weak pending attempts dissolve at $finish, and X values flow as in 4-state RTL. Formal evaluates over all reachable traces from the initial state under the assumes: a 'pass' is a proof over every legal behavior, liveness and strong operators are fully meaningful on infinite traces, free inputs take all values at once — and a single missing assume means the tool will gleefully use stimulus your testbench would never generate.

diagram

SIMULATION              FORMAL
  trace set           one, finite, driven     all reachable, exhaustive
  PASS means          not violated HERE       cannot be violated (proof)
  weak pending @end   quietly discharged      n/a (infinite traces)
  s_eventually        cannot fail finitely    real verdict (w/ fairness)
  inputs              as driven by TB         free unless assumed
  X handling          4-state propagation     usually 2-state model +
                                              explicit X-injection apps
  failure output      log + waveform          counterexample trace (CEX),
                                              minimal-ish, replayable

Follow-up: 'First thing you check when formal fails a property that sim never failed?'

Whether the counterexample stimulus is legal — nine times out of ten the CEX uses an input sequence the real environment cannot produce, which means a missing or too-weak assume. Validate the CEX against the interface spec before touching RTL; if the stimulus IS legal, formal just earned its license fee.

Junior vs senior answer

Junior: 'formal is exhaustive, sim is not'.
Senior: trace-universe framing, the verdict-semantics differences (weak endings, liveness), free inputs vs driven inputs, and the CEX-legality-first triage habit.

Q: What does assume do in formal, exactly — and what can go wrong?

Direct answer: an assume property restricts the trace universe — the prover only considers behaviors satisfying every assume, so they encode the environment contract (input protocol legality, stable configuration, fairness of external responders). What goes wrong is over-constraint : each unnecessary assume removes legal behaviors from consideration; remove the very behavior containing the bug and the assert 'proves' — a false negative wearing a green checkmark. The pathological end state is contradictory assumes : no trace satisfies them all, every property is vacuously proven, and the entire formal run is meaningless while looking perfect.

systemverilog

// environment contract for a request interface under proof:
m_req_stable:  assume property (@(posedge clk) disable iff (!rst_n)
  (req && !gnt) |=> req);          // requester holds until granted
m_no_x_in:     assume property (@(posedge clk)
  !$isunknown({req, cfg_mode}));
m_cfg_static:  assume property (@(posedge clk) disable iff (!rst_n)
  $stable(cfg_mode));              // config frozen during proof

// the health checks that keep the assumes honest:
c_req_alive:   cover property (@(posedge clk) $rose(req));
c_b2b_alive:   cover property (@(posedge clk)
  gnt ##1 req);                     // interesting traffic still exists?

Follow-up: 'How do you systematically detect over-constraint?'

Cover sanity set: covers for the interesting stimulus classes must remain reachable under the assumes — an unreachable cover is the alarm.
Many tools detect outright assume contradictions (dead-end states) — run that check every session, not once.
Assume-minimization discipline: each assume cites the spec clause licensing it; 'the proof wouldn't converge otherwise' is a finding to escalate, not a justification.
Cross-check in sim: assumes run as checkers there — your own testbench violating an assume means the contract is wrong, not the testbench.

Junior vs senior answer

Junior: 'assume constrains the inputs'.
Senior: trace-universe restriction, over-constraint as silent false proof, contradiction as total vacuity, and the cover-based health regimen.

Q: Bounded proof — what does it actually guarantee? And what do we do with liveness in simulation?

Direct answer: a bounded proof (BMC to depth K) guarantees no counterexample exists within K cycles of the initial state — nothing beyond. A bug whose shortest manifestation needs K+1 cycles (deep FIFO fill, counter wrap, slow-grant arbitration corner) is untouched. Bounded results are evidence with a radius, not proofs; report them as 'clean to depth K' with K stated next to the design's natural depths (FIFO depths, counter periods, timeout values). Full proofs need unbounded techniques — induction, invariants, abstraction — and often strengthening work when induction fails.

And liveness in simulation

Liveness ('eventually X') cannot fail at any finite time, so in simulation it is either converted to bounded form — replace s_eventually with a protocol-derived window ##[1:N], which CAN fail and encodes a real latency requirement — or delegated to formal where infinite traces plus fairness assumptions give 'eventually' real meaning. The standing pattern: bounded version in the sim regression, strong unbounded version in the formal testbench, both generated from the same spec rule.

diagram

rule: "every request is eventually granted"

  sim regression:    $rose(req) |-> ##[1:64] (gnt || !req)
                     64 = arbiter worst case from the spec
                     → CAN fail; runs every night

  formal testbench:  $rose(req) |-> s_eventually (gnt || !req)
                     + fairness assumes on other masters
                     → true liveness; proven once per RTL change

  same plan row, two artifacts, both traceable.

Follow-up: 'Induction failed on your unbounded proof. Now what?'

Failure means the property is not inductive, not that it is false — the tool found an unreachable 'bad' state that one step escapes from.
Add strengthening invariants (helper assertions about the reachable state space: counter ranges, onehot encodings, FIFO occupancy bounds) until the induction closes.
Or reduce the radius honestly: abstraction, case-splitting, or accepting a deep bounded result with the depth documented.

Junior vs senior answer

Junior: 'bounded means it checked K cycles' and 'liveness doesn't work in sim' — both true, neither actionable.
Senior: radius-of-evidence framing with K compared to design depths, the bounded-in-sim/strong-in-formal twin pattern, and the strengthening-invariant response to failed induction.

Key takeaways

Debug from the attempt's START, with sampled values — failure time T is the end of the story.
A never-failing assertion is unverified until fault injection proves it can fire.
Formal pass = proof over all assumed-legal traces; the first triage question for any CEX is whether its stimulus is legal.
Bounded proofs have a radius; liveness lives in formal — simulation gets the protocol-derived bounded twin.

Common pitfalls

Triaging multi-cycle failures by staring at time T — the cause is at attempt start, cycles earlier.
Trusting six months of silence without a mutation test — dead checkers look identical to perfect designs.
Fixing formal CEXs by editing RTL before validating the stimulus was legal — missing assumes masquerade as bugs.
Reporting 'proven' without distinguishing bounded depth-K from full proof — the radius is the result.

Practice this lesson