Part 8 · Senior & Interview Prep · Intermediate

Milestones & Progress Metrics

Bring-up to feature-complete to closure to bug-rate convergence, metric dashboards, and the ways metrics lie.

The four-phase verification timeline

Verification projects move through recognizable phases, each with its own exit criterion . Knowing which phase you are in tells you what to measure: counting coverage during bring-up is noise; counting tests written during closure is noise.

diagram

VERIFICATION TIMELINE

  BRING-UP          FEATURE-COMPLETE      COVERAGE CLOSURE    CONVERGENCE
  ─────────────────►─────────────────────►───────────────────►────────────►
  first txn         every plan row has    merged coverage     bug rate
  through DUT,      a test exercising     hits goal, holes    flat-lines
  smoke passes      it at least once      analyzed/waived     near zero
                                                              
  exit: smoke       exit: 100% plan       exit: coverage      exit: N weeks
  green on every    rows "passing"        goal + hole         clean regression
  checkin                                 review done         + bug review

  bugs/week:   ▁▃▅█ ramps up      █▆▅▃ peaks then falls    ▂▁▁ tail
               (TB bugs first,    (random finds the         (each late bug
                then DUT bugs)     weird ones)               is a red flag)

What each phase is for

Bring-up — get one transaction end to end; fix TB bugs; establish the regression skeleton. Days to a couple of weeks.
Feature-complete — write tests/sequences until every plan row is exercised. Bug rate climbs — that is success, not failure.
Coverage closure — seeds and directed fills until merged functional coverage hits goal; every hole gets a written disposition.
Bug-rate convergence — regression soak. The exit isn't a date; it is consecutive clean weeks at full regression volume.

The weekly dashboard

Four numbers, tracked weekly, tell the real story — and the trend matters more than the value . A flat 70% coverage for three weeks is a louder alarm than a 50% that climbed from 30%.

diagram

WEEKLY METRICS DASHBOARD

  Metric            │ Week 6 │ Week 7 │ Week 8 │ Read it as...
  ──────────────────┼────────┼────────┼────────┼──────────────────────────
  Regression pass % │  91%   │  96%   │  94%   │ health of TB + RTL now
  Functional cov %  │  61%   │  72%   │  74%   │ progress toward closure
  (merged)          │        │        │        │ slope flattening → holes?
  Open bugs (P1/P2) │  9/14  │  6/16  │  3/11  │ P1 trend gates milestones
  Code churn        │  high  │  med   │  med   │ RTL still moving →
  (RTL commits/wk)  │        │        │        │ coverage results are stale

  Plus per-phase extras:
  - bring-up: tests passing smoke
  - closure:  uncovered bins by feature area, seeds/night
  - convergence: weeks since last P1, regression runtime

When metrics lie

Every metric on the dashboard can be gamed — usually accidentally. A senior engineer reads metrics adversarially : for each number, ask what failure mode would still produce this value.

100% pass rate lies when checkers are weak — a test with no scoreboard passes everything. Audit: inject a bug, confirm a test fails.
Coverage % lies when bins are coarse — one bin per feature closes fast and proves little. Audit during hole review, not after.
Coverage % also lies when sampling is wrong — sampling during reset or on driver intent inflates hits on scenarios the DUT never saw.
Falling bug rate lies when stimulus is stale — random with converged constraints stops exploring. Check new-seed coverage delta, not just totals.
Pass rate lies when failing tests get quietly disabled — track the test count alongside the percentage.

diagram

METRIC vs REALITY

  Dashboard says           Could actually mean
  ──────────────────────── ─────────────────────────────────────
  pass rate 100%       →   checkers too weak to fail
  coverage 95%         →   trivial bins; real corners unmeasured
  bug rate ~0          →   stimulus exhausted, not design clean
  churn low            →   designers stopped committing... 
                           or stopped fixing

  Defense: pair every metric with its audit
  pass rate   ↔ bug-injection spot checks
  coverage    ↔ bin-quality review in hole analysis
  bug rate    ↔ stimulus freshness (new seeds finding new bins?)

Key takeaways

Four phases — bring-up, feature-complete, closure, convergence — each with its own exit criterion.
Track pass rate, merged coverage, open bugs, and churn weekly; read trends, not snapshots.
Sign-off needs converged bug rate AND closed coverage — either alone is insufficient.
Read every metric adversarially: what failure would still produce this number?

Common pitfalls

Celebrating 100% pass rate without checker audits — silence is not correctness.
Reporting single-seed coverage instead of merged regression coverage.
Treating a falling bug rate as convergence while constraints have stopped exploring.
Letting milestone dates, not exit criteria, declare a phase done.

Practice this lesson