Part 4 · TLM & Analysis · Intermediate

FIFO Debug Patterns: Stuck Consumer, Empty Queue Hangs, and Runaway Growth

Systematic triage for FIFO-related liveness bugs: empty get hangs, silent producer disconnects, dead consumers, and unbounded queue growth.

Recognizing FIFO failure signatures

Most FIFO bugs are liveness mismatches: data is not flowing as expected, yet the simulation often raises no compile or connection errors.

diagram
[TLM] common signatures

Signature A: consumer blocked forever on get()
  likely no producer activity or missing connect

Signature B: FIFO used() grows unbounded
  producer active, consumer stalled/dead/too slow

Signature C: intermittent timeout only on stress seeds
  transient occupancy spikes exceed depth or service assumptions

Signature D: no errors, no compares, test ends clean
  consumer thread never started or objection window mis-scoped

  • Always ask: is producer alive, is consumer alive, is wiring alive?

  • FIFO occupancy trend is the fastest discriminant for root-cause direction.

  • Failures that surface only as a timeout usually mean liveness instrumentation is missing: the log cannot show which side stopped first.
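
Signature D is easy to miss because nothing fails. A minimal sketch of a guard, assuming a scoreboard that drains act_fifo in run_phase: count compares, then report an error in check_phase when the count is zero or items are still queued. The names guarded_scoreboard and num_compares, and the bus_txn definition, are hypothetical.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

// Hypothetical transaction type, defined only to keep the sketch self-contained.
class bus_txn extends uvm_sequence_item;
  int unsigned id;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

class guarded_scoreboard extends uvm_scoreboard;
  `uvm_component_utils(guarded_scoreboard)

  uvm_tlm_analysis_fifo #(bus_txn) act_fifo;
  int unsigned num_compares;  // incremented once per dequeued item

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    act_fifo = new("act_fifo", this);
  endfunction

  task run_phase(uvm_phase phase);
    bus_txn t;
    forever begin
      act_fifo.get(t);
      num_compares++;
      // compare t against the reference model here
    end
  endtask

  // Turn "test ends clean with nothing checked" into a visible failure.
  function void check_phase(uvm_phase phase);
    super.check_phase(phase);
    if (num_compares == 0)
      `uvm_error("FIFO_DBG", "scoreboard performed zero compares")
    if (act_fifo.used() != 0)
      `uvm_error("FIFO_DBG",
        $sformatf("%0d items left unconsumed in act_fifo", act_fifo.used()))
  endfunction
endclass
```

A zero-compare error with used() at 0 points upstream; leftover items point at the consumer or at an objection window that closed too early.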


Stuck at get(): empty FIFO hang triage

When the consumer blocks at get() forever, investigate the upstream path before touching queue internals.

  1. Confirm producer thread is running and reaching put/write calls.

  2. Confirm the connect_phase links are present and that both ends use the same type parameter #(T).

  3. Confirm phase objections keep producer alive long enough to emit transactions.

  4. Add heartbeat logs around producer enqueue and consumer dequeue sites.

  5. Use short deterministic test to isolate first missing enqueue event.

systemverilog
task run_phase(uvm_phase phase);
  bus_txn t;
  forever begin
    `uvm_info("FIFO_DBG", "consumer waiting on get()", UVM_HIGH)
    act_fifo.get(t);
    `uvm_info("FIFO_DBG",
      $sformatf("consumer got id=%0d used=%0d", t.id, act_fifo.used()), UVM_HIGH)
    compare_against_model(t);
  end
endtask
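
The producer side needs the matching heartbeat. A sketch, assuming the producer is a monitor that publishes through a uvm_analysis_port. The names bus_monitor, collect(), and num_written are illustrative, and collect() fabricates one transaction every 10 ns in place of real interface sampling.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

// Hypothetical transaction type, defined only to keep the sketch self-contained.
class bus_txn extends uvm_sequence_item;
  int unsigned id;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

class bus_monitor extends uvm_monitor;
  `uvm_component_utils(bus_monitor)

  uvm_analysis_port #(bus_txn) ap;
  int unsigned num_written;  // running count of published transactions

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    ap = new("ap", this);
  endfunction

  // Placeholder: a real monitor would sample the bus interface here.
  virtual task collect(output bus_txn t);
    #10ns;
    t = bus_txn::type_id::create("t");
    t.id = num_written;
  endtask

  task run_phase(uvm_phase phase);
    bus_txn t;
    forever begin
      collect(t);
      num_written++;
      // Log immediately before the enqueue so a missing line localizes the break.
      `uvm_info("FIFO_DBG",
        $sformatf("producer write id=%0d total=%0d", t.id, num_written), UVM_HIGH)
      ap.write(t);
    end
  endtask
endclass
```

Comparing the last producer id with the last consumer id in the log identifies the first transaction that was written but never dequeued, or never written at all.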
diagram
[TLM] empty-queue hang flow

consumer get() blocked
   │
   ├─ producer logs absent?
   │      -> producer never ran or exited early
   │
   ├─ producer logs present but fifo used stays 0?
   │      -> wiring/connect/type issue
   │
   └─ producer logs and used rises briefly then stops?
          -> producer path terminated or upstream source dried up
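
The wiring branch can be checked mechanically. An analysis port accepts zero subscribers, which is why a missing connect() stays silent. A sketch, assuming the port and the FIFO are visible from one environment; wiring_check_env and mon_ap are hypothetical stand-ins for the real environment and monitor port. It relies on uvm_port_base::size(), which reports the number of connected implementations once bindings are resolved.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

// Hypothetical transaction type, defined only to keep the sketch self-contained.
class bus_txn extends uvm_sequence_item;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

class wiring_check_env extends uvm_env;
  `uvm_component_utils(wiring_check_env)

  uvm_analysis_port     #(bus_txn) mon_ap;    // stands in for monitor.ap
  uvm_tlm_analysis_fifo #(bus_txn) act_fifo;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    mon_ap   = new("mon_ap", this);
    act_fifo = new("act_fifo", this);
  endfunction

  function void connect_phase(uvm_phase phase);
    super.connect_phase(phase);
    // Removing this line compiles and runs without complaint; the check below catches it.
    mon_ap.connect(act_fifo.analysis_export);
  endfunction

  // size() is only meaningful from end_of_elaboration onward.
  function void end_of_elaboration_phase(uvm_phase phase);
    super.end_of_elaboration_phase(phase);
    if (mon_ap.size() == 0)
      `uvm_error("FIFO_DBG", {mon_ap.get_full_name(), " has no subscribers"})
  endfunction
endclass
```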

Runaway growth: producer outruns consumer

A monotonically increasing used() means the enqueue rate exceeds the dequeue rate over time, or the consumer has stopped entirely.

systemverilog
task fifo_health_monitor();
  int unsigned max_used;  // peak occupancy seen so far
  int unsigned cur_used;  // occupancy at this sample point
  forever begin
    #1000ns;
    // Sample once per interval and reuse the value below.
    cur_used = act_fifo.used();
    if (cur_used > max_used)
      max_used = cur_used;

    `uvm_info("FIFO_HEALTH",
      $sformatf("used=%0d max_used=%0d", cur_used, max_used),
      UVM_LOW)

    if (cur_used > 5000)
      `uvm_warning("FIFO_HEALTH", "occupancy exceeded threshold 5000")
  end
endtask
diagram
[TLM] growth diagnosis checklist

if used() rising:
  1) is consumer thread alive?
  2) is consumer blocked on external dependency?
  3) did consumer hit fatal/error and exit silently?
  4) is compare path unexpectedly O(N) per item?
  5) is test workload beyond originally sized depth assumptions?

Typical root causes

  • consumer waiting on second stream that stopped arriving.

  • expensive compare path added recently without throughput review.

  • missing forked consumer thread due to join/disable fork misuse.

  • unbounded uvm_tlm_analysis_fifo hiding a sustained rate mismatch until memory pressure appears.
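
For the thread-lifecycle causes, "is the consumer thread alive?" can be answered directly with the built-in SystemVerilog process class. A sketch, assuming the consumer loop is forked from run_phase; liveness_sb, consume_forever(), and consumer_alive() are hypothetical names.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

// Hypothetical transaction type, defined only to keep the sketch self-contained.
class bus_txn extends uvm_sequence_item;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

class liveness_sb extends uvm_scoreboard;
  `uvm_component_utils(liveness_sb)

  uvm_tlm_analysis_fifo #(bus_txn) act_fifo;
  process consumer_proc;  // stays null if the consumer was never forked

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    act_fifo = new("act_fifo", this);
  endfunction

  task consume_forever();
    bus_txn t;
    forever act_fifo.get(t);
  endtask

  task run_phase(uvm_phase phase);
    fork
      begin
        consumer_proc = process::self();  // capture a handle to this thread
        consume_forever();
      end
    join_none
  endtask

  // Returns 0 if the thread was never started, has finished, or was killed.
  function bit consumer_alive();
    if (consumer_proc == null) return 0;
    return !(consumer_proc.status() inside {process::FINISHED, process::KILLED});
  endfunction
endclass
```

Call consumer_alive() from a health monitor while run_phase is still active. After the phase ends UVM kills its threads, so a check made later reports the consumer as dead.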


Instrumentation patterns that pay off

diagram
[TLM] recommended instrumentation set

At enqueue:
  - txn id
  - used() after enqueue
  - producer timestamp

At dequeue:
  - txn id
  - used() after dequeue
  - dequeue timestamp

Derived:
  - residence time (dequeue - enqueue)
  - max occupancy per test
  - no-dequeue watchdog timer
systemverilog
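// Assumes bus_txn derives from uvm_tlm_generic_payload, the UVM class
// that provides set_extension() and get_extension().
class fifo_dbg_ext extends uvm_tlm_extension #(fifo_dbg_ext);
  `uvm_object_utils(fifo_dbg_ext)
  time enq_ts;
  time deq_ts;
  int unsigned seq_id;

  function new(string name = "fifo_dbg_ext");
    super.new(name);
  endfunction
endclass

function void stamp_enqueue(bus_txn t, int unsigned seq_id);
  fifo_dbg_ext ext;
  // get_extension() takes the extension type handle and returns the base
  // class, or null when no extension of that type is attached.
  void'($cast(ext, t.get_extension(fifo_dbg_ext::ID())));
  if (ext == null) begin
    ext = fifo_dbg_ext::type_id::create("ext");
    void'(t.set_extension(ext));
  end
  ext.enq_ts = $time;
  ext.seq_id = seq_id;
endfunction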
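
Residence time can also be derived without touching the transaction class. A sketch of a side table keyed by transaction id, assuming ids are unique while an item is in flight; fifo_residence_tracker, note_enqueue(), and note_dequeue() are hypothetical names.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

class fifo_residence_tracker extends uvm_object;
  `uvm_object_utils(fifo_residence_tracker)

  protected time enq_ts_by_id[int unsigned];  // enqueue time per in-flight id
  time max_residence;                         // worst case seen so far

  function new(string name = "fifo_residence_tracker");
    super.new(name);
  endfunction

  // Call at the enqueue site, just before the write/put.
  function void note_enqueue(int unsigned id);
    enq_ts_by_id[id] = $time;
  endfunction

  // Call at the dequeue site, just after get(). Returns the residence time,
  // or 0 when the id was never stamped at enqueue.
  function time note_dequeue(int unsigned id);
    time residence;
    if (!enq_ts_by_id.exists(id)) begin
      `uvm_warning("FIFO_DBG",
        $sformatf("dequeue of id=%0d with no enqueue stamp", id))
      return 0;
    end
    residence = $time - enq_ts_by_id[id];
    enq_ts_by_id.delete(id);
    if (residence > max_residence)
      max_residence = residence;
    `uvm_info("FIFO_DBG",
      $sformatf("id=%0d residence=%0t", id, residence), UVM_HIGH)
    return residence;
  endfunction

  // Entries still present at end of test were enqueued but never dequeued.
  function int unsigned num_in_flight();
    return enq_ts_by_id.num();
  endfunction
endclass
```

The trade-off against a per-transaction extension is one shared object that both the enqueue and dequeue sites must reach.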
diagram
[TLM] debug observability outcome

Without stamps:
  "scoreboard mismatch at 2.3ms" (hard to correlate)

With stamps:
  id=184 enq=1.2ms deq=2.3ms residence=1.1ms
  -> quickly reveals deep queue delay before compare

Watchdog design tips

  • trigger warning if no dequeue for configurable interval while producer active.

  • separate warning threshold from fatal threshold.

  • print last seen enqueue/dequeue IDs in watchdog report.
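
These tips can be combined into a polling watchdog. A sketch, assuming a time unit of 1 ns or finer and treating a non-empty FIFO as evidence that the producer is active; watchdog_sb and the knobs poll_interval, warn_after, and fatal_after are hypothetical. It reports only the last dequeued id, since the scoreboard in this sketch does not observe enqueues.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

// Hypothetical transaction type, defined only to keep the sketch self-contained.
class bus_txn extends uvm_sequence_item;
  int unsigned id;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

class watchdog_sb extends uvm_scoreboard;
  `uvm_component_utils(watchdog_sb)

  uvm_tlm_analysis_fifo #(bus_txn) act_fifo;

  time poll_interval = 1us;    // how often the watchdog samples
  time warn_after    = 10us;   // warning threshold
  time fatal_after   = 100us;  // separate, larger fatal threshold

  time         last_deq_ts;
  int unsigned last_deq_id;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    act_fifo = new("act_fifo", this);
  endfunction

  task consume();
    bus_txn t;
    forever begin
      act_fifo.get(t);
      last_deq_ts = $time;
      last_deq_id = t.id;
    end
  endtask

  task no_dequeue_watchdog();
    bit  pending;        // FIFO was non-empty at the previous sample
    time pending_since;  // first sample at which it was seen non-empty
    time ref_ts, idle;
    forever begin
      #(poll_interval);
      if (act_fifo.used() == 0) begin
        pending = 0;     // nothing waiting, so the consumer cannot be stalled
        continue;
      end
      if (!pending) begin
        pending = 1;
        pending_since = $time;
      end
      // Measure idle time from the later of: last dequeue, first pending sample.
      ref_ts = (last_deq_ts > pending_since) ? last_deq_ts : pending_since;
      idle   = $time - ref_ts;
      if (idle >= fatal_after)
        `uvm_fatal("FIFO_WDOG",
          $sformatf("no dequeue for %0t, used=%0d, last dequeued id=%0d",
                    idle, act_fifo.used(), last_deq_id))
      else if (idle >= warn_after)
        `uvm_warning("FIFO_WDOG",
          $sformatf("no dequeue for %0t, used=%0d, last dequeued id=%0d",
                    idle, act_fifo.used(), last_deq_id))
    end
  endtask

  task run_phase(uvm_phase phase);
    fork
      consume();
      no_dequeue_watchdog();
    join
  endtask
endclass
```

The warning repeats on every poll once the threshold is crossed; rate-limiting it is left out to keep the sketch short.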


End-to-end triage playbook

  1. Reproduce with shortest deterministic test preserving symptom.

  2. Enable enqueue/dequeue occupancy logs at moderate verbosity.

  3. Classify symptom: empty-hang, growth, intermittent pressure, or silent inactivity.

  4. Localize first break boundary (producer, connection, queue, consumer).

  5. Fix liveness root cause, then keep health monitors to prevent regression.

diagram
[TLM] one-page FIFO triage map

No data consumed
  ├─ used() == 0 always -> producer/connect issue
  └─ used() > 0         -> consumer issue

used() grows forever
  ├─ consumer dead      -> restart/fix thread lifecycle
  ├─ consumer blocked   -> resolve dependency wait
  └─ consumer too slow  -> optimize, resize, or split workload

intermittent full/drop
  ├─ realistic burst    -> adjust depth/policy
  └─ pathological burst -> debug source behavior
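
To tell a realistic burst from a pathological one, the full/drop branch needs numbers. A sketch, assuming a bounded uvm_tlm_fifo filled with the non-blocking try_put(); bounded_stage, push(), depth, num_dropped, and peak_used are hypothetical names.

```systemverilog
`include "uvm_macros.svh"
import uvm_pkg::*;

// Hypothetical transaction type, defined only to keep the sketch self-contained.
class bus_txn extends uvm_sequence_item;
  int unsigned id;
  `uvm_object_utils(bus_txn)
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

class bounded_stage extends uvm_component;
  `uvm_component_utils(bounded_stage)

  uvm_tlm_fifo #(bus_txn) req_fifo;
  int depth = 16;            // illustrative depth, not a recommendation
  int peak_used;             // highest occupancy observed after a put
  int unsigned num_dropped;  // items rejected because the FIFO was full

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    req_fifo = new("req_fifo", this, depth);
  endfunction

  // Non-blocking enqueue: try_put() returns 0 when the FIFO is full.
  function void push(bus_txn t);
    if (!req_fifo.try_put(t)) begin
      num_dropped++;
      `uvm_warning("FIFO_DBG",
        $sformatf("fifo full (size=%0d), dropped id=%0d, drops=%0d",
                  req_fifo.size(), t.id, num_dropped))
      return;
    end
    if (req_fifo.used() > peak_used)
      peak_used = req_fifo.used();
  endfunction

  function void report_phase(uvm_phase phase);
    super.report_phase(phase);
    `uvm_info("FIFO_DBG",
      $sformatf("peak_used=%0d depth=%0d drops=%0d",
                peak_used, depth, num_dropped), UVM_LOW)
  endfunction
endclass
```

A peak that sits just above the depth on a few seeds suggests resizing; a peak pinned at the depth with a steadily rising drop count suggests the source or the consumer needs debugging first.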

Key takeaways

  • FIFO debug starts with liveness classification, not random print spam.

  • Empty hangs point upstream; growth points downstream.

  • Occupancy, IDs, and timestamps make FIFO failures reproducible and explainable.

  • Keep lightweight health telemetry permanently in complex scoreboards.

Common pitfalls

  • Chasing DUT protocol logic before proving FIFO dataflow health.

  • Adding huge depth to hide growth instead of fixing stalled consumers.

  • Running with zero telemetry and relying only on final test timeout.