Part 4 · TLM & Analysis · Intermediate
FIFO Debug Patterns: Stuck Consumer, Empty Queue Hangs, and Runaway Growth
Systematic triage for FIFO-related liveness bugs: empty get hangs, silent producer disconnects, dead consumers, and unbounded queue growth.
Recognizing FIFO failure signatures
Most FIFO bugs are liveness mismatches: data is not flowing as expected, yet the simulation raises no compile or connection errors.
[TLM] common signatures
Signature A: consumer blocked forever on get()
  likely no producer activity or missing connect
Signature B: FIFO used() grows unbounded
  producer active, consumer stalled/dead/too slow
Signature C: intermittent timeout only on stress seeds
  transient occupancy spikes exceed depth or service assumptions
Signature D: no errors, no compares, test ends clean
  consumer thread never started or objection window mis-scoped
Always ask: is producer alive, is consumer alive, is wiring alive?
FIFO occupancy trend is the fastest discriminant for root-cause direction.
Timeout-only failures usually indicate missing liveness instrumentation.
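Signature D leaves no trace unless the testbench checks for it explicitly. As a minimal sketch, assuming a scoreboard that drains a single analysis FIFO, an end-of-test check can flag both "nothing was ever consumed" and "items were left behind". The class name `liveness_scoreboard`, the counter `n_compared`, and the `FIFO_LIVE` message ID are illustrative names, not part of UVM.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical scoreboard skeleton; T stands in for the transaction type.
class liveness_scoreboard #(type T = uvm_sequence_item) extends uvm_component;
  `uvm_component_param_utils(liveness_scoreboard #(T))

  uvm_tlm_analysis_fifo #(T) act_fifo;
  int unsigned n_compared;  // incremented once per consumed transaction

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    act_fifo = new("act_fifo", this);
  endfunction

  task run_phase(uvm_phase phase);
    T t;
    forever begin
      act_fifo.get(t);
      n_compared++;
      // the real compare would go here
    end
  endtask

  // Runs after run_phase has ended, so the counters are final.
  function void check_phase(uvm_phase phase);
    super.check_phase(phase);
    if (n_compared == 0)
      `uvm_error("FIFO_LIVE", "no transactions were consumed during the test")
    if (act_fifo.used() != 0)
      `uvm_error("FIFO_LIVE",
        $sformatf("%0d transactions left unconsumed at end of test", act_fifo.used()))
  endfunction
endclass
```

With a check like this in place, a test that "ends clean" without doing any comparison fails loudly instead of passing silently.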
Stuck at get(): empty FIFO hang triage
When consumer blocks at get() forever, investigate upstream path before touching queue internals.
Confirm producer thread is running and reaching put/write calls.
Confirm connect_phase links are present and use the correct type parameter #(T).
Confirm phase objections keep producer alive long enough to emit transactions.
Add heartbeat logs around producer enqueue and consumer dequeue sites.
Use short deterministic test to isolate first missing enqueue event.
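The upstream checks above can be sketched on the producer side as follows. This is an illustration under stated assumptions, not the author's testbench: `bus_monitor`, `bus_scoreboard`, `bus_env`, `publish()`, and `n_sent` are hypothetical names, and `bus_txn` is reduced to a stub carrying only the `id` field used in this article. The sketch logs a heartbeat at the enqueue site, makes the connection explicit, and reports an analysis port that ended up with no subscribers, which UVM otherwise accepts silently.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Minimal stand-in for bus_txn; only the id field is assumed.
class bus_txn extends uvm_sequence_item;
  `uvm_object_utils(bus_txn)
  int unsigned id;
  function new(string name = "bus_txn");
    super.new(name);
  endfunction
endclass

// Hypothetical producer with a heartbeat at the enqueue site.
class bus_monitor extends uvm_component;
  `uvm_component_utils(bus_monitor)
  uvm_analysis_port #(bus_txn) ap;
  int unsigned n_sent;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    ap = new("ap", this);
  endfunction

  // Call this wherever the monitor finishes assembling a transaction.
  function void publish(bus_txn t);
    n_sent++;
    `uvm_info("FIFO_DBG",
      $sformatf("producer write id=%0d n_sent=%0d", t.id, n_sent), UVM_HIGH)
    ap.write(t);
  endfunction
endclass

class bus_scoreboard extends uvm_component;
  `uvm_component_utils(bus_scoreboard)
  uvm_tlm_analysis_fifo #(bus_txn) act_fifo;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    act_fifo = new("act_fifo", this);
  endfunction
endclass

class bus_env extends uvm_env;
  `uvm_component_utils(bus_env)
  bus_monitor    mon;
  bus_scoreboard sb;

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    mon = bus_monitor::type_id::create("mon", this);
    sb  = bus_scoreboard::type_id::create("sb", this);
  endfunction

  function void connect_phase(uvm_phase phase);
    super.connect_phase(phase);
    // Port and FIFO must share the same #(bus_txn) parameter.
    mon.ap.connect(sb.act_fifo.analysis_export);
  endfunction

  function void end_of_elaboration_phase(uvm_phase phase);
    super.end_of_elaboration_phase(phase);
    // size() is only meaningful once bindings are resolved; 0 means
    // every write() from the monitor would go nowhere.
    if (mon.ap.size() == 0)
      `uvm_error("FIFO_DBG", "monitor analysis port has no subscribers")
  endfunction
endclass
```

If the heartbeat appears in the log while used() stays at 0, the end_of_elaboration check usually points at the missing or misdirected connect() call.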
task run_phase(uvm_phase phase);
  bus_txn t;
  forever begin
    `uvm_info("FIFO_DBG", "consumer waiting on get()", UVM_HIGH)
    act_fifo.get(t);
    `uvm_info("FIFO_DBG",
      $sformatf("consumer got id=%0d used=%0d", t.id, act_fifo.used()), UVM_HIGH)
    compare_against_model(t);
  end
endtask
[TLM] empty-queue hang flow
consumer get() blocked
│
├─ producer logs absent?
│ -> producer never ran or exited early
│
├─ producer logs present but fifo used stays 0?
│ -> wiring/connect/type issue
│
└─ producer logs and used rises briefly then stops?
   -> producer path terminated or upstream source dried up
Runaway growth: producer outruns consumer
A monotonically increasing used() means the enqueue rate exceeds the dequeue rate over time, or the consumer has stopped entirely.
task fifo_health_monitor();
  int unsigned max_used;  // high-water mark since the monitor started
  int unsigned cur_used;
  forever begin
    #1000ns;  // sampling interval
    cur_used = act_fifo.used();  // sample once so every report below agrees
    if (cur_used > max_used)
      max_used = cur_used;
    `uvm_info("FIFO_HEALTH",
      $sformatf("used=%0d max_used=%0d", cur_used, max_used),
      UVM_LOW)
    if (cur_used > 5000)
      `uvm_warning("FIFO_HEALTH", "occupancy exceeded threshold 5000")
  end
endtask
[TLM] growth diagnosis checklist
if used() rising:
  1) is consumer thread alive?
  2) is consumer blocked on external dependency?
  3) did consumer hit fatal/error and exit silently?
  4) is compare path unexpectedly O(N) per item?
  5) is test workload beyond originally sized depth assumptions?
Typical root causes
Consumer waiting on a second stream that stopped arriving.
Expensive compare path added recently without a throughput review.
Missing forked consumer thread due to join/disable fork misuse.
Unbounded uvm_tlm_analysis_fifo hiding a sustained rate mismatch until memory pressure appears.
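The disable fork root cause is easy to introduce and hard to see, because a killed consumer reports nothing. The sketch below shows one common way it happens and the usual isolation idiom. It assumes a scoreboard that forks its consumer and then races an event against a timeout; `guarded_scoreboard`, `consume_loop()`, `cfg_done`, and `cfg_timeout` are hypothetical names chosen for illustration.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

class guarded_scoreboard #(type T = uvm_sequence_item) extends uvm_component;
  `uvm_component_param_utils(guarded_scoreboard #(T))

  uvm_tlm_analysis_fifo #(T) act_fifo;
  event cfg_done;             // hypothetical "setup finished" event
  time  cfg_timeout = 100us;  // hypothetical timeout

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void build_phase(uvm_phase phase);
    super.build_phase(phase);
    act_fifo = new("act_fifo", this);
  endfunction

  task consume_loop();
    T t;
    forever begin
      act_fifo.get(t);
      // the real compare would go here
    end
  endtask

  task run_phase(uvm_phase phase);
    // Long-lived consumer; it is a child of the run_phase process.
    fork
      consume_loop();
    join_none

    // Race an event against a timeout. The outer fork...join gives the
    // race its own parent process, so disable fork terminates only the
    // loser of the race. Without this wrapper, disable fork would also
    // terminate consume_loop(), and used() would grow with no error.
    fork
      begin
        fork
          @(cfg_done);
          #(cfg_timeout);
        join_any
        disable fork;
      end
    join
  endtask
endclass
```

When reviewing a scoreboard with runaway growth, searching for a bare `disable fork` in any task that also spawns the consumer is a cheap first check.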
Instrumentation patterns that pay off
[TLM] recommended instrumentation set
At enqueue:
- txn id
- used() after enqueue
- producer timestamp
At dequeue:
- txn id
- used() after dequeue
- dequeue timestamp
Derived:
- residence time (dequeue - enqueue)
- max occupancy per test
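- no-dequeue watchdog timer
class fifo_dbg_ext extends uvm_object;
  `uvm_object_utils(fifo_dbg_ext)
  time enq_ts;          // simulation time at enqueue
  time deq_ts;          // simulation time at dequeue
  int unsigned seq_id;  // caller-supplied sequence number
  function new(string name = "fifo_dbg_ext");
    super.new(name);
  endfunction
endclass
// Assumes bus_txn defines its own get_extension(output fifo_dbg_ext ext),
// returning 0 when no extension is attached, and a matching set_extension().
// uvm_sequence_item does not provide these methods.
function void stamp_enqueue(bus_txn t, int unsigned seq_id);
  fifo_dbg_ext ext;
  if (!t.get_extension(ext)) begin
    ext = fifo_dbg_ext::type_id::create("ext");
    t.set_extension(ext);
  end
  ext.enq_ts = $time;
  ext.seq_id = seq_id;
endfunction
[TLM] debug observability outcome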
Without stamps:
  "scoreboard mismatch at 2.3ms" (hard to correlate)
With stamps:
  id=184 enq=1.2ms deq=2.3ms residence=1.1ms
  -> quickly reveals deep queue delay before compare
Watchdog design tips
Trigger a warning if no dequeue occurs for a configurable interval while the producer is active.
Separate the warning threshold from the fatal threshold.
Print the last-seen enqueue and dequeue IDs in the watchdog report.
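One possible shape for such a watchdog is sketched below. It is a standalone component fed by two hooks that the enqueue and dequeue sites would call; the names `fifo_watchdog`, `note_enqueue()`, `note_dequeue()`, the `FIFO_WDOG` message ID, and all default thresholds are assumptions for illustration. "Producer active" is approximated here as "work is pending" (more enqueues than dequeues), which also catches a consumer that stalls after the producer has finished.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

class fifo_watchdog extends uvm_component;
  `uvm_component_utils(fifo_watchdog)

  // Updated from the enqueue and dequeue sites via the note_* hooks.
  int unsigned n_enq, n_deq;
  int unsigned last_enq_id, last_deq_id;

  // Tunables; assumes the timescale is fine enough to represent 10us.
  time         check_interval = 10us;
  int unsigned warn_after     = 5;   // stalled intervals before a warning
  int unsigned fatal_after    = 50;  // stalled intervals before a fatal

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  function void note_enqueue(int unsigned id);
    n_enq++;
    last_enq_id = id;
  endfunction

  function void note_dequeue(int unsigned id);
    n_deq++;
    last_deq_id = id;
  endfunction

  task run_phase(uvm_phase phase);
    int unsigned prev_deq;
    int unsigned stalled;
    string msg;
    forever begin
      #(check_interval);
      // A stalled interval has work pending but no dequeue since the
      // previous sample; anything else resets the counter.
      if (n_deq != prev_deq || n_enq == n_deq)
        stalled = 0;
      else
        stalled++;
      prev_deq = n_deq;
      if (stalled < warn_after)
        continue;
      msg = $sformatf(
        "no dequeue for %0d intervals: pending=%0d last_enq_id=%0d last_deq_id=%0d",
        stalled, n_enq - n_deq, last_enq_id, last_deq_id);
      if (stalled >= fatal_after) begin
        `uvm_fatal("FIFO_WDOG", msg)
      end
      else begin
        `uvm_warning("FIFO_WDOG", msg)
      end
    end
  endtask
endclass
```

The watchdog raises no objection, so it never keeps a finished test alive; it only shortens the time between a stall and a diagnosable message.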
End-to-end triage playbook
Reproduce with shortest deterministic test preserving symptom.
Enable enqueue/dequeue occupancy logs at moderate verbosity.
Classify symptom: empty-hang, growth, intermittent pressure, or silent inactivity.
Localize first break boundary (producer, connection, queue, consumer).
Fix liveness root cause, then keep health monitors to prevent regression.
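Step 2 of the playbook is easier when the transaction class cannot be modified to carry debug data. As an alternative to per-transaction stamps, the sketch below keeps enqueue times in a side table keyed by transaction ID and derives residence time and peak backlog at dequeue. It assumes IDs are unique while a transaction is in flight; `fifo_residence_tracker` and its methods are hypothetical names, not a UVM facility.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

class fifo_residence_tracker extends uvm_object;
  `uvm_object_utils(fifo_residence_tracker)

  protected time enq_ts[int unsigned];  // enqueue time per in-flight id
  time           max_residence;         // longest observed queue delay
  int unsigned   max_pending;           // peak number of in-flight ids

  function new(string name = "fifo_residence_tracker");
    super.new(name);
  endfunction

  // Call right after the producer writes the transaction.
  function void note_enqueue(int unsigned id);
    enq_ts[id] = $time;
    if (enq_ts.num() > max_pending)
      max_pending = enq_ts.num();
  endfunction

  // Call right after the consumer's get() returns.
  function void note_dequeue(int unsigned id);
    time residence;
    if (!enq_ts.exists(id)) begin
      `uvm_warning("FIFO_DBG",
        $sformatf("dequeue of id=%0d with no recorded enqueue", id))
      return;
    end
    residence = $time - enq_ts[id];
    if (residence > max_residence)
      max_residence = residence;
    `uvm_info("FIFO_DBG",
      $sformatf("id=%0d enq=%0t deq=%0t residence=%0t",
                id, enq_ts[id], $time, residence), UVM_HIGH)
    enq_ts.delete(id);
  endfunction
endclass
```

Printing `max_residence` and `max_pending` once at the end of the test gives a per-test baseline that makes later throughput regressions visible.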
[TLM] one-page FIFO triage map
No data consumed
├─ used() == 0 always -> producer/connect issue
└─ used() > 0 -> consumer issue
used() grows forever
├─ consumer dead -> restart/fix thread lifecycle
├─ consumer blocked -> resolve dependency wait
└─ consumer too slow -> optimize, resize, or split workload
intermittent full/drop
├─ realistic burst -> adjust depth/policy
└─ pathological burst -> debug source behavior
Key takeaways
FIFO debug starts with liveness classification, not random print spam.
Empty hangs point upstream; growth points downstream.
Occupancy, IDs, and timestamps make FIFO failures reproducible and explainable.
Keep lightweight health telemetry permanently in complex scoreboards.
Common pitfalls
Chasing DUT protocol logic before proving FIFO dataflow health.
Adding huge depth to hide growth instead of fixing stalled consumers.
Running with zero telemetry and relying only on final test timeout.