Part 10 · Advanced Topics · Intermediate

Profiling Methodology

Build stable baselines, gather simulator profiler evidence, and prioritize fixes by measurable impact.

Why methodology matters

Without a method, tuning becomes guesswork. A clean methodology fixes the test selection, simulator options, and host environment, so that a measured delta reflects the change under test rather than run-to-run variation.

diagram
[PERF] baseline checklist

  use fixed test + seed
  use fixed simulator build and compile flags
  pin host conditions (same machine, low background load) where possible
  capture wall time + sim time + memory
  repeat runs to estimate noise band
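
The last checklist item can be made concrete with a small helper. The following is a minimal sketch, not part of any simulator tooling: `noise_band` is a hypothetical name, and the wall times are made-up values standing in for repeated runs of one test and seed.

```python
import statistics

def noise_band(samples: list[float]) -> tuple[float, float]:
    """Return (mean, relative noise in percent) for repeated measurements.

    Relative noise is the sample standard deviation divided by the mean.
    """
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate noise")
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean is zero; relative noise is undefined")
    return mean, statistics.stdev(samples) / mean * 100.0

# Hypothetical wall times (seconds) from five runs of one test and seed.
wall_times = [100.0, 102.0, 98.0, 101.0, 99.0]
mean, noise_pct = noise_band(wall_times)
print(f"mean={mean:.1f}s noise=+/-{noise_pct:.2f}%")
```

A delta smaller than this band is hard to distinguish from ordinary run-to-run variation, so it is a reasonable lower bound for any pass/fail threshold used later.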

Minimal baseline script

bash
#!/usr/bin/env bash
set -euo pipefail

TEST="${1:-axi_long_random}"
SEED="${2:-847211}"
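LOG="out/logs/${TEST}_seed${SEED}.log"

# Create the log directory up front so the simulator can open the log file.
mkdir -p "$(dirname "$LOG")"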

simv +UVM_TESTNAME="$TEST" +ntb_random_seed="$SEED" \
     +UVM_VERBOSITY=UVM_LOW \
     -l "$LOG"

echo "[PERF] baseline done test=$TEST seed=$SEED log=$LOG"
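
The script above records a log but not the wall time or memory named in the checklist. One way to capture both is a thin wrapper around the simulator command. This is a sketch under assumptions: `run_measured` is a hypothetical helper, it relies on Python's Unix-only `resource` module, and a trivial Python command stands in for the real `simv` invocation. Simulation time would still have to be read from the simulator log, whose format is tool-specific.

```python
import resource
import subprocess
import sys
import time

def run_measured(cmd: list[str]) -> dict:
    """Run cmd and return its exit code, wall time, and peak child RSS."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, check=False)
    wall_s = time.perf_counter() - start
    # ru_maxrss is the largest resident set size among the children this
    # process has waited for; kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return {"rc": proc.returncode, "wall_s": wall_s, "peak_rss": peak_rss}

# Stand-in command; in practice this would be the simv command line.
result = run_measured([sys.executable, "-c", "print('sim stand-in')"])
print(result)
```

Because `RUSAGE_CHILDREN` accumulates over all children, running one simulation per wrapper process keeps the memory figure attributable to a single run.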

Collecting profiler evidence

Most simulators expose profiling options for call-path time and allocation behavior. Profiling adds overhead of its own, so use these options in dedicated profiling runs rather than in the timed baseline runs, then map expensive functions back to UVM components and callbacks.

bash
# Example patterns (actual flags are tool-specific)

# Create separate output directories per profile mode before running;
# how each run's report is directed there depends on the tool.
mkdir -p out/prof/time out/prof/mem

simv +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -simprofile time
simv +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -simprofile mem

diagram
[PERF] hotspot attribution model

  profiler frame -> file/function -> component path -> design intent

  Example:
    uvm_report_server::process_report_message
      -> env.scb.write_act
      -> repeated sprint() formatting on every call
      => logging hotspot in scoreboard
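
The attribution chain can be sketched as a walk from the hot frame outward to the first frame that belongs to testbench code. This is illustrative only: `attribute` is a hypothetical helper, the `uvm_` prefix test is a simplifying assumption, and real profiler reports use tool-specific formats that would need parsing first.

```python
def attribute(stack: list[str], library_prefixes: tuple[str, ...] = ("uvm_",)) -> str:
    """Return the first non-library frame, walking from the hot frame outward.

    stack[0] is the frame the profiler reported as expensive; later entries
    are its callers.
    """
    for frame in stack:
        if not frame.startswith(library_prefixes):
            return frame
    return "unattributed"

# Hypothetical call stack shaped like the example above.
stack = [
    "uvm_report_server::process_report_message",
    "env.scb.write_act",
]
print(attribute(stack))
```

Attributing the cost to the calling component rather than to the library function points at code the team can actually change.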

Actionable hotspot rules

  • Prioritize top contributors that are frequent and controllable.

  • Ignore tiny contributors until top hotspots are addressed.

  • Tie every fix candidate to one measurable bottleneck.
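
These rules can be applied mechanically to a profile summary. The sketch below assumes the profiler output has already been reduced to rows of name, time share, and an ownership flag; the `Hotspot` type, the 5% cutoff, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Hotspot:
    name: str
    pct_time: float      # share of total run time reported by the profiler
    controllable: bool   # True if the team owns the code and can change it

def shortlist(hotspots: list[Hotspot], min_pct: float = 5.0) -> list[Hotspot]:
    """Keep controllable hotspots at or above min_pct, largest first."""
    keep = [h for h in hotspots if h.controllable and h.pct_time >= min_pct]
    return sorted(keep, key=lambda h: h.pct_time, reverse=True)

# Hypothetical profile rows, not taken from a real run.
rows = [
    Hotspot("env.cov.sample", 12.0, True),
    Hotspot("simulator kernel scheduling", 22.0, False),
    Hotspot("env.scb.write_act logging", 31.0, True),
    Hotspot("env.agent.mon.debug_print", 1.5, True),
]
for h in shortlist(rows):
    print(f"{h.pct_time:5.1f}%  {h.name}")
```

Each shortlisted row then becomes one fix candidate with one metric to re-measure.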


A/B measurement loop

For each optimization, run A/B comparisons with fixed seed and fixed configuration. Include at least one representative long test and one stress test.

bash
# A build (baseline)
simv_A +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -l out/logs/A.log

# B build (candidate optimization)
simv_B +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -l out/logs/B.log

# Compare key metrics
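python3 tools/perf_compare.py out/logs/A.log out/logs/B.log

python
def pct_delta(old: float, new: float) -> float:
    """Percentage change from old to new; negative means new is smaller."""
    if old == 0:
        # A zero baseline makes the ratio undefined; fail instead of reporting 0%.
        raise ValueError("baseline metric is zero; percentage delta is undefined")
    return ((new - old) / old) * 100.0

def verdict(delta_pct: float, threshold_pct: float = 5.0) -> str:
    """Classify a delta for a lower-is-better metric such as wall time.

    threshold_pct should be at least the noise band measured from repeated runs.
    """
    if delta_pct < -threshold_pct:
        return "improved"
    if delta_pct > threshold_pct:
        return "regressed"
    return "neutral"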
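
A single A run and a single B run can differ by noise alone, so the 5% threshold above is only meaningful if it exceeds the measured noise band. A possible variant, sketched here with the hypothetical helper `compare_runs` and made-up timings, compares medians over repeated runs and takes the noise band as an explicit input.

```python
import statistics

def compare_runs(a: list[float], b: list[float], noise_pct: float) -> tuple[float, str]:
    """Compare median wall times of baseline (a) and candidate (b).

    A change counts only if it exceeds the measured noise band.
    """
    med_a, med_b = statistics.median(a), statistics.median(b)
    if med_a == 0:
        raise ValueError("baseline median is zero")
    delta_pct = (med_b - med_a) / med_a * 100.0
    if delta_pct < -noise_pct:
        return delta_pct, "improved"
    if delta_pct > noise_pct:
        return delta_pct, "regressed"
    return delta_pct, "neutral"

# Hypothetical wall times in seconds, three runs per build.
a_runs = [100.0, 102.0, 98.0]
b_runs = [90.0, 91.0, 89.0]
delta, label = compare_runs(a_runs, b_runs, noise_pct=2.0)
print(f"{delta:+.1f}% {label}")  # prints "-10.0% improved"
```

The median is used instead of the mean so that one slow outlier run, for example from a busy host, does not dominate the comparison.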

Key takeaways

  • Use fixed baselines and repeated runs to reduce noise.

  • Capture profiler evidence before proposing optimizations.

  • Map hotspots to concrete UVM code paths and ownership.

  • Accept optimizations only after A/B improvement is verified.

Common pitfalls

  • Comparing runs with different seeds and drawing strong conclusions.

  • Using only wall-clock time without simulator/profile context.

  • Keeping unverified performance changes because they feel cleaner.