
Profiling Methodology

Build stable baselines, gather simulator profiler evidence, and prioritize fixes by measurable impact.

Why methodology matters

Without a method, tuning becomes guesswork. A clean methodology controls test selection, simulator options, and host environment so deltas are meaningful.

diagram

[PERF] baseline checklist

  use fixed test + seed
  use fixed simulator build and compile flags
  pin host variation where possible
  capture wall time + sim time + memory
  repeat runs to estimate noise band
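
The last checklist item can be made concrete with a few lines of Python. This is a minimal sketch, not a prescribed method: it assumes you have already collected one wall-clock figure per repeated run, and it defines the noise band as half the min-to-max spread relative to the median. The function name `noise_band_pct` and the sample values are hypothetical.

```python
import statistics


def noise_band_pct(samples):
    """Return (median, half the min-max spread as a percent of the median)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate noise")
    med = statistics.median(samples)
    if med <= 0:
        raise ValueError("median must be positive")
    half_spread = (max(samples) - min(samples)) / 2.0
    return med, half_spread / med * 100.0


# Hypothetical wall-clock times (seconds) from five runs of the same test and seed.
wall_s = [100.0, 102.0, 98.0, 101.0, 99.0]
median_s, band_pct = noise_band_pct(wall_s)
print(f"median={median_s:.1f}s noise=+/-{band_pct:.1f}%")
```

Any later delta that falls inside this band is indistinguishable from run-to-run variation on that host.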

Minimal baseline script

bash

#!/usr/bin/env bash
set -euo pipefail

TEST="${1:-axi_long_random}"
SEED="${2:-847211}"
LOG="out/logs/${TEST}_seed${SEED}.log"
mkdir -p "$(dirname "$LOG")"  # the log directory must exist before the run

simv +UVM_TESTNAME="$TEST" +ntb_random_seed="$SEED" \
     +UVM_VERBOSITY=UVM_LOW \
     -l "$LOG"

echo "[PERF] baseline done test=$TEST seed=$SEED log=$LOG"
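
The script above records a log but not the wall time or memory that the checklist asks for. One possible way to capture both is a small Python wrapper around the simulator command. This is a sketch under stated assumptions: it relies on the Unix-only `resource` module, the helper name `run_measured` is hypothetical, and the command shown is a stand-in rather than a real simulator invocation. Simulation time is not measured here; it would have to come from the simulator log.

```python
import resource
import subprocess
import sys
import time


def run_measured(cmd):
    """Run cmd and return (exit_code, wall_seconds, peak_rss_of_children)."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, check=False)
    wall_s = time.perf_counter() - start
    # ru_maxrss covers the largest child waited for so far.
    # Units differ by platform: kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, wall_s, peak_rss


# Stand-in command; substitute the simv command line from the baseline script.
code, wall_s, peak = run_measured([sys.executable, "-c", "pass"])
print(f"[PERF] exit={code} wall_s={wall_s:.3f} peak_rss={peak}")
```

Because `ru_maxrss` is a running maximum across all children of the wrapper process, measure one simulator run per wrapper invocation.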

Collecting profiler evidence

Most simulators expose profiling options for time by call path and for memory allocation. Profiling adds overhead of its own, so use these options in dedicated profiling runs rather than in baseline timing runs, then map expensive functions back to UVM components and callbacks.

bash

# Keep separate output directories per profile mode; create them before the runs
# and point each run's report at its own directory using your tool's option
mkdir -p out/prof/time out/prof/mem

# Example patterns (actual flags are tool-specific)
simv +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -simprofile time
simv +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -simprofile mem

diagram

[PERF] hotspot attribution model

  profiler frame -> file/function -> component path -> design intent

  Example:
    uvm_report_server::process_report_message
      -> env.scb.write_act
      -> repeated sprint() formatting
      => logging hotspot in scoreboard
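
The attribution model can be sketched as a lookup from profiler function names to the component that owns the call. The code below is illustrative only: the profiler rows, the percentages, the class names `axi_scoreboard` and `axi_driver`, and the hand-maintained `owners` map are all hypothetical, and a real profiler report would need its own parsing step first.

```python
from collections import defaultdict


def attribute(frames, owners):
    """Sum profiler percentages per owning component path."""
    totals = defaultdict(float)
    for function, pct in frames:
        totals[owners.get(function, "unattributed")] += pct
    return dict(totals)


# Hypothetical profiler rows: (function name, percent of total time).
frames = [
    ("uvm_report_server::process_report_message", 22.0),
    ("axi_scoreboard::write_act", 9.0),
    ("axi_driver::drive_item", 6.0),
]

# Hypothetical map built by inspecting call paths in the profiler report.
owners = {
    "uvm_report_server::process_report_message": "env.scb",
    "axi_scoreboard::write_act": "env.scb",
}

print(attribute(frames, owners))
```

Anything that lands in the `unattributed` bucket still needs a call-path inspection before it can be assigned to an owner.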

Actionable hotspot rules

Prioritize top contributors that are frequent and controllable.
Ignore tiny contributors until top hotspots are addressed.
Tie every fix candidate to one measurable bottleneck.
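
These rules amount to a filter and a sort. The sketch below assumes each hotspot has already been given a share of total time and a yes/no judgment on whether the team can change it. The field names, the 5% cutoff, and the sample entries are hypothetical choices for illustration.

```python
def prioritize(hotspots, min_share_pct=5.0):
    """Keep controllable hotspots at or above the cutoff, largest share first."""
    keep = [
        h for h in hotspots
        if h["controllable"] and h["share_pct"] >= min_share_pct
    ]
    return sorted(keep, key=lambda h: h["share_pct"], reverse=True)


# Hypothetical hotspot list after attribution.
hotspots = [
    {"name": "scoreboard logging", "share_pct": 31.0, "controllable": True},
    {"name": "simulator kernel", "share_pct": 40.0, "controllable": False},
    {"name": "driver delay loop", "share_pct": 2.0, "controllable": True},
    {"name": "coverage sampling", "share_pct": 12.0, "controllable": True},
]

for h in prioritize(hotspots):
    print(f'{h["share_pct"]:5.1f}%  {h["name"]}')
```

Each entry that survives the filter is a fix candidate tied to one measurable bottleneck; the rest are deferred until the list is re-profiled.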

A/B measurement loop

For each optimization, run A/B comparisons with fixed seed and fixed configuration. Include at least one representative long test and one stress test.

bash

mkdir -p out/logs

# A build (baseline)
simv_A +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -l out/logs/A.log

# B build (candidate optimization)
simv_B +UVM_TESTNAME=axi_long_random +ntb_random_seed=847211 -l out/logs/B.log

# Compare key metrics
python3 tools/perf_compare.py out/logs/A.log out/logs/B.log

python

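def pct_delta(old: float, new: float) -> float:
    """Percent change from old to new; negative means the metric shrank."""
    if old == 0:
        # A zero baseline has no meaningful percent change. Fail loudly
        # instead of returning 0.0, which would hide a real difference.
        raise ValueError("baseline metric is zero; percent delta is undefined")
    return ((new - old) / old) * 100.0

def verdict(delta_pct: float, noise_pct: float = 5.0) -> str:
    """Classify a delta for a lower-is-better metric such as wall time."""
    # noise_pct should match the noise band measured from repeated runs.
    if delta_pct < -noise_pct:
        return "improved"
    if delta_pct > noise_pct:
        return "regressed"
    return "neutral"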
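
The contents of `tools/perf_compare.py` are not shown in this lesson. As one possible shape for it, the sketch below assumes each log contains a summary line that starts with `[PERF]` and carries numeric `key=value` fields. That line format, the field names `wall_s` and `peak_mb`, and the function names are assumptions made for illustration, not the output of any particular simulator.

```python
import re

# Matches numeric key=value pairs such as wall_s=200.0 or peak_mb=4000.
PERF_FIELD = re.compile(r"(\w+)=([0-9]+(?:\.[0-9]+)?)")


def parse_perf_line(text):
    """Return the numeric fields of the last line that starts with [PERF]."""
    metrics = {}
    for line in text.splitlines():
        if line.startswith("[PERF]"):
            metrics = {k: float(v) for k, v in PERF_FIELD.findall(line)}
    return metrics


def compare(a_text, b_text):
    """Percent change per shared metric; negative means B is smaller than A."""
    a, b = parse_perf_line(a_text), parse_perf_line(b_text)
    return {
        k: (b[k] - a[k]) / a[k] * 100.0
        for k in a
        if k in b and a[k] != 0
    }


# Hypothetical log contents for the A and B builds.
log_a = "[PERF] wall_s=200.0 peak_mb=4000\n"
log_b = "[PERF] wall_s=180.0 peak_mb=4100\n"

for metric, delta in compare(log_a, log_b).items():
    print(f"{metric}: {delta:+.1f}%")
```

Reporting every shared metric, rather than wall time alone, makes it visible when a speedup was bought with extra memory.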

Key takeaways

Use fixed baselines and repeated runs to reduce noise.
Capture profiler evidence before proposing optimizations.
Map hotspots to concrete UVM code paths and ownership.
Accept optimizations only after A/B improvement is verified.

Common pitfalls

Comparing runs with different seeds and drawing strong conclusions.
Using only wall-clock time without simulator/profile context.
Keeping unverified performance changes because they feel cleaner.
