
Clocking Blocks

Input/output skew, 1step sampling semantics, driving through the clocking block, eliminating TB races, and default clocking.

The race a clocking block kills

A testbench that samples a DUT output in an always @(posedge clk) block can race the DUT process driving that output: both processes trigger on the same edge, and the simulator may run either first. Whenever the DUT update is not deferred by a nonblocking assignment (a behavioral model using blocking assignments, a gate-level netlist, a clock that arrives a delta cycle late), sampling first reads the old value and sampling second reads the new one, and the result can flip between simulators, tool versions, and even unrelated code edits. The same hazard exists in the other direction when the testbench writes a DUT input on the very edge the DUT samples it. A clocking block fixes both directions of the race by construction. Inputs are sampled with an input skew: the default #1step means "the value in the Preponed region, immediately before the edge", which is the stable pre-edge value, untouched by anything the edge triggers. Outputs are driven with an output skew after the edge, landing like real stimulus arriving between clocks and never colliding with the DUT's same-edge sampling.
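A minimal sketch of the drive-side version of this race, assuming a hypothetical one-flop DUT. The names dut_flop and tb_racy are illustrative and not part of the original example; the point is only that both process orderings are legal.

```systemverilog
// Hypothetical DUT: a single flop, used only to illustrate the race.
module dut_flop (input logic clk, input logic d, output logic q);
  always_ff @(posedge clk) q <= d;
endmodule

module tb_racy;
  logic clk = 1'b0;
  logic d   = 1'b0;
  logic q;

  always #5ns clk = ~clk;

  dut_flop u_dut (.clk(clk), .d(d), .q(q));

  // RACY: blocking write to d on the same edge the DUT samples d.
  // If this process runs before the flop, q captures the new d;
  // if it runs after, q captures the old d. Both orders are legal.
  always @(posedge clk) d = ~d;

  initial begin
    repeat (4) @(posedge clk);
    #1ns;  // let the nonblocking update of q settle before printing
    $display("q = %b (value depends on process ordering)", q);
    $finish;
  end
endmodule
```

Routing d through a clocking block output with a nonzero skew would move the write away from the edge, so the flop always captures the value driven during the previous cycle.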

diagram
CLOCKING BLOCK SAMPLING & DRIVING (1step in, #2ns out)

                     posedge clk
                          │
   ───────────────────────┼──────────────────────► time
            ▲             │        ▲
            │             │        │
     input #1step         │     output #2ns
     sample HERE:         │     drive HERE:
     Preponed region,     │     2ns after edge,
     value just BEFORE    │     DUT flops see it
     the edge — DUT       │     cleanly at the NEXT
     updates cannot       │     posedge
     contaminate it       │
                          │
     cb.gnt  == pre-edge gnt        cb.req <= v   req changes at edge+2ns

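Why 1step and not #0 or #1ns? An explicit skew of #0 samples at the edge itself, in the Observed region, so the testbench sees values after same-edge updates rather than the values a flop clocked by that edge would have captured. A fixed time skew such as #1ns works but couples the TB to the clock period and to signal arrival times. 1step is conceptually one global time-precision unit before the edge; in scheduling terms the sample is taken in the Preponed region of the edge's time step, which by construction is the last stable value of the previous cycle. That makes it period-independent and race-free.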
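The three skew choices can be written side by side. This is a sketch only: skew_demo_if and the cb_* block names are hypothetical, and the comments summarize where each skew samples under the standard scheduling model.

```systemverilog
// Hypothetical interface declaring one signal under three input skews.
interface skew_demo_if (input logic clk);
  logic gnt;

  // #1step: sampled in the Preponed region, i.e. the pre-edge value.
  clocking cb_1step @(posedge clk);
    input #1step gnt;
  endclocking

  // Explicit #0: sampled at the edge, in the Observed region,
  // so same-edge updates to gnt are already visible.
  clocking cb_zero @(posedge clk);
    input #0 gnt;
  endclocking

  // Fixed skew: sampled 1ns before the edge. Works only while
  // gnt is guaranteed stable at that point of the clock period.
  clocking cb_fixed @(posedge clk);
    input #1ns gnt;
  endclocking
endinterface
```

In practice only the #1step form is normally used for testbench sampling; the other two are shown to make the contrast concrete.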


Declaring and using a clocking block

systemverilog
interface bus_if (input logic clk);
  logic        req, gnt;
  logic [31:0] addr;
  logic [63:0] rdata;

  clocking cb @(posedge clk);
    default input #1step output #2ns;
    output req, addr;        // TB drives these (through cb)
    input  gnt, rdata;       // TB samples these (through cb)
  endclocking

  modport tb (clocking cb, input clk);   // export the cb via modport
endinterface

module tb_driver (bus_if.tb bus);
  task automatic one_read(input logic [31:0] a,
                          output logic [63:0] d);
    @(bus.cb);                       // wait for the clocking event
    bus.cb.req  <= 1'b1;             // clocking drive: edge + 2ns
    bus.cb.addr <= a;
    @(bus.cb iff bus.cb.gnt);        // race-free sampled gnt
    d = bus.cb.rdata;                // pre-edge value of rdata
    bus.cb.req  <= 1'b0;
  endtask
endmodule

Note the idioms: @(bus.cb) waits for the clocking event itself (cleaner than @(posedge clk)); reads of cb.gnt always return the sampled pre-edge value, even mid-cycle; and drives must use nonblocking assignments to clocking outputs. Driving the raw signal bus.req directly while a clocking block also drives it reintroduces the race and creates drive conflicts — pick one path per signal and make it the clocking block.
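To see the driver in context, here is one possible harness. It assumes bus_if and tb_driver exactly as declared above; toy_slave and top are hypothetical names, and the slave's behavior (grant one cycle after req, return a pattern derived from addr) is invented purely for illustration.

```systemverilog
// Hypothetical slave: registers req into gnt and returns {addr, ~addr}.
module toy_slave (input  logic        clk,
                  input  logic        req,
                  input  logic [31:0] addr,
                  output logic        gnt,
                  output logic [63:0] rdata);
  always_ff @(posedge clk) begin
    gnt   <= req;
    rdata <= {addr, ~addr};
  end
endmodule

module top;
  logic clk = 1'b0;
  always #5ns clk = ~clk;

  bus_if bus (.clk(clk));

  // The DUT side connects to the raw interface signals.
  toy_slave u_slave (.clk(clk), .req(bus.req), .addr(bus.addr),
                     .gnt(bus.gnt), .rdata(bus.rdata));

  // The TB side sees only the clocking block, via the tb modport.
  tb_driver u_drv (.bus(bus.tb));

  initial begin
    logic [63:0] d;
    u_drv.one_read(32'h0000_1000, d);  // all drives and samples go through cb
    $display("rdata = %h", d);
    $finish;
  end
endmodule
```

Note that nothing in top touches bus.req or bus.gnt procedurally: the slave owns gnt and rdata, and the clocking block is the only driver of req and addr.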


Default clocking and cycle delays

Declaring a default clocking in a scope (default clocking cb; when a clocking block named cb is visible there) makes the cycle-delay operator ##N mean "N clocking events of cb", so stimulus reads as cycles rather than raw time. One clocking block per clock domain per interface is the norm; a multi-clock interface declares one block per domain. For interviews, be ready to place the regions: #1step inputs sample in Preponed; the clocking block's own event, the one @(cb) waits on, triggers in Observed after the sampled values have been updated; zero-skew outputs are scheduled in Re-NBA; and skewed outputs are scheduled into a future time step. Together these are the precise reason TB-vs-DUT ordering at the same edge stops mattering.

systemverilog
module tb_seq (bus_if.tb bus);
  default clocking @(bus.cb);   // unnamed default clocking keyed to cb events,
  endclocking                   // so ##N now counts cb events

  initial begin
    ##2;                         // wait 2 clock cycles
    bus.cb.req <= 1'b1;
    ##1 bus.cb.req <= 1'b0;      // one-cycle pulse, race-free
    ##5;
    if (bus.cb.gnt !== 1'b0)
      $error("gnt should have dropped");
  end
endmodule
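The one-block-per-domain rule mentioned above can be sketched as follows. All names here (bridge_if, cb_a, cb_b, tb_bridge and the signals) are hypothetical, and the sketch assumes some DUT, not shown, that eventually asserts b_valid; without one, the final wait never completes.

```systemverilog
// Hypothetical two-domain interface: one clocking block per clock.
interface bridge_if (input logic clk_a, input logic clk_b);
  logic       a_valid;
  logic [7:0] a_data;
  logic       b_valid;
  logic [7:0] b_data;

  clocking cb_a @(posedge clk_a);
    default input #1step output #2ns;
    output a_valid, a_data;      // TB drives the clk_a side
  endclocking

  clocking cb_b @(posedge clk_b);
    default input #1step output #2ns;
    input b_valid, b_data;       // TB samples the clk_b side
  endclocking

  modport tb (clocking cb_a, clocking cb_b);
endinterface

module tb_bridge (bridge_if.tb br);
  initial begin
    @(br.cb_a);
    br.cb_a.a_valid <= 1'b1;     // driven relative to clk_a
    br.cb_a.a_data  <= 8'hA5;
    @(br.cb_a);
    br.cb_a.a_valid <= 1'b0;

    // Each signal is sampled only through its own domain's block.
    @(br.cb_b iff br.cb_b.b_valid);
    $display("b_data = %h", br.cb_b.b_data);
  end
endmodule
```

Explicit @(br.cb_a) and @(br.cb_b) waits are used here instead of ##N because a scope has only one default clocking, so ##N could count cycles of only one of the two domains.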

Interview angle

  • "What does input #1step mean?" — sample in the Preponed region, i.e. the stable value immediately before the clock edge.

  • "How do clocking blocks remove races?" — pre-edge sampling plus post-edge skewed driving means TB and DUT never touch a signal in the same region.

  • "What goes wrong driving the raw signal and the cb signal?" — two drive paths: conflicts and the original race come back.

Key takeaways

  • Input skew #1step samples the pre-edge (Preponed) value — immune to same-edge DUT updates.

  • Output skew drives after the edge, so the DUT samples TB stimulus cleanly at the next edge.

  • Use @(cb) to advance time, cb.sig to sample, cb.sig <= v to drive — one path per signal.

  • default clocking turns ##N into cycle delays, making sequences read in cycles, not nanoseconds.

Common pitfalls

  • Mixing raw-signal drives with clocking-block drives on the same signal — conflicts and resurrected races.

  • Using blocking = on a clocking output — illegal or tool-dependent; clocking drives are nonblocking.

  • Reading bus.gnt (raw) when you meant bus.cb.gnt (sampled) — reintroduces the sampling race silently.

  • Assuming ##N works without a default clocking declaration in scope — it does not.
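The pitfalls above, shown next to their fixes. This is a sketch that assumes the bus_if interface from earlier; tb_pitfalls is a hypothetical module that takes the full interface (no modport) only so that the raw signals are reachable for the commented-out counterexamples.

```systemverilog
module tb_pitfalls (bus_if bus);
  initial begin
    @(bus.cb);
    // bus.req = 1'b1;        // WRONG: raw drive, a second path next to cb
    // bus.cb.req = 1'b1;     // WRONG: blocking assignment to a clocking output
    bus.cb.req <= 1'b1;       // OK: single path, nonblocking clocking drive

    @(bus.cb);
    // if (bus.gnt) ...       // WRONG: raw read can race the DUT's update
    if (bus.cb.gnt)           // OK: sampled, pre-edge value
      $display("granted");

    // ##1;                   // WRONG here: no default clocking in this scope
    @(bus.cb);                // OK: explicit wait on the clocking event
    bus.cb.req <= 1'b0;
  end
endmodule
```

Passing bus_if.tb instead of the bare interface, as tb_driver does, is a simple way to make the first and third mistakes impossible, since the modport exposes no raw req or gnt.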