pccx-core Module Reference

crates/core/ (pccx-core) is the single sink of the workspace dependency graph. Every other crate — UI, reports, verification, lsp — imports its public types from here. This page documents the role and top-level public items of each module declared with pub mod in lib.rs.

Implementation files live at pccx-lab/crates/core/src/<module>.rs.

live_window

Live-telemetry ring buffer. Modelled on the perf_event_open(2) mmap head/tail ring and the Perfetto SHM producer/consumer API, it bins real NpuTrace events into cycle windows. No synthetic fallbacks: an empty trace yields an empty snapshot (Yuan OSDI 2014 loud-fallback contract). Each LiveSample carries three utilisation ratios — mac_util, dma_bw, stall_pct — and a monotonic ts_ns derived from start_cycle at the pccx v002 reference clock (200 MHz, 5 ns/cycle). The Tauri fetch_live_window IPC command consumes this module to feed the frontend BottomPanel, PerfChart, and Roofline views.
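
The windowing and timestamp arithmetic described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Event` and `Sample` types, the `bin_events` function, and the choice to track only MAC utilisation are assumptions made for the example. Only the 5 ns/cycle clock and the empty-trace behaviour come from the text.

```rust
/// Reference clock from the text: 200 MHz, i.e. 5 ns per cycle.
const NS_PER_CYCLE: u64 = 5;

/// Hypothetical, simplified event: only the fields needed for binning.
#[derive(Clone, Copy)]
pub struct Event {
    pub start_cycle: u64,
    pub duration: u64,
    pub is_mac: bool,
}

/// Hypothetical sample: one utilisation ratio plus the window timestamp.
#[derive(Debug, PartialEq)]
pub struct Sample {
    pub ts_ns: u64,
    pub mac_util: f64,
}

/// Bin events into fixed-size cycle windows.
/// An empty trace yields an empty Vec (no synthetic samples).
pub fn bin_events(events: &[Event], window_cycles: u64) -> Vec<Sample> {
    assert!(window_cycles > 0, "window size must be non-zero");
    let end = match events.iter().map(|e| e.start_cycle + e.duration).max() {
        Some(end) => end,
        None => return Vec::new(),
    };
    let n_windows = ((end + window_cycles - 1) / window_cycles).max(1);
    let mut busy = vec![0u64; n_windows as usize];
    for e in events.iter().filter(|e| e.is_mac) {
        // Split the event's span across every window it touches.
        let stop = e.start_cycle + e.duration;
        let mut c = e.start_cycle;
        while c < stop {
            let w = c / window_cycles;
            let chunk = stop.min((w + 1) * window_cycles) - c;
            busy[w as usize] += chunk;
            c += chunk;
        }
    }
    busy.iter()
        .enumerate()
        .map(|(i, &b)| Sample {
            // Window start cycle converted to nanoseconds.
            ts_ns: i as u64 * window_cycles * NS_PER_CYCLE,
            mac_util: (b as f64 / window_cycles as f64).min(1.0),
        })
        .collect()
}
```

With a 100-cycle window, the second sample of any trace starts at cycle 100 and therefore carries `ts_ns = 500`.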

Public items: LiveSample, LiveWindow

mmap_reader

Zero-copy .pccx reader for production-scale (100 MB+) traces. Opens the file with memmap2, parses only the fixed-size header at construction time, and leaves the flat-buffer payload mapped but untouched until a viewport or tile query arrives. This bypasses the multi-second heap allocation that PccxFile::read incurs on large traces. Only the flatbuf encoding is supported (24-byte fixed-stride events); a bincode payload (variable-length) is rejected with an error because binary search cannot be applied to it. Events in the flat buffer must be sorted by start_cycle ascending for viewport binary search to produce correct results — write files intended for MmapTrace using NpuTrace::to_flat_buffer_sorted.
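
To illustrate why a fixed stride and sorted start_cycle values matter, here is a sketch of a viewport lookup over a raw byte slice. It assumes, purely for illustration, that start_cycle is a little-endian u64 at byte offset 0 of each 24-byte record; the real field layout is defined by the format specification. The function names are hypothetical, and a plain `&[u8]` stands in for the mapped region.

```rust
/// Stride from the text: every flatbuf event occupies 24 bytes.
const EVENT_STRIDE: usize = 24;

/// ASSUMPTION: start_cycle is a little-endian u64 at offset 0 of each record.
fn start_cycle_at(payload: &[u8], index: usize) -> u64 {
    let off = index * EVENT_STRIDE;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&payload[off..off + 8]);
    u64::from_le_bytes(raw)
}

/// Index range [lo, hi) of events whose start_cycle lies in [from, to).
/// Correct only if records are sorted by start_cycle ascending.
pub fn viewport(payload: &[u8], from: u64, to: u64) -> (usize, usize) {
    let n = payload.len() / EVENT_STRIDE;
    // Classic lower bound: first index whose start_cycle >= target.
    let lower_bound = |target: u64| {
        let (mut lo, mut hi) = (0usize, n);
        while lo < hi {
            let mid = lo + (hi - lo) / 2;
            if start_cycle_at(payload, mid) < target {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        lo
    };
    (lower_bound(from), lower_bound(to))
}

/// Test helper: build a payload from start cycles, padding the other fields.
pub fn make_payload(starts: &[u64]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(starts.len() * EVENT_STRIDE);
    for &s in starts {
        buf.extend_from_slice(&s.to_le_bytes());
        buf.extend_from_slice(&[0u8; EVENT_STRIDE - 8]);
    }
    buf
}
```

Each probe touches only eight bytes of one record, so the lookup reads O(log n) pages of the mapping. Note that this sketch selects events that start inside the viewport; an event that starts earlier and extends into it would need an additional backward scan.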

Public items: MmapTrace

step_snapshot

Single-cycle register and MAC-array state snapshot. Given a cached NpuTrace and a target cycle, it reduces the flat event stream to a RegisterSnapshot: per-core active event class and remaining span cycles, plus aggregate MAC/DMA/stall/barrier counts across the whole NPU at that exact cycle. Cycles outside [0, trace.total_cycles] return a deterministic empty snapshot instead of an error, so the UI renders “idle” without error handling. When two events on the same core overlap, the one with the later start_cycle wins (latest-dispatch rule). Zero-duration events fire only on their exact start_cycle and never project into the next cycle (IEEE 1364-2005 §18 VCD convention).
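
The latest-dispatch rule and the zero-duration rule can be sketched for a single core as below. The `Class` and `Event` types and the `core_state_at` function are hypothetical stand-ins for the example, and the half-open interval [start_cycle, start_cycle + duration) for non-zero events is an assumption about how spans are counted.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Class {
    Mac,
    Dma,
    Stall,
    Barrier,
}

/// Hypothetical, simplified trace event.
#[derive(Clone, Copy)]
pub struct Event {
    pub core_id: u32,
    pub start_cycle: u64,
    pub duration: u64,
    pub class: Class,
}

/// Active class on `core_id` at `cycle`, plus remaining span cycles.
/// Returns None when the core is idle at that cycle.
pub fn core_state_at(events: &[Event], core_id: u32, cycle: u64) -> Option<(Class, u64)> {
    events
        .iter()
        .filter(|e| e.core_id == core_id)
        .filter(|e| {
            if e.duration == 0 {
                // Zero-duration events fire on their exact start cycle only.
                e.start_cycle == cycle
            } else {
                e.start_cycle <= cycle && cycle < e.start_cycle + e.duration
            }
        })
        // Latest-dispatch rule: among overlapping events, the later start wins.
        .max_by_key(|e| e.start_cycle)
        .map(|e| (e.class, (e.start_cycle + e.duration).saturating_sub(cycle)))
}
```

A full snapshot would repeat this per core and count the resulting classes to obtain the aggregate figures.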

Public items: step_to_cycle, CoreState, RegisterSnapshot

api_ring

API-integrity ring buffer that records every uca_* driver entry/exit boundary. Following the CUPTI driver-trace pattern, it flushes the aggregate p99 latency and drop count to a fixed-schema row vector that the UI’s API-Integrity panel renders. The ring is populated exclusively from API_CALL events in the .pccx event stream — no synthetic fallback. A trace that carries no API_CALL events returns an empty Vec and logs a warning. The list_api_calls function provides the consumer-facing surface.
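
As a rough sketch of the p99 aggregation, the function below computes a 99th-percentile latency in nanoseconds from per-call latencies in cycles. The nearest-rank percentile method and the `p99_latency_ns` name are assumptions; the text does not say which percentile definition the module uses. The constant mirrors the documented NS_PER_CYCLE value.

```rust
/// 5 ns per cycle at the 200 MHz reference clock.
pub const NS_PER_CYCLE: u64 = 5;

/// Nearest-rank p99 over call latencies given in cycles.
/// Returns None for an empty input instead of inventing a value.
pub fn p99_latency_ns(latencies_cycles: &[u64]) -> Option<u64> {
    if latencies_cycles.is_empty() {
        return None;
    }
    let mut sorted = latencies_cycles.to_vec();
    sorted.sort_unstable();
    // 1-based rank = ceil(0.99 * n), computed in integers.
    let rank = (99 * sorted.len() + 99) / 100;
    Some(sorted[rank - 1] * NS_PER_CYCLE)
}
```

For 100 calls with latencies of 1 to 100 cycles, the rank is 99, so the result is 99 cycles, or 495 ns.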

Public items: ApiCall, NS_PER_CYCLE (constant: 5 ns/cycle @ 200 MHz)

chrome_trace

Chromium Trace Event Format exporter. Serialises an NpuTrace to a JSON array of Complete Events (ph: "X") that opens directly in ui.perfetto.dev, chrome://tracing, and any Perfetto proto importer. Event categories map to "mac", "dma", "stall", and "sync"; timestamps are converted to integer microseconds using the pccx v002 reference clock (200 cycles = 1 µs). pid maps to the accelerator instance; tid maps to core_id.
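
The cycle-to-microsecond conversion and the shape of one Complete Event can be sketched as follows. The helper names are hypothetical, and truncating division is an assumption: the text says only that timestamps become integer microseconds, not how fractional values are rounded. The sketch also skips JSON string escaping, which a real exporter would need.

```rust
/// From the text: 200 cycles = 1 µs at the 200 MHz reference clock.
const CYCLES_PER_US: u64 = 200;

/// ASSUMPTION: fractional microseconds are truncated, not rounded.
pub fn cycles_to_us(cycles: u64) -> u64 {
    cycles / CYCLES_PER_US
}

/// Format one Trace Event Format "Complete Event" (ph = "X").
/// `name` and `cat` are inserted verbatim, so they must not need escaping.
pub fn complete_event(
    name: &str,
    cat: &str,
    start_cycle: u64,
    duration_cycles: u64,
    pid: u32,
    tid: u32,
) -> String {
    format!(
        "{{\"name\":\"{}\",\"cat\":\"{}\",\"ph\":\"X\",\"ts\":{},\"dur\":{},\"pid\":{},\"tid\":{}}}",
        name,
        cat,
        cycles_to_us(start_cycle),
        cycles_to_us(duration_cycles),
        pid,
        tid
    )
}
```

Under the truncation assumption, any event shorter than 200 cycles would be emitted with `dur` equal to 0.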

Public items: write_chrome_trace, write_chrome_trace_to

isa_replay

ISA-level replay diff engine. Consumes a Spike --log-commits style commit log and emits per-instruction (expected, actual) cycle pairs. Expected cycle counts are looked up from an NPU latency table keyed by mnemonic prefix; actual cycle counts are read from a ;cycles=<N> suffix on each log line. A line without the suffix is treated as actual == expected and reported as PASS. An actual count within ±10 % of expected is WARN; anything outside that band is FAIL. Unknown mnemonics default to an expected latency of 1 cycle.
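
The suffix parsing and verdict thresholds might look like the sketch below. The latency table contents, the function names, and the convention of passing the mnemonic separately from the log line are all made up for the example. It also assumes that only an exact match is PASS and that any non-zero deviation up to 10 % is WARN, which is one reading of the rules above.

```rust
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Pass,
    Warn,
    Fail,
}

/// HYPOTHETICAL latency table keyed by mnemonic prefix; values are invented.
fn expected_cycles(mnemonic: &str) -> u64 {
    const TABLE: &[(&str, u64)] = &[("gemm", 64), ("dma", 16)];
    TABLE
        .iter()
        .find(|&&(prefix, _)| mnemonic.starts_with(prefix))
        .map(|&(_, cycles)| cycles)
        .unwrap_or(1) // unknown mnemonics default to 1 cycle
}

/// Read the `;cycles=<N>` suffix, if present and well-formed.
fn actual_cycles(line: &str) -> Option<u64> {
    let marker = ";cycles=";
    let idx = line.rfind(marker)?;
    line[idx + marker.len()..].trim().parse().ok()
}

/// ASSUMPTION: exact match is Pass; deviation up to 10 % of expected is Warn.
pub fn classify(expected: u64, actual: u64) -> Verdict {
    let diff = if actual > expected { actual - expected } else { expected - actual };
    if diff == 0 {
        Verdict::Pass
    } else if diff * 10 <= expected {
        Verdict::Warn
    } else {
        Verdict::Fail
    }
}

/// Produce the (expected, actual, verdict) triple for one log line.
pub fn replay_line(mnemonic: &str, line: &str) -> (u64, u64, Verdict) {
    let expected = expected_cycles(mnemonic);
    // A missing suffix means actual == expected.
    let actual = actual_cycles(line).unwrap_or(expected);
    (expected, actual, classify(expected, actual))
}
```

The comparison `diff * 10 <= expected` keeps the 10 % check in integer arithmetic and avoids floating-point rounding at the boundary.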

Public items: IsaReplayEntry, IsaVerdict (PASS/WARN/FAIL), replay_log

cycle_estimator

Cycle estimation engine for pre-RTL design-space exploration. Given a TileOperation (tiled GEMM M/N/K/bytes_per_element) or AttentionOperation (MQA/GQA parameters), it queries the HardwareModel for MAC array dimensions, AXI bus configuration, and BRAM layout to compute arithmetic cycles, DMA transfer cycles, and stall penalties. Used for Roofline expected-value generation and DSE loops.
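
For intuition only, a first-order estimate for one GEMM tile could be written as below. This is not the crate's model: the `Tile` and `Hw` structs, the output-stationary MAC mapping, the single-pass traffic count, and taking the maximum of compute and DMA cycles are all assumptions, and stall penalties and BRAM layout are ignored.

```rust
/// Hypothetical tile description (M x K times K x N).
pub struct Tile {
    pub m: u64,
    pub n: u64,
    pub k: u64,
    pub bytes_per_element: u64,
}

/// Hypothetical hardware parameters.
pub struct Hw {
    pub mac_rows: u64,
    pub mac_cols: u64,
    pub axi_bytes_per_cycle: u64,
}

fn ceil_div(a: u64, b: u64) -> u64 {
    (a + b - 1) / b
}

/// ASSUMPTION: each array pass computes a rows x cols output block in K cycles.
pub fn arithmetic_cycles(t: &Tile, hw: &Hw) -> u64 {
    ceil_div(t.m, hw.mac_rows) * ceil_div(t.n, hw.mac_cols) * t.k
}

/// ASSUMPTION: A, B, and C each cross the bus exactly once.
pub fn dma_cycles(t: &Tile, hw: &Hw) -> u64 {
    let elements = t.m * t.k + t.k * t.n + t.m * t.n;
    ceil_div(elements * t.bytes_per_element, hw.axi_bytes_per_cycle)
}

/// Roofline-style bound: the slower of compute and transfer dominates.
pub fn estimate_cycles(t: &Tile, hw: &Hw) -> u64 {
    arithmetic_cycles(t, hw).max(dma_cycles(t, hw))
}
```

A 64 x 64 x 64 one-byte tile on a 32 x 32 array with a 16 B/cycle bus gives 256 compute cycles against 768 transfer cycles, so this toy model would classify it as memory-bound.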

Public items: CycleEstimator, TileOperation

vivado_timing

Parser for Vivado report_timing_summary -quiet -no_header text output. Binds the UG906 section headers — “Design Timing Summary”, “Clock Summary”, “Intra Clock Table”, “Timing Details” — into a structured TimingReport. The UI’s SynthStatusCard consumes this parser in place of the regex stub in synth_report.rs. Supported fields include WNS, TNS, failing endpoints, and per-clock-domain period for the KV260 ZU5EV fixture.

Public items: parse_timing_report, parse_worst_endpoint, TimingReport, ClockDomain, FailingPath, TimingParseError

coverage

UVM functional-coverage JSONL run-dump merger. Consumes per-run .jsonl files and aggregates bin hit counts and cross tuples following Accellera UCIS merge semantics (count-based bin summation across runs). The goal field is optional; when absent, the largest goal seen across all runs is carried forward (0 if none was ever supplied). The merged result is rendered by the UI’s VerificationSuite panel with groups below their closure threshold highlighted.
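
The merge rule for bins (sum the hits, carry the largest goal, default to 0) can be sketched without the JSONL parsing layer. The `Bin` struct, the `Run` alias, and the `merge` function are hypothetical; runs are given as in-memory tuples rather than parsed files, and cross tuples are omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical merged bin: accumulated hits and the carried-forward goal.
#[derive(Debug, Clone, PartialEq)]
pub struct Bin {
    pub hits: u64,
    pub goal: u64,
}

/// One run as (bin name, hits, optional goal) records.
pub type Run = Vec<(&'static str, u64, Option<u64>)>;

/// Merge runs: hits are summed, the largest goal seen wins, absent goals count as 0.
pub fn merge(runs: &[Run]) -> BTreeMap<String, Bin> {
    let mut merged: BTreeMap<String, Bin> = BTreeMap::new();
    for run in runs {
        for &(name, hits, goal) in run {
            let bin = merged
                .entry(name.to_string())
                .or_insert(Bin { hits: 0, goal: 0 });
            bin.hits += hits;
            bin.goal = bin.goal.max(goal.unwrap_or(0));
        }
    }
    merged
}
```

A `BTreeMap` keeps the merged bins in a stable, sorted order, which makes the output deterministic across runs.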

Public items: merge_coverage_jsonl, CovBin, CovGroup, CrossTuple, MergedCoverage, CoverageError

vcd / vcd_writer

IEEE 1364 VCD reader/writer pair.

vcd — Delegates lexing to the vcd crate (MIT) and repackages the output as a flat, serde-serialisable WaveformDump: Vec<SignalMeta> from $scope/$var headers and Vec<VcdChange> from timestamp and value-change lines. The UI consumes this via the parse_vcd_file Tauri command; it binary-searches per signal for O(log n) value-at-tick lookups.

vcd_writer — Generates a spec-legal VCD from an NpuTrace for use in GTKWave, Surfer, Verdi, or the built-in Waveform panel. Emits eight signals: clk, rst_n, mac_busy, dma_rd, dma_wr, stall, barrier, core_id. Timescale 1 ns; IEEE 1364-2005 §18 compliant.

Public items (vcd): parse_vcd_file, WaveformDump, SignalMeta, VcdChange, VcdError

Public items (vcd_writer): write_vcd, write_vcd_to
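
The per-signal O(log n) lookup mentioned for the vcd reader amounts to finding the last change at or before a tick. The sketch below shows that search over one signal's change list; the `Change` struct and `value_at` function are illustrative stand-ins, not the crate's VcdChange type or API.

```rust
/// Hypothetical value change for a single signal.
#[derive(Debug, Clone, PartialEq)]
pub struct Change {
    pub tick: u64,
    pub value: String,
}

/// Value of the signal at `tick`, given changes sorted by tick ascending.
/// Returns None if the signal has no change at or before `tick`.
pub fn value_at(changes: &[Change], tick: u64) -> Option<&str> {
    // Index of the first change strictly after `tick`.
    let idx = changes.partition_point(|c| c.tick <= tick);
    if idx == 0 {
        None
    } else {
        Some(changes[idx - 1].value.as_str())
    }
}
```

`slice::partition_point` performs the binary search, so the cost per query is logarithmic in the number of changes for that signal.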

pccx_format

Binary trace format codec for the .pccx container. File layout (little-endian): magic PCCX (4 bytes) → major/minor version (u8 each) → 2 reserved bytes → JSON header length (u64) → UTF-8 JSON header → binary payload. Current version: MAJOR_VERSION = 0x01, MINOR_VERSION = 0x01. A major-version mismatch returns UnsupportedMajorVersion. For the full format specification, see .pccx Binary Format.
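
The fixed 16-byte prefix described above can be decoded with a few slice operations. This sketch follows the stated layout; the `Prefix` struct, the `parse_prefix` function, and the `TooShort` and `BadMagic` variants are hypothetical names, and only `UnsupportedMajorVersion` is taken from the text.

```rust
pub const MAJOR_VERSION: u8 = 0x01;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedMajorVersion(u8),
}

/// Fixed-size part of the file that precedes the JSON header.
#[derive(Debug, PartialEq)]
pub struct Prefix {
    pub major: u8,
    pub minor: u8,
    pub json_len: u64,
}

/// Parse magic (4) + major (1) + minor (1) + reserved (2) + JSON length (8, LE).
pub fn parse_prefix(bytes: &[u8]) -> Result<Prefix, HeaderError> {
    if bytes.len() < 16 {
        return Err(HeaderError::TooShort);
    }
    if !bytes.starts_with(b"PCCX") {
        return Err(HeaderError::BadMagic);
    }
    let (major, minor) = (bytes[4], bytes[5]);
    if major != MAJOR_VERSION {
        return Err(HeaderError::UnsupportedMajorVersion(major));
    }
    // bytes[6..8] are reserved and ignored here.
    let mut len = [0u8; 8];
    len.copy_from_slice(&bytes[8..16]);
    Ok(Prefix { major, minor, json_len: u64::from_le_bytes(len) })
}
```

After a successful parse, the JSON header occupies the next `json_len` bytes starting at offset 16, and the binary payload follows it.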

Public items: PccxFile, PccxHeader, PccxError, ArchConfig, TraceConfig, PayloadConfig, fnv1a_64

Cite This Page

@misc{pccx_lab_core_modules_2026,
  title        = {pccx-core module reference: public modules of the pccx-lab Rust core crate},
  author       = {Kim, Hyunwoo},
  year         = {2026},
  howpublished = {\url{https://pccxai.github.io/pccx/en/docs/Lab/core-modules.html}},
  note         = {Part of pccx: \url{https://pccxai.github.io/pccx/}}
}