pccx Documentation
Welcome to the pccx (Parallel Compute Core eXecutor) documentation. pccx is a scalable NPU architecture for accelerating Transformer-based LLMs on edge devices. Select a section from the sidebar to begin.
Ecosystem
github.com/pccxai/pccx-FPGA-NPU-LLM-kv260
The active v002 SystemVerilog sources — ISA package, controller, compute cores (GEMM / GEMV / CVO), memory hierarchy. Target device is the Xilinx Kria KV260 (Zynq UltraScale+ ZU5EV).
Currently supported (focus): Gemma-3N E4B at W4A8KV4 (4-bit weights, 8-bit activations, 4-bit KV cache). Tokens-per-second figures are pending a KV260 board run (see Evidence). Everything else (v003, Gemma-4, Llama) lives on the Roadmap.
Every v002 RTL reference page on this site links back to the exact .sv file in that repository.
github.com/pccxai/pccx — the Sphinx project powering this site.
hkimw.github.io/hkimw — blog, other projects, about.
Tooling & Lab
Performance simulator and AI-integrated profiler, built for the pccx NPU. Pre-RTL bottleneck detection, UVM co-simulation, and LLM-driven testbench generation in one workflow.
Work in Progress
Source: github.com/pccxai/pccx-lab
Why pccx-lab is one repo, not five. Phase 1 split the monolith
into a 10-crate Cargo workspace (core, reports,
verification, authoring, evolve, remote, lsp,
uvm_bridge, ai_copilot, ui/src-tauri).
pccx is formally specified in Sail —
the same ISA-semantics language used for RISC-V, Arm,
CHERI, and Morello. The 64-bit / 4-bit-opcode v002 ISA
lives under formal/sail/ in the RTL repo; each SystemVerilog
typedef has a 1:1 Sail counterpart so width drift fails
Sail’s type checker before it fails silicon.
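To make the width-drift idea concrete, here is a minimal Rust sketch of a 64-bit instruction word with a 4-bit opcode. Only those two widths come from the text above. Placing the opcode in the top four bits, the 60-bit payload, and the names `opcode`, `payload`, and `encode` are assumptions made for illustration. This is not the actual v002 encoding, and it does not use Sail. The compile-time assertion stands in, loosely, for the kind of check a Sail type error would provide.

```rust
// Widths stated in the documentation: 64-bit instruction, 4-bit opcode.
pub const INSTR_BITS: u32 = 64;
pub const OPCODE_BITS: u32 = 4;
// Assumption: everything that is not opcode is treated as one opaque payload.
pub const PAYLOAD_BITS: u32 = 60;

// Compile-time width check: if one constant drifts, the build fails
// before any encode/decode logic runs.
const _: () = assert!(OPCODE_BITS + PAYLOAD_BITS == INSTR_BITS);

/// Extract the opcode, assuming (hypothetically) it occupies the top 4 bits.
pub fn opcode(word: u64) -> u8 {
    (word >> PAYLOAD_BITS) as u8
}

/// Extract the remaining low 60 bits.
pub fn payload(word: u64) -> u64 {
    word & ((1u64 << PAYLOAD_BITS) - 1)
}

/// Pack an opcode and payload, rejecting values that exceed their field width.
pub fn encode(opcode: u8, payload: u64) -> Option<u64> {
    if u32::from(opcode) >= (1 << OPCODE_BITS) || payload >> PAYLOAD_BITS != 0 {
        return None;
    }
    Some((u64::from(opcode) << PAYLOAD_BITS) | payload)
}
```

Calling `encode(0xA, 0x1234)` returns a word from which `opcode` and `payload` recover the original fields, while an out-of-range opcode such as 16 is rejected with `None`. The real field layout is defined by the ISA package in the RTL repo and its Sail counterpart under formal/sail/.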
Introduction
v002 Architecture
Target Hardware
pccx-lab Handbook
Archive
Toolchain Demos