pccx Documentation

Welcome to the pccx (Parallel Compute Core eXecutor) documentation. pccx is a scalable NPU architecture for accelerating Transformer-based LLMs on edge devices. Select a section from the sidebar to begin.

Ecosystem

RTL Implementation

github.com/pccxai/pccx-FPGA-NPU-LLM-kv260

The active v002 SystemVerilog sources: ISA package, controller, compute cores (GEMM / GEMV / CVO), and memory hierarchy. The target device is the Xilinx Kria KV260 (Zynq UltraScale+ ZU5EV).

Currently supported (focus): Gemma-3N E4B @ W4A8KV4 (4-bit weights, 8-bit activations, 4-bit KV cache). Throughput (tok/s) is pending a KV260 board run (see Evidence). Everything else (v003 / Gemma-4 / Llama) lives on the Roadmap.

Every v002 RTL reference page on this site links back to the exact .sv file in that repository.

Open the pccx-FPGA-NPU-LLM-kv260 repository on GitHub

Documentation source

github.com/pccxai/pccx — the Sphinx project powering this site.

Open the pccx documentation repository on GitHub

Author portfolio

hkimw.github.io/hkimw — blog, other projects, about.

Open the hkimw portfolio site

Tooling & Lab

pccx-lab

Performance simulator and AI-integrated profiler, built for the pccx NPU. Pre-RTL bottleneck detection, UVM co-simulation, and LLM-driven testbench generation in one workflow.

Work in Progress

Source: github.com/pccxai/pccx-lab

Open the pccx-lab simulator and profiler

Design rationale

Why pccx-lab is one repo, not five. Phase 1 split the monolith into a 10-crate Cargo workspace (core, reports, verification, authoring, evolve, remote, lsp, uvm_bridge, ai_copilot, ui/src-tauri).

Read the pccx-lab design rationale

Formal model — Sail

pccx is formally specified in Sail — the same ISA-semantics language used for RISC-V, Arm, CHERI, and Morello. The 64-bit / 4-bit-opcode v002 ISA lives under formal/sail/ in the RTL repo; each SystemVerilog typedef has a 1:1 Sail counterpart so width drift fails Sail’s type checker before it fails silicon.

Read the pccx Sail ISA model
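
To make the idea of catching width drift concrete, here is a minimal Python sketch of a 64-bit instruction word with a 4-bit opcode. It is not the pccx encoding: the field names (`opcode`, `payload`), the MSB-first ordering, and the helper functions are all hypothetical and chosen only for illustration. The real field definitions live in the SystemVerilog and Sail sources in the RTL repo. The sketch only shows the kind of check a typed model performs: field widths must add up to the word size, and every value must fit its field.

```python
# Illustrative only: this layout is an assumption, not the v002 ISA.
WORD_BITS = 64
LAYOUT = [("opcode", 4), ("payload", 60)]  # hypothetical fields, MSB first


def check_layout(layout, word_bits=WORD_BITS):
    """Raise if the field widths do not sum to the word size (width drift)."""
    total = sum(width for _, width in layout)
    if total != word_bits:
        raise ValueError(f"field widths sum to {total}, expected {word_bits}")


def pack(fields, layout=LAYOUT):
    """Pack a dict of field values into one integer word, MSB-first."""
    check_layout(layout)
    word = 0
    for name, width in layout:
        value = fields[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word, layout=LAYOUT):
    """Split an integer word back into its named fields."""
    check_layout(layout)
    fields = {}
    shift = WORD_BITS
    for name, width in layout:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

Calling `unpack(pack({"opcode": 0x3, "payload": 0x123}))` returns the original fields, while widening `opcode` to 5 bits without shrinking `payload` makes `check_layout` raise. A Sail model gives the same guarantee statically, through its type checker, rather than at run time.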

v002 Architecture

Target Hardware

pccx-lab Handbook

Archive

Toolchain Demos