@duboisf
Created May 5, 2026 02:18

Ryzen AI NPU + FastFlowLM on Fedora — Reproducible Recipe

Bringing up the AMD XDNA2 NPU (Ryzen AI 7 PRO 350, 8 columns) on Fedora 44 and running FastFlowLM from source. Tested 2026-05-04.

The Fedora-specific friction is that XRT and the XDNA userspace shim are not packaged in Fedora repos — both have to be built from source. Once those exist at /opt/xilinx/xrt/, FastFlowLM's standard CMake preset works.


Hardware / starting point

  • AMD Ryzen AI 7 PRO 350 w/ Radeon 860M (Krackan, XDNA2 NPU 8 columns)
  • Fedora 44 Workstation (kernel 6.19.x at start)

Target state

  • Kernel ≥ 7.0 with in-tree amdxdna that supports protocol-7 firmware (npu_7.sbin) — needed for NPU firmware 1.1.2.64
  • XRT 2.23.0 at /opt/xilinx/xrt/
  • XDNA shim plugin at /opt/xilinx/xrt/amdxdna/
  • flm 0.9.41 at /usr/local/bin/flm
  • Memlock unlimited (set via systemd, not limits.conf — see step 3)

1. Get a kernel with the new amdxdna driver

The Fedora 44 stock kernel (6.19.x) ships an older amdxdna that only speaks the legacy firmware protocol; loading firmware 1.1.x makes the NPU disappear. Kernel 7.0+ has the protocol-7 path (npu_7.sbin).

Pull the F45 kernel onto F44 (partial upgrade — kernel only):

sudo dnf upgrade kernel --releasever=45

Reboot. The default GRUB menu may be hidden; hold Esc during boot to pick a kernel manually if needed. Old kernels remain installed.

Verify:

uname -r                           # should print 7.x.x
modinfo amdxdna | grep firmware    # should list both npu.sbin AND npu_7.sbin
ls /dev/accel/accel0               # should exist
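
For scripted use, the same checks can be wrapped in small helpers. This is a minimal sketch: the function names `kernel_major_ok` and `lists_protocol7_firmware` are hypothetical and not part of any tool mentioned here, and the firmware check only looks for the string `npu_7.sbin` anywhere in the `modinfo` output.

```shell
# Hypothetical helpers; the names are illustrative only.

# Succeeds when a kernel release string (default: the running kernel)
# has a major version of 7 or higher.
kernel_major_ok() {
    kmo_release="${1:-$(uname -r)}"
    kmo_major="${kmo_release%%.*}"   # keep everything before the first dot
    [ "$kmo_major" -ge 7 ]
}

# Reads `modinfo amdxdna` output on stdin and succeeds when it
# mentions the protocol-7 firmware file.
lists_protocol7_firmware() {
    grep -q 'npu_7\.sbin'
}

# Usage on the target machine:
#   kernel_major_ok && echo "kernel OK"
#   modinfo amdxdna | lists_protocol7_firmware && echo "driver OK"
#   [ -e /dev/accel/accel0 ] && echo "device node OK"
```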

To always show the GRUB menu (recommended, gives you a fallback if the new kernel misbehaves):

sudo grub2-editenv - unset menu_auto_hide

To revert to the old kernel later:

sudo grubby --set-default=/boot/vmlinuz-6.19.14-300.fc44.x86_64
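
The path given to `--set-default` depends on which kernel builds are installed, so it helps to list the boot entries first. A minimal sketch, assuming `grubby --info=ALL` prints one `kernel="..."` line per entry; the helper name `list_kernel_paths` is hypothetical.

```shell
# Hypothetical helper: reads `grubby --info=ALL` output on stdin and
# prints the kernel image path of each boot entry, one per line.
list_kernel_paths() {
    sed -n 's/^kernel="\(.*\)"$/\1/p'
}

# Usage on the target machine:
#   sudo grubby --info=ALL | list_kernel_paths
#   sudo grubby --default-kernel     # shows the current default
```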

2. Install build dependencies

The combined dependency list for XRT, the XDNA shim, and FastFlowLM, using Fedora package names:

sudo dnf install -y \
    gcc gcc-c++ make cmake ninja-build git pkgconf-pkg-config \
    nasm patchelf rust cargo \
    boost-devel boost-program-options \
    libcurl-devel libdrm-devel fftw-devel \
    readline-devel libuuid-devel \
    ffmpeg-free-devel \
    opencl-headers ocl-icd-devel \
    rapidjson-devel \
    protobuf-devel protobuf-compiler \
    elfutils-libelf-devel systemd-devel \
    openssl-devel \
    systemtap-sdt-devel \
    libstdc++-static glibc-static libxcrypt-static \
    rpm-build \
    python3-pybind11

XRT's build script (build.sh) calls cmake3 rather than cmake. Fedora provides no cmake3 command, so create a symlink:

sudo ln -s /usr/bin/cmake /usr/local/bin/cmake3

3. Raise the memlock limit (the systemd way)

NPU execution needs much more than the default 8 MB. On Fedora with GDM, /etc/security/limits.d/ does NOT propagate to graphical-session shells because pam_limits.so is bypassed. Use systemd config instead:

sudo mkdir -p /etc/systemd/system.conf.d /etc/systemd/user.conf.d

sudo tee /etc/systemd/system.conf.d/memlock.conf >/dev/null <<'EOF'
[Manager]
DefaultLimitMEMLOCK=infinity
EOF

sudo tee /etc/systemd/user.conf.d/memlock.conf >/dev/null <<'EOF'
[Manager]
DefaultLimitMEMLOCK=infinity
EOF

Reboot for it to take effect (systemctl daemon-reexec plus full GNOME logout might work, but reboot is reliable). Verify with ulimit -l — should print unlimited.
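
Limits are per-process, so a terminal or service started before the change can still carry the old value. As a rough sketch, the soft limit can be read from the standard Linux `/proc/<pid>/limits` file; the helper name `memlock_soft_limit` is hypothetical.

```shell
# Hypothetical helper: reads a /proc/<pid>/limits file on stdin and
# prints the soft "Max locked memory" value (bytes, or "unlimited").
memlock_soft_limit() {
    awk '/^Max locked memory/ { print $4 }'
}

# Usage on the target machine:
#   memlock_soft_limit < /proc/self/limits
#   memlock_soft_limit < /proc/<pid>/limits    # substitute a running PID
#   systemctl show --property=DefaultLimitMEMLOCK
#   systemctl --user show --property=DefaultLimitMEMLOCK
```

The two `systemctl show` lines report the manager defaults set by the drop-in files, which is a quick way to tell a config problem from a stale session.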

4. Build & install XRT

git clone https://github.com/Xilinx/XRT.git ~/git/XRT
cd ~/git/XRT/build
./build.sh -npu -opt

The build produces three RPMs in Release/. Install them:

sudo dnf install -y \
    ~/git/XRT/build/Release/xrt_*-x86_64-base.rpm \
    ~/git/XRT/build/Release/xrt_*-x86_64-base-devel.rpm \
    ~/git/XRT/build/Release/xrt_*-x86_64-npu.rpm

XRT installs to /opt/xilinx/xrt/ but doesn't add an ld.so.conf.d entry; it expects you to source /opt/xilinx/xrt/setup.sh in each shell instead. To make libxrt_coreutil.so.2 resolvable system-wide for any binary that links it:

echo "/opt/xilinx/xrt/lib" | sudo tee /etc/ld.so.conf.d/xrt.conf
sudo ldconfig
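
To confirm the loader cache picked up the new directory, the `ldconfig -p` listing can be filtered for the library. This is a sketch with a hypothetical helper name, `xrt_coreutil_path`, and it assumes the usual `name (flags) => path` layout of `ldconfig -p` lines.

```shell
# Hypothetical helper: reads `ldconfig -p` output on stdin and prints
# the path the loader cache resolves libxrt_coreutil.so.2 to, if any.
xrt_coreutil_path() {
    awk '$1 == "libxrt_coreutil.so.2" { print $NF }'
}

# Usage on the target machine:
#   ldconfig -p | xrt_coreutil_path
```

No output means the library is not in the cache, which matches the "cannot open shared object file" symptom in the troubleshooting log.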

Sanity check:

/opt/xilinx/xrt/bin/xrt-smi examine

At this point xrt-smi will report 0 devices found — that's expected until step 5.

5. Build & install the XDNA shim plugin

XRT itself doesn't include the userspace bridge to /dev/accel/accel0. That lives in AMD's xdna-driver repo (analogous to Arch's xrt-plugin-amdxdna package). The kernel module from this repo is not needed — we already have it in-tree.

git clone --recursive https://github.com/amd/xdna-driver.git ~/git/xdna-driver
cd ~/git/xdna-driver/build
./build.sh -release -nokmod

The build produces Release/xrt_plugin.<ver>_<rel>-x86_64-amdxdna.rpm. Install it:

sudo dnf install -y ~/git/xdna-driver/build/Release/xrt_plugin.*-amdxdna.rpm

Now xrt-smi examine should list the device:

|BDF             |Name          |Architecture  |Topology  |
|[0000:c5:00.1]  |RyzenAI-npu6  |aie2p         |6x8       |
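
In a script, the before/after difference between steps 4 and 5 can be detected by counting device rows. A minimal sketch, assuming the table layout matches the excerpt above (device rows begin with `|[`); real `xrt-smi` output may differ between versions, and the helper name `count_npu_devices` is hypothetical.

```shell
# Hypothetical helper: reads `xrt-smi examine` output on stdin and
# prints how many device rows (lines beginning with "|[") it contains.
count_npu_devices() {
    awk 'index($0, "|[") == 1 { n++ } END { print n + 0 }'
}

# Usage on the target machine:
#   /opt/xilinx/xrt/bin/xrt-smi examine | count_npu_devices
```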

6. Build & install FastFlowLM

git clone --recursive https://github.com/FastFlowLM/FastFlowLM.git ~/git/FastFlowLM
cd ~/git/FastFlowLM/src
cmake --preset linux-default
cmake --build build -j$(nproc)
sudo cmake --install build

Installs flm to /opt/fastflowlm/ and symlinks /usr/local/bin/flm.

7. Verify

flm validate

Expected output (all green):

[Linux]  Kernel: 7.x.x
[Linux]  NPU: /dev/accel/accel0 with 8 columns
[Linux]  NPU FW Version: 1.1.2.64
[Linux]  amdxdna version: 0.8
[Linux]  Memlock Limit: infinity

End-to-end inference test:

flm pull llama3.2:1b
flm serve llama3.2:1b &
sleep 5
curl -s -X POST http://127.0.0.1:52625/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2:1b","messages":[{"role":"user","content":"hi"}],"max_tokens":20}' \
  | python3 -m json.tool
pkill -f "flm serve"
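
The fixed `sleep 5` can be too short on a cold start. One alternative is to poll until the server accepts connections. The sketch below is a generic retry helper with a hypothetical name, `wait_until`; it assumes that an answered HTTP request on the port means the server is ready, which may not hold while a model is still loading.

```shell
# Hypothetical helper: runs a command up to N times, one second apart,
# and succeeds as soon as the command does.
wait_until() {
    wu_tries="$1"
    shift
    while [ "$wu_tries" -gt 0 ]; do
        "$@" && return 0
        wu_tries=$((wu_tries - 1))
        # Pause only if another attempt is coming.
        if [ "$wu_tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage in place of `sleep 5`:
#   flm serve llama3.2:1b &
#   wait_until 30 curl -s -o /dev/null http://127.0.0.1:52625/ \
#       || echo "server did not come up"
```

Without `-f`, curl exits 0 on any HTTP response, so the probe succeeds once the port answers, whatever the status code.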

Reference numbers on this hardware: ~80 tok/s prefill, ~72 tok/s decode on Llama-3.2-1B, TTFT ~570 ms.


Troubleshooting log (issues hit during initial setup)

| Symptom | Root cause | Fix |
|---|---|---|
| flm validate reports memlock 8 MB after editing /etc/security/limits.d/ | GDM bypasses pam_limits.so | Use /etc/systemd/{system,user}.conf.d/memlock.conf instead |
| XRT build.sh: command not found cmake3 | Fedora ships only cmake | sudo ln -s /usr/bin/cmake /usr/local/bin/cmake3 |
| XRT configure: OpenCL not found | Missing opencl-headers ocl-icd-devel | Install them |
| XRT configure: RapidJSON config not found | CMake 4.x stricter, plus rapidjson-devel not installed | Install rapidjson-devel |
| XRT compile: sys/sdt.h: No such file | Missing SystemTap dtrace headers | sudo dnf install systemtap-sdt-devel |
| XRT link: cannot find -lstdc++ / -lm / -lc (static) | aiebu-asm links statically | sudo dnf install libstdc++-static glibc-static |
| XRT package step: rpmbuild not found | rpm-build not installed | sudo dnf install rpm-build |
| flm: libxrt_coreutil.so.2: cannot open shared object file | /opt/xilinx/xrt/lib not on loader path | Drop /etc/ld.so.conf.d/xrt.conf + ldconfig |
| flm validate works but flm serve errors No such device with index '0' | XDNA shim plugin missing | Build & install xdna-driver (step 5) |
| flm validate reports model_list.json not found | Running from build dir | sudo cmake --install build, run /usr/local/bin/flm instead |

Cleanup notes

  • ~/git/XRT build tree is several GB; git clean -ffdx reclaims most of it while preserving sources.
  • Once you're confident in the new kernel, the older F44 kernels can be removed via sudo dnf remove kernel-core-<ver> kernel-modules-<ver>. Keep at least one fallback.
  • The systemd memlock config and /etc/ld.so.conf.d/xrt.conf survive reboots and package updates.
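
Before removing old kernels, it can help to list the installed builds other than the one currently running. A minimal sketch, assuming `rpm -q kernel-core` prints names of the form `kernel-core-<uname -r>`; the helper name `kernels_other_than` is hypothetical.

```shell
# Hypothetical helper: reads `rpm -q kernel-core` output on stdin and
# prints every installed kernel-core package except the one matching
# the given release (default: the running kernel).
kernels_other_than() {
    kot_release="${1:-$(uname -r)}"
    grep -v -F -x "kernel-core-$kot_release"
}

# Usage on the target machine:
#   rpm -q kernel-core | kernels_other_than
```

The output is a list of candidates only; pick which ones to remove by hand so that at least one fallback stays installed.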