@brwyatt
Last active June 23, 2025 00:24

Diskless Desktop off Ceph RBD root

History/Background

This is the result of a project to PXE-boot a diskless desktop backed by a Ceph cluster. Initially, the plan was to use CephFS for both / (root) and /boot, with /boot accessible by the PXE server to be able to streamline updates to the initrd and kernel from the OS. While /boot is still on CephFS, / was moved to an RBD image, using namespaces for permissions.

Not covered

  • Setting up Ceph (CephFS, RBD, permissions/auth, etc), outside of where it directly interacts with the initrd process.
  • Setting up or configuring PXE
  • LUKS encryption

Assumptions

  • Ubuntu (both as build system and desktop image) - will probably work with other Debian-based distributions
  • CephFS and the Ceph RBD image are set up, configured, and mounted on a system to be used for building
  • PXE is set up (or will be set up separately)
  • iPXE firmware (lots of chainloading here)

iPXE

While not explicitly covered, here is a brief explanation of the iPXE environment I use:

  • iPXE firmware (either on the NIC itself, or on a USB/SSD on the host)
  • Default boot.iPXE (via the TFTP server provided by the DHCP server) that chainloads to an HTTP server using $uuid and $mac parameters
  • Webserver (Nginx, etc) that uses the UUID and MAC parameters to serve the proper iPXE boot script
  • Boot script loads the kernel and initrd from the Webserver using UUID and MAC again, and provides kernel parameters
  • Webserver serves correct kernel and initrd based on UUID and MAC
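
As a rough illustration of the chain described above, the two iPXE scripts could be generated like this. This is only a sketch: the host name pxe.example, the /boot, /kernel and /initrd URL paths, and the function names are all made up for the example, and my actual scripts may differ. The `chain`, `kernel`, `initrd` and `boot` commands and the `${uuid}` / `${mac}` settings are standard iPXE.

```shell
#!/bin/sh
# Sketch only: URL layout and function names are hypothetical.

# Print a default boot script that chainloads a per-host script over HTTP.
# ${uuid} and ${mac} must be expanded by iPXE at boot time, not by the
# shell, so the dollar signs are escaped inside the here-document.
make_chain_script() {
    base_url="$1"
    cat <<EOF
#!ipxe
chain ${base_url}/boot?uuid=\${uuid}&mac=\${mac}
EOF
}

# Print a per-host script that loads the kernel and initrd and passes
# the root= kernel parameter.
make_host_script() {
    base_url="$1"
    root_dev="$2"
    cat <<EOF
#!ipxe
kernel ${base_url}/kernel?uuid=\${uuid}&mac=\${mac} root=${root_dev}
initrd ${base_url}/initrd?uuid=\${uuid}&mac=\${mac}
boot
EOF
}
```

For example, `make_host_script http://pxe.example /dev/rbd0 > host.ipxe` writes a script with root=/dev/rbd0; other kernel parameters would go on the same `kernel` line.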

Tips/recommendations

  • Use RBD for / (and one image per host)
  • Use CephFS for /boot. Share the same CephFS across all hosts, with a separate directory for each host
  • Put the iPXE scripts on the CephFS. They don't have to be in the host directories (but could be!)
  • Use namespaces to control access to the RBDs, and use fine-grained permissions
  • Use permissions on the CephFS so hosts (and the PXE server) can only access their own directory

Setup

Paths

These can be anywhere, so set these variables to the actual paths to the mounted drives/partitions.

root="/mnt/root"
boot="/mnt/boot"

Example mounts:

sudo mkdir -p "${root}" "${boot}"

sudo rbd device map POOL/NAMESPACE/IMAGE --name=client.CLIENT_NAME
sudo mount /dev/rbd0 "${root}"

sudo mount -t ceph [email protected]_NAME=/ "${boot}" -o exec

(note: replace the ALL_CAPS values as needed, and make sure /etc/ceph/ceph.conf and /etc/ceph/keyring are set up)
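
Before running debootstrap it can be worth confirming that both paths really are mount points, so the image doesn't end up on the build system's own disk. A minimal sketch, assuming `mountpoint` from util-linux is available (the function name is made up):

```shell
#!/bin/sh
# Return non-zero if any of the given paths is not a mount point.
require_mounts() {
    for path in "$@"; do
        if ! mountpoint -q "${path}"; then
            echo "not a mount point: ${path}" >&2
            return 1
        fi
    done
}
```

Usage would be something like `require_mounts "${root}" "${boot}" || exit 1`.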

Create the Chroot environment

sudo debootstrap --arch amd64 noble "${root}" http://archive.ubuntu.com/ubuntu

sudo mount -t proc proc "${root}/proc"
sudo mount -t sysfs sysfs "${root}/sys"
sudo mount --bind /dev "${root}/dev"
sudo mount -t devpts pts "${root}/dev/pts"
sudo mount --bind "${boot}" "${root}/boot"

sudo chroot "${root}"

Setting up the image

Install dependencies:

apt install --no-install-recommends initramfs-tools linux-image-generic ceph-common systemd-sysv grub-pc-bin zstd vim acl cryptsetup-initramfs

(note: cryptsetup-initramfs is only needed if using LUKS)

Add the files below to their proper places (filename is unimportant):

  • /etc/initramfs-tools/hooks/ceph
  • /etc/initramfs-tools/scripts/local-top/cephboot
    • Make sure to update MACs and {{RBD_IMAGE}} and {{CLIENT_NAME}} at the end

Install/configure Ceph:

  • /etc/ceph/ceph.conf
    • [global] section
      • mon_host (can't use mon_dns_srv_name, as DNS doesn't really work)
  • /etc/ceph/keyring
    • [client.{{ CLIENT_NAME }}] - make sure this matches your client name!
      • key = {{KEY}} - this is the key for the user for Ceph
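
For reference, a minimal pair of files could be written like this. It is a sketch under assumptions: the key is assumed to live in /etc/ceph/keyring (the file the initramfs hook copies), and the function name, monitor addresses and client name in the usage line are placeholders, not values from my setup.

```shell
#!/bin/sh
# Write a minimal ceph.conf and keyring into a directory.
# Arguments: destination dir, client name, comma-separated monitor
# addresses, and the client's key.
write_ceph_client_config() {
    dest_dir="$1"
    client_name="$2"
    mon_hosts="$3"
    key="$4"

    mkdir -p "${dest_dir}"

    cat > "${dest_dir}/ceph.conf" <<EOF
[global]
mon_host = ${mon_hosts}
EOF

    cat > "${dest_dir}/keyring" <<EOF
[client.${client_name}]
key = ${key}
EOF
    # The keyring holds a secret, so keep it readable by the owner only.
    chmod 600 "${dest_dir}/keyring"
}
```

Inside the chroot this might be called as `write_ceph_client_config /etc/ceph CLIENT_NAME 192.168.5.11,192.168.5.12 "KEY"`, with the ALL_CAPS values and addresses replaced.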

Set up /etc/fstab. Make sure to include root and boot:

/dev/rbd0 /       ext4    defaults        0       1
[email protected]_NAME=/PATH /boot   ceph    defaults        0       0

(note: change the / device if using cryptsetup, and make sure the CephFS values (the ALL_CAPS) for /boot are correct)

(Re-)Build the initrd:

update-initramfs -v -c -k "$(ls -t /boot/vmlinuz-* | head -n 1 | sed -r 's|/boot/vmlinuz-||')"

(note: this should (re-)build the initrd for the most recently installed kernel; specify the version explicitly if needed)
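
After the build, it's worth checking that the hook actually copied the Ceph pieces into the initrd. The sketch below reads a file listing on stdin (for example from `lsinitramfs`, which ships with initramfs-tools) and reports anything missing. The function name is made up, and the list of expected files is taken from the hook script in this gist.

```shell
#!/bin/sh
# Read an initrd file listing on stdin and check that the files the
# Ceph hook copies are present. Paths are matched by suffix so that
# both /sbin and /usr/sbin layouts are accepted.
check_initrd_listing() {
    listing="$(cat)"
    missing=0
    for wanted in 'bin/rbd' 'sbin/mount.ceph' 'etc/ceph/ceph.conf' 'etc/ceph/keyring'; do
        if ! printf '%s\n' "${listing}" | grep -q -- "${wanted}\$"; then
            echo "missing from initrd: ${wanted}" >&2
            missing=1
        fi
    done
    return "${missing}"
}
```

Usage: `lsinitramfs /boot/initrd.img-VERSION | check_initrd_listing`, with VERSION replaced by the kernel version that was just built.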

Add any users, set or update passwords (for users, root, etc), and install anything useful you'll want in the booted system (like ssh, ubuntu-desktop, etc). This is also a good time to update the NetPlan config. That can probably be done after first boot, too; without it there may be some "fun" delays in the boot process, but it shouldn't prevent boot.

Exit the chroot (^d or exit)

Unmount and boot!

sudo umount "${root}/proc"
sudo umount "${root}/sys"
sudo umount "${root}/dev/pts"
sudo umount "${root}/dev"
sudo umount "${root}/boot"
sudo umount "${root}"

Watch for any errors, and use umount -l (lazy unmount) if needed.
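
To confirm that nothing is left mounted under the image, the mount table can be checked directly. A small sketch (the function name and the optional second argument are my own additions for illustration); it prints any remaining mount points at or below the given path, deepest first:

```shell
#!/bin/sh
# List mount points at or below a path, deepest first.
# The second argument defaults to /proc/mounts and exists mainly so
# the function can be tried against a saved copy of the mount table.
mounts_under() {
    target="${1%/}"
    mounts_file="${2:-/proc/mounts}"
    awk -v t="${target}" '$2 == t || index($2, t "/") == 1 { print $2 }' "${mounts_file}" \
        | LC_ALL=C sort -r
}
```

If `mounts_under "${root}"` prints nothing, everything under the image has been unmounted.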

Also unmap the RBD device (if you are, in fact, using that here):

sudo rbd device unmap /dev/rbd0

You don't need to unmount CephFS, though; it's okay with concurrent access. RBD WILL have issues if the image is mounted on more than one system.

Update (or create!) the iPXE script for the host and double check the root= param matches either /dev/rbd0 or the mapped cryptsetup device /dev/mapper/NAME.
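
The root= check can be scripted. This is a sketch with made-up function names; it pulls the root= value out of an iPXE script file and accepts only the two forms mentioned above.

```shell
#!/bin/sh
# Print the value of the first root= kernel parameter found in a file.
ipxe_root_param() {
    sed -n 's/.*[[:space:]]root=\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only if root= is /dev/rbd0 or a /dev/mapper/ device.
check_ipxe_root() {
    root_value="$(ipxe_root_param "$1")"
    case "${root_value}" in
        /dev/rbd0|/dev/mapper/*)
            echo "ok: root=${root_value}"
            ;;
        "")
            echo "no root= parameter found" >&2
            return 1
            ;;
        *)
            echo "unexpected root=${root_value}" >&2
            return 1
            ;;
    esac
}
```

Run it as `check_ipxe_root /path/to/host.ipxe`, where the path is wherever the host's iPXE script lives.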

#!/bin/sh
## IN /etc/initramfs-tools/hooks/

prereqs() {
    echo ""
}

case "${1}" in
    prereqs)
        prereqs
        exit 0
        ;;
esac

. /usr/share/initramfs-tools/hook-functions

# Copy the mount.ceph binary
copy_exec /sbin/mount.ceph
copy_exec /usr/bin/rbd

if [ -f "/etc/crypttab" ]; then
    # Probably not needed, but also won't hurt. First attempt had
    # failed so I added this as it was missing, but other things
    # were probably the cause.
    copy_file config /etc/crypttab
fi

if [ -d "/etc/ceph" ]; then
    copy_file config /etc/ceph/ceph.conf
    copy_file config /etc/ceph/keyring
fi

# Make sure the kernel can be read by the PXE web server
setfacl -m u:33:r /boot/vmlinuz-*

exit 0

#!/bin/sh
## IN: /etc/initramfs-tools/scripts/local-top/

case "$1" in
    prereqs)
        exit 0
        ;;
esac

error_shell() {
    if [ -z "${1}" ]; then
        error_shell "Unknown error"
    fi
    echo '!!!'" Zephyr: ${1}"'!'" Dropping to emergency shell. "'!!!'
    /bin/sh
    exit 0
}

echo "=== Zephyr: Starting custom initramfs boot logic ==="

echo "=== Zephyr: Configuring network ==="

# I had wanted to try and make this work on multiple hosts,
# so I could more easily test in a VM... but no Bash in the
# initrd, and adding it doesn't work.
#CEPH_MACS=("aa:bb:cc:11:22:33" "ab:bb:cb:11:21:31")
CEPH_MAC="ab:bb:cb:11:21:31"
CEPH_IP="192.168.5.50/24"
#USER_MACS=("aa:bb:cc:11:22:34" "ab:bb:cb:11:21:32")
USER_MAC="ab:bb:cb:11:21:32"

FOUND_CEPH_NIC=""
FOUND_USER_NIC=""

for iface in $(ip -o link show | awk -F': ' '{print $2}'); do
    current_mac=$(ip link show "${iface}" | awk '/link\/ether/ {print $2}')

    if [ "${FOUND_USER_NIC}" = "" ]; then
        #for USER_MAC in "${USER_MACS[@]}"; do
            if [ "${current_mac}" = "${USER_MAC}" ]; then
                FOUND_USER_NIC="${iface}"
                echo "=== Found User NIC: ${FOUND_USER_NIC} ==="
                #break
            fi
        #done
    fi

    if [ "${FOUND_CEPH_NIC}" = "" ]; then
        #for CEPH_MAC in "${CEPH_MACS[@]}"; do
            if [ "${current_mac}" = "${CEPH_MAC}" ]; then
                FOUND_CEPH_NIC="${iface}"
                echo "=== Found Ceph NIC: ${FOUND_CEPH_NIC} ==="
                #break
            fi
        #done
    fi
done

echo "=== Configuring ${FOUND_USER_NIC} for User access ==="
ip link set dev "${FOUND_USER_NIC}" up
# Doing DHCP in the initrd seems to cause problems later, as the
# booted system tries to run DHCP, too, and you end up with an autoconf
# address that takes over the default route for some reason.
#dhcpcd "${FOUND_USER_NIC}"

echo "=== Configuring ${FOUND_CEPH_NIC} for Ceph access ==="
ip link set dev "${FOUND_CEPH_NIC}" up
# Static IP, see above for note about DHCP
ip addr add "${CEPH_IP}" dev "${FOUND_CEPH_NIC}"

echo "=== Zephyr: Mapping RBD ==="
rbd device map {{RBD_IMAGE}} --name=client.{{CLIENT_NAME}} || error_shell "Failed to mount Ceph RBD root"
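
On the multiple-MAC idea from the comments in the script above: Bash arrays aren't available in the initrd shell, but a space-separated string plus a small function gets the same effect in plain POSIX sh. This is an untested-on-real-hardware sketch, not part of the script as I run it; the function name is made up and the MACs are the example values from the commented-out arrays.

```shell
#!/bin/sh
# Space-separated lists instead of Bash arrays.
CEPH_MACS="aa:bb:cc:11:22:33 ab:bb:cb:11:21:31"
USER_MACS="aa:bb:cc:11:22:34 ab:bb:cb:11:21:32"

# Succeed if the first argument equals any of the remaining arguments.
# An empty first argument (e.g. an interface with no link/ether line)
# never matches.
mac_in_list() {
    wanted="$1"
    shift
    [ -n "${wanted}" ] || return 1
    for candidate in "$@"; do
        if [ "${candidate}" = "${wanted}" ]; then
            return 0
        fi
    done
    return 1
}

# Inside the interface loop, the list is left unquoted on purpose so
# the shell splits it into one argument per MAC:
#   if mac_in_list "${current_mac}" ${CEPH_MACS}; then
#       FOUND_CEPH_NIC="${iface}"
#   fi
```

The same initrd could then be shared between a physical host and a test VM, as long as each machine's MACs appear in the lists.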