@mheler
Forked from ctrahey/README.md
Created March 22, 2021 18:44
Cleanup Ceph Disks from cluster.yaml

Cleanup Ceph Disks

This script is designed to work with yq to take arguments directly from a CephCluster CRD (cluster.yaml) and zap all matching disks on the hosts.

Caveats/Assumptions:

Danger: This is (currently) a wildly destructive script.

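  1. Requires hosts to be listed by name under spec.storage.nodes, each with its own config
  2. Assumes each node uses devicePathFilter (rather than deviceFilter or an explicit devices list)
  3. Assumes the devicePathFilter regex treats /dev/disk/by-path as a base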
  4. !!DANGER!! Assumes that your filters DO NOT select ANY disks that aren't for Ceph.

To expand on #4: you may be relying on the feature of Ceph (or Rook?) that safely skips disks that already have filesystems on them, which lets you use a broader device filter in your Rook cluster.yaml. If you do rely on this, this script will RUIN your system and DESTROY YOUR DATA. If you don't 100% follow what I'm saying, DO NOT use this script.

@todo: Check for Ceph remnants on disk before executing the cleanup
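
One possible shape for the check named in the @todo, as a minimal sketch only. It assumes a raw-mode BlueStore OSD, whose on-disk label starts with the text "bluestore block device" at offset 0. LVM-backed OSDs keep that label inside a logical volume, so this check would miss them. The function name has_bluestore_label is hypothetical and is not part of cleanup.sh.

```shell
#!/usr/bin/env bash
# has_bluestore_label PATH
# Hypothetical helper: succeeds if PATH starts with the BlueStore label text.
# Assumption: raw-mode BlueStore OSD; LVM-backed OSDs are NOT detected.
has_bluestore_label() {
  local dev=$1 magic
  # Read the first 22 bytes; drop NUL bytes so bash does not warn about them.
  magic=$(head -c 22 "$dev" 2>/dev/null | tr -d '\0')
  [[ $magic == "bluestore block device" ]]
}
```

Inside the cleanup loop this could gate the destructive steps, e.g. `has_bluestore_label "$DISK" || continue`, so that a disk with no such label is left alone.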

These assumptions are not inherent to the approach; they just fit my environment when I needed this script.

This is what my cluster.yaml spec.storage.nodes looks like, for reference:

    nodes:
    - name: "dl380p-g8-01"
      devicePathFilter: "pci-0000:02:00.0-sas-|nvme-1"
    - name: "dl380p-g8-02"
      devicePathFilter: "pci-0000:0a:00.0-sas-exp0x500a098000d7223f-|nvme-1"
    - name: "dl380p-g8-03"
      devicePathFilter: "pci-0000:02:00.0-sas-0x3001438025a76544-lun-|nvme-1"
    - name: "dl380p-g8-04"
      devicePathFilter: "pci-0000:02:00.0-sas-0x5"
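
To preview which devices one of these filters selects on a host before anything destructive runs, a read-only listing along these lines may help. This is a sketch: list_matching_disks and its optional directory argument are hypothetical names, and the matching rule (regex applied to the full by-path entry, non-empty match required) is copied from what cleanup.sh does.

```shell
#!/usr/bin/env bash
# list_matching_disks PATTERN [BY_PATH_DIR]
# Hypothetical helper: prints "<by-path entry> -> <resolved device>" for each
# entry the pattern selects. Read-only; nothing is wiped.
list_matching_disks() {
  local pattern=$1 dir=${2:-/dev/disk/by-path} f
  # An empty pattern selects nothing.
  [[ -n $pattern ]] || return 0
  for f in "$dir"/*; do
    [[ -e $f ]] || continue   # the glob matched no files
    if [[ $f =~ $pattern && -n ${BASH_REMATCH[0]} ]]; then
      printf '%s -> %s\n' "$f" "$(realpath "$f")"
    fi
  done
}
```

Run on the target host, for example `list_matching_disks 'pci-0000:02:00.0-sas-|nvme-1'`, and check that every line is a disk you really intend to hand to Ceph.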

Usage

Base script:

    cleanup.sh <hostname> <devicePathFilter>

With yq (v4 syntax), to pull the arguments directly from cluster.yaml:

    yq e '.spec.storage.nodes[] | .name + " " + .devicePathFilter' cluster.yaml | xargs -n2 ./cleanup.sh
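
Because the pipeline is destructive, it can be worth printing the commands xargs would run before running them. The sketch below uses printf as a stand-in for the yq output (two of the node entries shown earlier) and puts echo in front of ./cleanup.sh; node_lines and preview_commands are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Stand-in for the yq output: one "<name> <devicePathFilter>" line per node.
node_lines() {
  printf '%s\n' \
    'dl380p-g8-01 pci-0000:02:00.0-sas-|nvme-1' \
    'dl380p-g8-04 pci-0000:02:00.0-sas-0x5'
}

# xargs -n2 takes two whitespace-separated tokens per invocation; with echo in
# front, each invocation is printed instead of executed.
preview_commands() {
  xargs -n2 echo ./cleanup.sh
}

node_lines | preview_commands
# prints:
# ./cleanup.sh dl380p-g8-01 pci-0000:02:00.0-sas-|nvme-1
# ./cleanup.sh dl380p-g8-04 pci-0000:02:00.0-sas-0x5
```

The same trick applies to the real pipeline by ending it with `xargs -n2 echo ./cleanup.sh`. Note that this pairing relies on neither the host name nor the filter containing whitespace or quote characters.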

#!/usr/bin/env bash
# cleanup.sh <hostname> <devicePathFilter>
HOST=$1
# Wrap the pattern in single quotes so the remote shell passes it through
# intact, as it may include e.g. pipe | chars
QUOTED_PATTERN="'$2'"
echo "Connecting to $HOST with pattern $2"
ssh -o ConnectTimeout=2 "$HOST" sudo PATTERN="$QUOTED_PATTERN" bash <<'EOF'
echo "Running on $(hostname) with $PATTERN"
for f in /dev/disk/by-path/*
do
  # Only act on paths where the pattern produced a non-empty match
  if [[ $f =~ $PATTERN && -n ${BASH_REMATCH[0]} ]]; then
    DISK=$(realpath "$f")
    # Strip the /dev/ prefix to get the kernel device name, e.g. sda
    DEVNAME="${DISK#/dev/}"
    ROTATING=$(cat "/sys/block/${DEVNAME}/queue/rotational")
    echo "Running cleaning operations on $DISK"
    # Destroy GPT and MBR partition tables
    sgdisk --zap-all "$DISK"
    if [[ $ROTATING -eq 1 ]]; then
      # Spinning disk: overwrite the first 100 MiB with zeros
      dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
    else
      # Non-rotational (SSD/NVMe): discard all blocks
      blkdiscard "$DISK"
    fi
    echo "Done"
  fi
done
echo "Running host level cleaning"
# Remove leftover ceph device-mapper entries and /dev/ceph-* directories
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
EOF
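
The script chooses between dd and blkdiscard based on the rotational flag in sysfs. As a rough sketch, that decision could be pulled into a helper that also reports when the flag is missing (for instance when a by-path entry resolves to a partition rather than a whole disk). The name wipe_method and the optional sysfs root argument are hypothetical and exist only to make the logic easy to try out.

```shell
#!/usr/bin/env bash
# wipe_method DEVNAME [SYSFS_ROOT]
# Hypothetical helper: prints "dd" for rotational devices, "blkdiscard" for
# non-rotational ones, and "unknown" (exit status 1) if the flag is unreadable.
wipe_method() {
  local devname=$1 sysfs=${2:-/sys}
  local flag_file="$sysfs/block/$devname/queue/rotational"
  if [[ ! -r $flag_file ]]; then
    echo "unknown"
    return 1
  fi
  if [[ $(<"$flag_file") -eq 1 ]]; then
    echo "dd"
  else
    echo "blkdiscard"
  fi
}
```

For example, `wipe_method sda` reads /sys/block/sda/queue/rotational. Treating "unknown" as a reason to skip the device would be the cautious choice.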