Welcome to Tutorial Tuesday 🎉
This is a quick intro to running a database dump made using pg_basebackup
with Docker. It's intended audience is developers who are interested in exploring a database from a server, perhaps for development, debugging or forensics.
It covers:
- Running
pg_basebackup
- RSyncing this to your local machine
- Configuring the backup to run with required files and setting a password
- How to connect from another program on your local machine
When you grab a pg_basebackup
from a server you have all the databases and roles. You get the whole 'cluster' rather than just the data +schema of your database. This lets you play around, delete/restore, and to some extent gives you ability to restore to a point in time.
In this instance I'm fetching a database cluster from gatherer-demo.gatherdata.io
. To do that we'll ssh into the box and run pg_basebackup
.
This creates a new folder in /tmp
.
~/gatherer_demo_basebackup
❯ ssh [email protected]
Welcome to Ubuntu 22.04 LTS (GNU/Linux 5.15.0-106-generic x86_64)
...
root@gatherer-demo ~ # cd /tmp
root@gatherer-demo /tmp # su postgres
postgres@gatherer-demo:/tmp$ pg_basebackup -D ./basebackup
Next, on our local machine, we'll rsync this over. We're using flags -avz
(archive, verbose, compressed). The z
flag in particular is going to save us a lot of time.
❯ rsync -avz [email protected]:/tmp/basebackup .
That creates a local directory. It took about 2:40 on my machine to rsync, it's 11Gb.
The directory we have, is going to become the data directory for a postgres container.
Before we spin it up, we need to grab 2 files: pg_hba.conf
and postgresql.conf
We can do that by starting a temporary container and copying the files over, then 'rm' the container Here the container happens to have and 'id' of 81b5....
❯ docker run -d -e POSTGRES_PASSWORD=post1234 postgres:14
81b5920e0d0158a2d4e8dd3480b2e827168b5bddf790e2724fbb8c8109722050
~/gatherer_demo_basebackup/basebackup
❯ docker cp 81b5920e0d01:/var/lib/postgresql/data/pg_hba.conf .
Successfully copied 6.66kB to /home/josh/gatherer_demo_basebackup/basebackup.
~/gatherer_demo_basebackup/basebackup
❯ docker cp 81b5920e0d01:/var/lib/postgresql/data/postgresql.conf .
Successfully copied 30.7kB to /home/josh/gatherer_demo_basebackup/basebackup.
~/gatherer_demo_basebackup/basebackup
❯ docker stop 81b5920e0d01 && docker rm 81b5920e0d01
81b5920e0d01
81b5920e0d01
Now we can start the container
docker run -e POSTGRES_PASSWORD=post1234 -P -v .:/var/lib/postgresql/data postgres:14
Your container will have a random name and port in my case it's mystifying_kapitsa
on 32779
❯ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d27a3f41b17d postgres:14 "docker-entrypoint.s…" 26 minutes ago Up 26 minutes 0.0.0.0:32779->5432/tcp, :::32779->5432/tcp mystifying_kapitsa
Because this container came from a server where postgres
connected over a socket, it has no password. We want to change that
docker exec -it --user postgres mystifying_kapitsa psql
I altered the password with this SQL:
psql (14.12 (Debian 14.12-1.pgdg120+1))
Type "help" for help.
postgres=# ALTER ROLE postgres PASSWORD 'post1234'
I can use the host localhost
, user postgres
, the password post1234
and the port 32779
now to connect from Metabase / Django / Dbeaver etc
❯ psql --host localhost --user postgres --port 32779 --db jf_db
Password for user postgres:
psql (14.8 (Ubuntu 14.8-0ubuntu0.22.10.1), server 14.12 (Debian 14.12-1.pgdg120+1))
Type "help" for help.
jf_db=# select * from auth_user
jf_db-# limit 5;
jf_db=#
One thing to note is that Docker changes ownership of folders when
bind mounts
are used. That means that after running the container you may find that you are no longer the owner of the backup directory.No easy remedy for this as far as I know except to
sudo chown -R
the dir again