Skip to content

Instantly share code, notes, and snippets.

@joshbrooks
Created May 14, 2024 00:21
Show Gist options
  • Save joshbrooks/1e6f0018d6834ae4a865b2209cefe154 to your computer and use it in GitHub Desktop.
Save joshbrooks/1e6f0018d6834ae4a865b2209cefe154 to your computer and use it in GitHub Desktop.

Using pg_basebackup and Docker

Welcome to Tutorial Tuesday 🎉

This is a quick intro to running a database dump made using pg_basebackup with Docker. It's intended audience is developers who are interested in exploring a database from a server, perhaps for development, debugging or forensics.

It covers:

  • Running pg_basebackup
  • RSyncing this to your local machine
  • Configuring the backup to run with required files and setting a password
  • How to connect from another program on your local machine

Why you'd want this

When you grab a pg_basebackup from a server you have all the databases and roles. You get the whole 'cluster' rather than just the data +schema of your database. This lets you play around, delete/restore, and to some extent gives you ability to restore to a point in time.

Create the Database Dump

In this instance I'm fetching a database cluster from gatherer-demo.gatherdata.io. To do that we'll ssh into the box and run pg_basebackup. This creates a new folder in /tmp.

~/gatherer_demo_basebackup 
❯ ssh [email protected]
Welcome to Ubuntu 22.04 LTS (GNU/Linux 5.15.0-106-generic x86_64)
...
root@gatherer-demo ~ # cd /tmp
root@gatherer-demo /tmp # su postgres                                                                          
postgres@gatherer-demo:/tmp$ pg_basebackup -D ./basebackup

Rsync to our 'local' machine

Next, on our local machine, we'll rsync this over. We're using flags -avz (archive, verbose, compressed). The z flag in particular is going to save us a lot of time.

❯ rsync -avz [email protected]:/tmp/basebackup .

That creates a local directory. It took about 2:40 on my machine to rsync, it's 11Gb.

Running Docker

The directory we have, is going to become the data directory for a postgres container. Before we spin it up, we need to grab 2 files: pg_hba.conf and postgresql.conf

We can do that by starting a temporary container and copying the files over, then 'rm' the container Here the container happens to have and 'id' of 81b5....

❯ docker run -d -e POSTGRES_PASSWORD=post1234 postgres:14
81b5920e0d0158a2d4e8dd3480b2e827168b5bddf790e2724fbb8c8109722050

~/gatherer_demo_basebackup/basebackup
❯ docker cp 81b5920e0d01:/var/lib/postgresql/data/pg_hba.conf .
Successfully copied 6.66kB to /home/josh/gatherer_demo_basebackup/basebackup.

~/gatherer_demo_basebackup/basebackup
❯ docker cp 81b5920e0d01:/var/lib/postgresql/data/postgresql.conf .
Successfully copied 30.7kB to /home/josh/gatherer_demo_basebackup/basebackup.

~/gatherer_demo_basebackup/basebackup
❯ docker stop 81b5920e0d01 && docker rm 81b5920e0d01
81b5920e0d01
81b5920e0d01

Now we can start the container

docker run -e POSTGRES_PASSWORD=post1234 -P -v .:/var/lib/postgresql/data postgres:14

Your container will have a random name and port in my case it's mystifying_kapitsa on 32779

❯ docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED          STATUS          PORTS                                                           NAMES
d27a3f41b17d   postgres:14              "docker-entrypoint.s…"   26 minutes ago   Up 26 minutes   0.0.0.0:32779->5432/tcp, :::32779->5432/tcp                     mystifying_kapitsa

Because this container came from a server where postgres connected over a socket, it has no password. We want to change that

docker exec -it --user postgres mystifying_kapitsa psql

I altered the password with this SQL:

psql (14.12 (Debian 14.12-1.pgdg120+1))
Type "help" for help.

postgres=# ALTER ROLE  postgres PASSWORD 'post1234'

Connecting from the outside world

I can use the host localhost, user postgres, the password post1234 and the port 32779 now to connect from Metabase / Django / Dbeaver etc

❯ psql --host localhost --user postgres --port 32779 --db jf_db
Password for user postgres: 
psql (14.8 (Ubuntu 14.8-0ubuntu0.22.10.1), server 14.12 (Debian 14.12-1.pgdg120+1))
Type "help" for help.

jf_db=# select * from auth_user
jf_db-# limit 5;
jf_db=# 
@joshbrooks
Copy link
Author

One thing to note is that Docker changes ownership of folders when bind mounts are used. That means that after running the container you may find that you are no longer the owner of the backup directory.

No easy remedy for this as far as I know except to sudo chown -R the dir again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment