Why do we even have that lock

Row Level Locks Interacting with Gap Locks

Here is a simple example meant to show how a row-level lock can conflict with a `gap lock` and cause unexpected locking behaviour.

Create the Base Schema

For this we will use a very basic schema: a simple table with a timestamp and a condition.

BEGIN;
CREATE TYPE ProcessStep AS ENUM('submitted','completed');

CREATE TABLE IF NOT EXISTS account_account(
       id SERIAL PRIMARY KEY,
       account_id UUID  NOT NULL,
       created_at TIMESTAMP NOT NULL DEFAULT NOW(),
       condition ProcessStep NOT NULL DEFAULT 'submitted'
);
CREATE INDEX account_account_account_id ON account_account (account_id);
COMMIT;
BEGIN
CREATE TYPE
CREATE TABLE
COMMIT

Insert Some Data

We will insert some user data and set the timestamps to be in the past to mimic a system that has had some signups in the past

INSERT INTO account_account(account_id,created_at)
SELECT
    gen_random_uuid(),
    now() - ((60*random())::integer * '1 min'::interval)
FROM generate_series(1,100000000);
INSERT 0 100000000
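
As an optional sanity check (not part of the original walkthrough), you can confirm the rows landed with timestamps spread over the last hour:

-- Count the rows and check the created_at range we just generated.
SELECT count(*) AS total,
       min(created_at) AS oldest,
       max(created_at) AS newest
FROM account_account;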

Get Some Sample Data

We need a few data points: one that has not expired yet, and two that have.

SELECT account_id AS current FROM account_account WHERE created_at > now() - '30 min'::interval LIMIT 1;
SELECT account_id AS historical FROM account_account WHERE created_at < now() - '50 min'::interval LIMIT 2;
current
b5e12434-4fa4-4f82-b238-7d254ec90d01
historical
78dae29a-5478-457c-abf2-ee21e744e8ad
0e4f88d1-0261-479b-8edd-70aa91a42b96

Start to process the data (select a historical record)

We will start to process the data in small batches, first selecting a single row for update inside a transaction, but not committing it just yet.

BEGIN;
SELECT * FROM account_account WHERE account_id='78dae29a-5478-457c-abf2-ee21e744e8ad' FOR UPDATE;
-- COMMIT;
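
As a hedged aside (not part of the original flow), if you want to confirm that the row lock is really being held, a second session can try to grab the same row with `NOWAIT`; it will error out immediately instead of waiting:

-- Run in a second session while the transaction above is still open.
-- NOWAIT makes Postgres raise "could not obtain lock on row" instead of blocking.
SELECT * FROM account_account
WHERE account_id = '78dae29a-5478-457c-abf2-ee21e744e8ad'
FOR UPDATE NOWAIT;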

Run a ‘healer’ job to collect all the outdated data

Now we are going to run a job that collects all accounts that haven’t been modified yet. It should no longer pick up the row locked above once the prior transaction commits, but it will need to block until that transaction is completed.

BEGIN;
SELECT * FROM account_account
WHERE created_at < now() - '50 min'::interval
FOR UPDATE
LIMIT 100;
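
To see which backend the healer is actually waiting on (an optional diagnostic; `pg_blocking_pids` assumes PostgreSQL 9.6 or newer), you can query `pg_stat_activity` from yet another session:

-- List blocked backends, the PIDs blocking them, and what they are waiting on.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;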

Select a different row in another job

This should block, but for some reason Postgres decides to starve the `SELECT *` above and still services this query.

BEGIN;
SELECT * FROM account_account WHERE account_id='0e4f88d1-0261-479b-8edd-70aa91a42b96' FOR UPDATE;

Clear out the first block

Commit the first transaction, and notice the healer transaction doesn’t actually unblock (because you are still holding the row lock from the other open transaction).
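
Concretely, clearing the first block just means running the commented-out commit back in the first session:

-- In the first session, which still holds the lock on the first historical row.
COMMIT;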

Select another row that should not be blocked by that first query

If this row would satisfy the healer query’s predicate it may block, but since it is too recent to match, it should not.

BEGIN;
SELECT * FROM account_account WHERE account_id='b5e12434-4fa4-4f82-b238-7d254ec90d01' FOR UPDATE;
-- COMMIT;

Why It Doesn’t Always Block

The lock Postgres needs in order to hold a `FOR UPDATE` (or the `UPDATE` that follows it) on a tuple is a bit non-intuitive. Postgres has a feature called fast path locking, which appears to be what is triggering this behaviour. When we need to take a lock on a tuple, we don’t immediately record it in `pg_locks`. A single RowExclusiveLock is held until there is contention for a given tuple; at that point the lock gets promoted into the `pg_locks` table. This can cause interesting behaviour.
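
One way to watch this (a rough sketch; exactly which rows show up depends on timing and your Postgres version) is to dump `pg_locks` from a spare session while the transactions above are still open. The `fastpath` column flags locks that never left the fast path, and `tuple` locks only appear once a waiter shows up:

-- Locks held or awaited by the other backends in this experiment.
SELECT locktype,
       relation::regclass AS relation,
       page,
       tuple,
       mode,
       granted,
       fastpath,
       pid
FROM pg_locks
WHERE pid <> pg_backend_pid()
ORDER BY pid, locktype;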

Cleanup

DROP TABLE IF EXISTS account_account;
DROP TYPE IF EXISTS ProcessStep;
DROP TABLE
DROP TYPE

