@gokhanozdemir
Last active January 14, 2025 17:27
This SQL script removes orphaned activity logs and revisions left behind by deleted collections, helping keep a Directus instance efficient, organized, and within storage limits.
-- Directus Database Cleanup Script
-- Purpose: Remove orphaned data and optimize database performance in Directus CMS
-- Tested on: Directus 11.2.2, Postgres 16.0
-- WARNING: Always backup your database before running cleanup scripts!
-- DISCLAIMER: Use at your own risk!
/**
* Background:
* During CMS development, especially with Directus, database structure changes frequently.
* This can lead to:
* - Accumulated orphaned data from deleted collections
* - Unnecessary database bloat from logs and revisions
* - Decreased query performance
* - Storage space issues, especially on free-tier platforms
*
* Prerequisites:
* 1. Create a complete backup of your database
* 2. Test on a non-production environment first
* 3. Consider disabling activity tracking on collections before running
* 4. Ensure Directus is not running during cleanup
* 5. Delete any unused collections before running (tables that still exist in the database but are no longer visible in Directus)
* Note: re-enabling activity tracking afterwards may cause errors (not tested)
*
* Usage Notes:
* - Script removes activity logs and revisions for non-existent collections
* - Performs database optimization after cleanup
* - Particularly useful when approaching storage limits (optional pre-flight checks follow this header)
* Motivation:
* I had ~1 GB of SQL data, consisting primarily of logs. After running this script and the one in the comment below, my database size dropped to ~300 MB. Your results may vary.
* This effort was inspired by: https://gist.github.com/madc/cee04370227a12127361a615d2ae8678
* Thank you @madc for sharing your SQL and inspiring me to research this topic further.
*/
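-- Optional pre-flight check (a sketch, assuming the default 'public' schema):
-- preview what Step 1 would delete (the same pattern applies to directus_revisions
-- for Step 2), and optionally keep in-database copies of the two system tables as a
-- quick fallback. These copies are not a substitute for a full backup (prerequisite 1).
SELECT collection, COUNT(*) AS orphaned_rows
FROM public.directus_activity
WHERE collection NOT IN (
    SELECT tablename
    FROM pg_tables
    WHERE schemaname = 'public'
)
GROUP BY collection
ORDER BY orphaned_rows DESC;
-- Uncomment to keep throwaway copies that can be restored from if something looks wrong:
-- CREATE TABLE public.directus_activity_backup  AS TABLE public.directus_activity;
-- CREATE TABLE public.directus_revisions_backup AS TABLE public.directus_revisions;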
-- Step 1: Remove orphaned activity logs
-- Deletes entries from directus_activity where referenced collections no longer exist
DELETE FROM public.directus_activity
WHERE collection NOT IN (
    SELECT tablename
    FROM pg_tables
    WHERE schemaname = 'public'
);
-- Step 2: Clean revision history
-- Removes revision records for deleted collections
DELETE FROM public.directus_revisions
WHERE collection NOT IN (
    SELECT tablename
    FROM pg_tables
    WHERE schemaname = 'public'
);
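-- Optional extra pass (a sketch, assuming the standard directus_revisions.activity
-- column): remove revisions whose parent activity row was deleted in Step 1. If that
-- foreign key is defined with ON DELETE CASCADE, these rows are already gone and this
-- statement deletes nothing.
DELETE FROM public.directus_revisions r
WHERE r.activity IS NOT NULL
AND NOT EXISTS (
    SELECT 1
    FROM public.directus_activity a
    WHERE a.id = r.activity
);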
-- Step 3: Optimize database
-- Marks dead-row space as reusable and updates query planner statistics
VACUUM ANALYZE;
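-- Optional: plain VACUUM makes dead-row space reusable inside the existing data files
-- but does not shrink them on disk. To actually return space to the operating system
-- (e.g. when approaching a hosting storage limit), VACUUM FULL rewrites the tables;
-- it takes an exclusive lock and needs free disk for the copy, so run it only while
-- Directus is stopped.
-- VACUUM FULL ANALYZE public.directus_activity;
-- VACUUM FULL ANALYZE public.directus_revisions;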
/**
* Post-Execution:
* 1. Verify system functionality
* 2. Monitor database size and performance (see the size-check queries below)
* 3. Consider scheduling regular cleanups via Directus Flows
*/
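-- Size check (a sketch for post-execution step 2): run before and after the cleanup
-- to see how much space was actually reclaimed.
SELECT pg_size_pretty(pg_database_size(current_database()))                AS database_size,
       pg_size_pretty(pg_total_relation_size('public.directus_activity'))  AS activity_size,
       pg_size_pretty(pg_total_relation_size('public.directus_revisions')) AS revisions_size;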
@gokhanozdemir (Author)

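-- This block scans every base table in the 'public' schema that has an "id" column,
-- looks for an invalid (broken) index defined on that column, drops and recreates it,
-- and reindexes the table when it also has a primary key.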
DO $$ 
DECLARE
    table_record RECORD;
    broken_index_name TEXT;
BEGIN
    -- Loop through all tables in the current schema
    FOR table_record IN 
        SELECT table_schema, table_name 
        FROM information_schema.tables 
        WHERE table_schema = 'public' 
        AND table_type = 'BASE TABLE'
    LOOP
        -- Check if 'id' column exists
        IF EXISTS (
            SELECT 1 
            FROM information_schema.columns 
            WHERE table_schema = table_record.table_schema 
            AND table_name = table_record.table_name 
            AND column_name = 'id'
        ) THEN
            -- Find any existing broken (invalid) index on the id column
            SELECT pi.indexname INTO broken_index_name
            FROM pg_index i
            JOIN pg_class c ON c.oid = i.indrelid
            JOIN pg_class ic ON ic.oid = i.indexrelid
            JOIN pg_indexes pi ON pi.indexname = ic.relname
            WHERE c.relname = table_record.table_name
            AND pi.schemaname = table_record.table_schema
            -- Match indexes defined on (id), not every definition merely containing "id"
            AND pi.indexdef LIKE '%(id)%'
            AND i.indisvalid = false;

            IF broken_index_name IS NOT NULL THEN
                -- Drop the broken index
                EXECUTE format('DROP INDEX IF EXISTS %I.%I CASCADE',
                    table_record.table_schema,
                    broken_index_name
                );
                
                RAISE NOTICE 'Dropped broken index % on %.%',
                    broken_index_name,
                    table_record.table_schema,
                    table_record.table_name;

                -- Recreate the index; IF NOT EXISTS avoids a name clash when a valid
                -- index with this name already exists
                EXECUTE format('
                    CREATE INDEX IF NOT EXISTS %I ON %I.%I (id)',
                    table_record.table_name || '_id_idx',
                    table_record.table_schema,
                    table_record.table_name
                );

                RAISE NOTICE 'Created new index on id column for %.%',
                    table_record.table_schema,
                    table_record.table_name;

                -- If this is a primary key, also reindex it
                IF EXISTS (
                    SELECT 1 
                    FROM information_schema.table_constraints tc
                    WHERE tc.constraint_type = 'PRIMARY KEY'
                    AND tc.table_schema = table_record.table_schema
                    AND tc.table_name = table_record.table_name
                ) THEN
                    EXECUTE format('REINDEX TABLE %I.%I',
                        table_record.table_schema,
                        table_record.table_name
                    );
                    
                    RAISE NOTICE 'Reindexed primary key for %.%',
                        table_record.table_schema,
                        table_record.table_name;
                END IF;
            END IF;
        END IF;
    END LOOP;
END $$;
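-- Verification sketch: after the block above has run, this should return no rows if
-- every invalid index in the public schema was repaired.
SELECT n.nspname AS schema_name,
       c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
AND NOT i.indisvalid;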
