Skip to content

Instantly share code, notes, and snippets.

@revilon1991
Created March 11, 2025 10:39
Show Gist options
  • Save revilon1991/bbde937a9357a83ead514314b25ab72d to your computer and use it in GitHub Desktop.
Save revilon1991/bbde937a9357a83ead514314b25ab72d to your computer and use it in GitHub Desktop.

What is a Page Split and Its Impact on Database Fragmentation

A Page Split occurs in B-trees (the underlying structure of most database indexes) when a new record is inserted into a full data page. To make room, the database engine splits the page into two, redistributes the records, and updates the index pointers.

Why Does a Page Split Cause Fragmentation?

  1. Breaks Physical Order – The new page may be allocated in a different location, disrupting the sequential structure of the data.
  2. Extra Redistribution Costs – Each split requires modifying adjacent pages and potentially updating parent nodes, increasing system load.
  3. Decreased Read Efficiency – Fragmentation forces the database to perform additional I/O operations when reading sequential data.

Impact on Indexes and Performance:

  • High index fragmentation → More disk reads, reducing query performance.
  • Uneven data distribution → More frequent rebuild/reorganize operations.
  • Increased storage usage → Queries consume more memory and cache.

Page Splits are particularly harmful in unordered inserts (unordered PK), as they frequently cause page splits and heavy fragmentation. In ordered inserts (ordered PK), the likelihood of splits is lower because new records are typically appended at the end.

Below are examples of fragmentation under different primary key strategies.

Ordered PK

RDBMS Fragmentation

page capacity 3, inserts - a, b, c, d, e, f, g, h, i, j

page 1 page 2 page 3 page 4 page 5 Event Fragmentation
[a] insert low
[a,b] insert low
[a,b,c] insert low
[a,b] [c,d] insert, page split high
[a,b] [c,d,e] insert high
[a,b] [c,d] [e,f] insert, page split high
[a,b] [c,d] [e,f,g] insert high
[a,b] [c,d] [e,f] [g,h] insert, page split high
[a,b] [c,d] [e,f] [g,h,i] insert high
[a,b] [c,d] [e,f] [g,h] [i,j] insert, page split high

Unordered PK

RDBMS/InnoDB Fragmentation

page capacity 3, inserts - h, g, j, d, b, a, e, c, i, f

page 1 page 2 page 3 page 4 Event Fragmentation
[h] insert low
[g,h] insert low
[g,h,j] insert low
[d,g] [h,j] insert, page split high
[b,d,g] [h,j] insert high
[a,b] [d,g] [h,j] insert, page split high
[a,b,d] [e,g] [h,j] insert high
[a,b] [c,d] [e,g] [h,j] insert, page split high
[a,b] [c,d] [e,g] [h,i,j] insert high
[a,b] [c,d] [e,f,g] [h,i,j] insert high

Ordered PK

InnoDB Fragmentation

page capacity 3, inserts - a, b, c, d, e, f, g, h, i, j

page 1 page 2 page 3 page 4 Event Fragmentation
[a] insert low
[a,b] insert low
[a,b,c] insert low
[a,b,c] [d] insert, new page high
[a,b,c] [d,e] insert high
[a,b,c] [d,e,f] insert, new page low
[a,b,c] [d,e,f] [g] insert low
[a,b,c] [d,e,f] [g,h] insert low
[a,b,c] [d,e,f] [g,h,i] insert low
[a,b,c] [d,e,f] [g,h,i] [j] insert, new page low

References

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment