The Problem
CSS page-break rules are a blunt instrument. break-inside: avoid works for simple cases but fails catastrophically when you need to paginate complex, multi-section documents: rulebooks with variant-specific content blocks, tables that span multiple pages, headings that must stay grouped with their following content, and appendices that need distinct formatting.
Every existing approach I tried (wkhtmltopdf with CSS, Puppeteer with @page rules, even commercial PDF libraries) produced documents with orphaned headings, awkward page breaks mid-paragraph, and inconsistent spacing. The fundamental issue: they make pagination decisions without measuring actual rendered content.
The Approach
The engine works by measuring before deciding. It renders content into a hidden DOM container at exact print dimensions, measures every element's actual height (including margins, padding, and line-wrapping effects), then assembles pages by packing elements into fixed-height frames with explicit rules about what can and cannot be separated.
- Atomic grouping — headings are bound to their following content; a heading never appears alone at the bottom of a page
- Minimum fill enforcement — a page must be at least 30% full or its content gets pulled back to the previous page
- Section-aware composition — cover pages, content sections, appendices, and back-covers each have independent layout rules
- Variant multiplication — one content source produces 100+ distinct PDFs, each with variant-specific rules, piece definitions, and setup diagrams
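To make the measuring side of these rules concrete, here is a minimal TypeScript sketch of the arithmetic involved. It assumes the CSS convention of 96 px per inch when converting A4's 297 mm height, and it ignores margin collapsing between siblings. The helper names (mmToPx, frameHeightPx, outerHeightPx, isUnderfilled) and the BoxMetrics shape are hypothetical and chosen for illustration; they are not the engine's actual API.

```typescript
const MM_PER_INCH = 25.4;
const CSS_PX_PER_INCH = 96; // CSS defines 1in = 96px

// Convert a physical length in millimetres to CSS pixels.
function mmToPx(mm: number): number {
  return (mm / MM_PER_INCH) * CSS_PX_PER_INCH;
}

// Height of the usable frame on a page after subtracting top and bottom margins.
// The margin values are caller-supplied assumptions, not values from the engine.
function frameHeightPx(pageHeightMm: number, marginTopMm: number, marginBottomMm: number): number {
  return mmToPx(pageHeightMm - marginTopMm - marginBottomMm);
}

// Hypothetical measurement record for one rendered element.
interface BoxMetrics {
  borderBoxHeight: number; // content + padding + border, in px
  marginTop: number;       // px
  marginBottom: number;    // px
}

// Vertical space the element occupies, margins included (no margin collapsing).
function outerHeightPx(m: BoxMetrics): number {
  return m.borderBoxHeight + m.marginTop + m.marginBottom;
}

// Minimum fill check: true when the used height is below the threshold (30% by default).
function isUnderfilled(usedPx: number, frameHeight: number, minFill: number = 0.3): boolean {
  return usedPx / frameHeight < minFill;
}
```

For example, mmToPx(297) puts a full A4 page at roughly 1122.5 px tall, and a page holding 299 px of content in a 1000 px frame would be flagged as underfilled.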
How It Works
The pipeline runs in three stages:
- Render and measure: inject HTML into a Puppeteer page at A4 dimensions, walk the DOM tree, and record each element's rendered height, margins included
- Pack pages: iterate over the elements, adding each to the current page until the next one would overflow. Then apply the atomic-group rules: if a heading would be the last element on a page, push it to the next page so it stays with its content block
- Emit: render each assembled page to PDF via Puppeteer, then concatenate the pages with pdfunite
The result is deterministic: the same input always produces the same pagination. No manual page-break annotations are needed in the source content.
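The pack stage above can be sketched roughly as follows. This is a simplified TypeScript illustration, not the engine's real code: the Measured shape and the packPages function are hypothetical names, heights are assumed to be measured already, and an element taller than a page is simply placed on its own page rather than split.

```typescript
// Hypothetical record for one measured element.
interface Measured {
  id: string;
  isHeading: boolean;
  heightPx: number; // rendered height including margins
}

// Greedy packing: fill the current page until the next element would overflow.
// On overflow, any headings left at the bottom of the page move to the next page,
// so a heading always stays with the content that follows it.
function packPages(elements: Measured[], pageHeightPx: number): Measured[][] {
  const pages: Measured[][] = [];
  let current: Measured[] = [];
  let used = 0;

  for (const el of elements) {
    if (current.length > 0 && used + el.heightPx > pageHeightPx) {
      // Collect trailing headings so they are not orphaned on the closed page.
      const carried: Measured[] = [];
      while (current.length > 0) {
        const last = current[current.length - 1];
        if (last === undefined || !last.isHeading) break;
        current.pop();
        carried.unshift(last);
      }
      if (current.length > 0) pages.push(current);
      current = carried;
      used = carried.reduce((sum, c) => sum + c.heightPx, 0);
    }
    current.push(el);
    used += el.heightPx;
  }

  if (current.length > 0) pages.push(current);
  return pages;
}

// Usage: the heading "h" fits on page one, but its content "p2" does not,
// so "h" moves to page two together with "p2".
const demoPages = packPages(
  [
    { id: "p1", isHeading: false, heightPx: 600 },
    { id: "h", isHeading: true, heightPx: 100 },
    { id: "p2", isHeading: false, heightPx: 500 },
  ],
  1000,
);
```

Because the function is pure and iterates in document order, running it twice on the same measurements yields the same pages, which is what makes the pagination reproducible.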
Results
The engine now produces the complete Moddable Chess rulebook library: 70 variant rulebooks assembled from structured game data. Each rulebook includes cover art, rules sections, piece movement diagrams, setup instructions, and an appendix. Zero orphaned elements across 190+ pages of output.