Introduce Quasicryth, a text compressor using Fibonacci quasicrystal tilings for phrase-level compression. Prove that aperiodic structures never structurally collapse at depth, enabling compression at arbitrary hierarchy levels. Bridge quasicrystal mathematics with practical compression, achieving 22.59% ratio on enwik9 with unbounded scaling advantages.
Traditional periodic tilings used in hierarchical compression have a fundamental limitation: they collapse structurally within O(log p) levels, where p is the period. Beyond that depth, one tile type vanishes, making further compression impossible.
Example: Binary hierarchies collapse at depth log₂(W) (word vocabulary size). The codebook loses structural variety, forcing fallback to escapes.
Fibonacci quasicrystals offer a mathematically-proven alternative: structures that never collapse at any depth, enabling compression at arbitrarily deep hierarchy levels.
Main Theorem (Aperiodic Hierarchy Advantage): The Fibonacci tiling uniquely satisfies five properties simultaneously:
Non-Collapse at All Depths: Both tile types persist indefinitely
Scale-Invariant Coverage: Potential word coverage remains constant across hierarchy levels
Maximum Codebook Efficiency: Exactly Fm+1 distinct patterns at level m
Bounded Overhead: Flag entropy capped at 1/φ ≈ 0.618 bits/word
Strict Entropy Advantage: Lower per-word coding entropy than periodic alternatives for long-range sources
Mathematical Framework:
Substitution Matrix Analysis:
σ: L → LS, S → L (Fibonacci substitution)
Eigenvalue = φ (golden ratio)
Sturmian Sequences:
Minimal factor complexity p(n) = n+1
Maximum phrase reuse potential
Weyl Equidistribution:
Irrational tiling slopes guarantee uniform density
at all scales
Hierarchy Configuration:
Encoding Strategy:
# Quasicryth compression pipeline:
# Level 1 (2-word phrases): Most frequent, low cost
# Level 2 (3-word phrases): Medium frequency phrases
# ...
# Level 10 (144-word phrases): Rare deep patterns
#
# Each level maintains Fibonacci-structured tiling
# ensuring that phrase vocabularies never collapse
#
# Arithmetic coding casts: each level encoded
# with context-aware probability models trained
# on compressed training data
Compression on enwik9 (1 GB English Wikipedia):
Scaling Property: As dataset size increases:
Bridge Between Mathematics and Practice:
Theoretical Grounding: Proves that structural properties from quasicrystal geometry (physics/materials science) solve fundamental CS problems
Unbounded Improvement: Unlike fixed-depth hierarchies, Fibonacci structures keep improving with larger datasets—architectural advantage grows
Generalization: Same principle applies to:
Adaptation to Other Domains: Can Fibonacci tilings improve hierarchical clustering, multi-level hashing, or tree-based indexing?
Hybrid Approaches: Combine Fibonacci hierarchies with modern neural compression or learned arithmetic coding
Scaling Experiments: Test on larger corpora (10+ GB) to validate unbounded scaling advantage
Physical Quasicrystals Inspiration: Other aperiodic structures (Penrose tilings, Thue-Morse sequences) may have complementary properties