The Packing

In 1611, Johannes Kepler took up a question he had discussed in correspondence with Thomas Harriot, who had first been asked it by Sir Walter Raleigh: what is the most efficient way to stack cannonballs? Kepler proposed that the arrangement used by every greengrocer (each ball nestled into the hollow formed by three below it, the layers alternating in a regular way) is optimal. Each sphere touches twelve neighbors. The arrangement fills π/√18, approximately 74.05 percent, of the available space. No denser packing, Kepler claimed, exists.
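The 74.05 percent figure can be checked with a few lines of arithmetic. The sketch below assumes the standard face-centered cubic description of the greengrocer's stack (a cubic cell holding four spheres, with spheres touching along each face diagonal); the function name is hypothetical and chosen only for illustration.

```python
import math

def fcc_density(radius=1.0):
    """Fraction of space filled by the face-centered cubic packing."""
    # Spheres touch along a face diagonal, so diagonal = 4r and edge = 4r / sqrt(2).
    cell_edge = 2 * math.sqrt(2) * radius
    # 8 corner spheres shared by 8 cells + 6 face spheres shared by 2 cells.
    spheres_per_cell = 8 * (1 / 8) + 6 * (1 / 2)
    sphere_volume = (4 / 3) * math.pi * radius ** 3
    return spheres_per_cell * sphere_volume / cell_edge ** 3

if __name__ == "__main__":
    print(f"{fcc_density():.4%}")
```

The result equals π/√18 regardless of the radius chosen, since the radius cancels out of the ratio.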

Kepler offered no proof. He published the conjecture as part of a pamphlet on snowflakes, Strena seu de Nive Sexangula, a New Year's gift for his patron Johannes Matthäus Wacker von Wackenfels, and moved on. The conjecture stood unproven for 387 years. Carl Friedrich Gauss proved in 1831 that Kepler's arrangement is the densest lattice packing, that is, the densest in which the sphere centers form a regular repeating grid. But the harder question remained: could some irregular, aperiodic arrangement beat it? In 1953, László Fejes Tóth showed that the problem could in principle be reduced to a finite number of cases, but the number was astronomical. Thomas Hales began his proof in 1992 and completed it in 1998, using interval arithmetic and exhaustive computer verification of over five thousand configurations. The referees, after four years of review, said they were 99 percent certain the proof was correct. Hales responded by launching the Flyspeck project in 2003 to verify the proof formally in the HOL Light and Isabelle proof assistants. Flyspeck was completed in 2014. The conjecture was settled.

The answer had been sitting on every fruit stand in the world for four centuries. The proof required massive computation and two decades. The gap between the answer and the reason for the answer is the entire subject.


In 1948, Claude Shannon published "A Mathematical Theory of Communication" and established that every channel has a capacity: a maximum rate at which information can be transmitted with arbitrarily low error. The capacity depends on the channel's bandwidth and its noise level. Shannon proved the limit exists. He did not show how to reach it.

The problem of reaching it is, at bottom, a packing problem. Each message is a point in a high-dimensional space. The channel adds noise, displacing the received point from the transmitted one by some random amount. To avoid errors, the receiver must be able to determine which message was sent despite the displacement. This means the messages must be spaced far enough apart that their noise spheres (the regions of probable displacement around each point) do not overlap. Packing more messages into the space means packing non-overlapping spheres into a high-dimensional volume. The channel capacity is the logarithm of how many such spheres fit, measured per dimension of the space.
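The sphere count can be made concrete with the standard heuristic for a channel with Gaussian noise. The symbols are assumptions introduced here for illustration, not notation from the essay: n is the number of dimensions (samples per message), P the average signal power, N the noise power, and M the number of distinguishable messages. It is a sketch of the counting argument, not a rigorous proof.

```latex
% Received points lie within a ball of radius sqrt(n(P+N));
% each noise sphere has radius about sqrt(nN).
% The ratio of the two volumes bounds the number of messages M.
M \;\lesssim\; \frac{\bigl(\sqrt{n(P+N)}\bigr)^{n}}{\bigl(\sqrt{nN}\bigr)^{n}}
  \;=\; \left(1 + \frac{P}{N}\right)^{n/2}
\qquad\Longrightarrow\qquad
\frac{1}{n}\log_2 M \;\lesssim\; \frac{1}{2}\log_2\!\left(1 + \frac{P}{N}\right)
```

The right-hand side is the familiar capacity in bits per dimension: it depends only on the ratio of signal to noise, which is to say on the size of the container relative to the size of the spheres.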

Richard Hamming saw this first, in 1950. Working at Bell Labs on a machine that stopped every time it detected an error, frustrated by weekends of lost computation, he designed a code that could correct single-bit errors by placing codewords at sufficient distance from each other in binary space. The Hamming distance between two codewords — the number of positions where they differ — is the coding-theory analogue of geometric distance. If every pair of codewords differs in at least three positions, a single error moves the received word closer to the intended codeword than to any other. The geometry corrects the error.
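A minimal sketch of the idea, using the classic (7,4) Hamming code: four data bits, three parity bits, and a minimum distance of three between codewords. The bit layout (parity bits at positions 1, 2, and 4) is the textbook convention, and the function names are hypothetical, chosen for illustration rather than taken from Hamming's paper.

```python
from itertools import combinations, product

def encode(data):
    """Encode 4 data bits as a 7-bit codeword, parity bits at positions 1, 2, 4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def correct(word):
    """Return the nearest codeword, assuming at most one bit was flipped."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error; otherwise 1-based index
    if position:
        w[position - 1] ^= 1
    return w

def min_distance():
    """Smallest Hamming distance between any two distinct codewords."""
    words = [encode(d) for d in product([0, 1], repeat=4)]
    return min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(words, 2))

def corrects_all_single_errors():
    """Flip each bit of each codeword in turn and check that it is repaired."""
    for data in product([0, 1], repeat=4):
        codeword = encode(data)
        for i in range(7):
            damaged = list(codeword)
            damaged[i] ^= 1
            if correct(damaged) != codeword:
                return False
    return True
```

Because every pair of the sixteen codewords differs in at least three positions, a word with one flipped bit is at distance one from its origin and at least two from everything else, which is why `correct` can always find its way back.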

The connection between Kepler's cannonballs and Hamming's codewords is not a metaphor. In 1967, John Leech discovered a lattice packing in 24 dimensions, the Leech lattice, that was shown in 2016 to be the densest packing in that dimension. Each sphere touches 196,560 neighbors. The lattice can be built directly from the extended binary Golay code, which corrects up to three errors in a 24-bit block and whose 23-bit parent is a perfect code. The densest sphere packing in 24-dimensional space grows out of one of the best error-correcting codes at that block length. The geometry and the information theory are the same mathematics viewed from different angles.


In 1955, André Martinet published Économie des changements phonétiques and proposed that the sound systems of human languages obey an economy principle. A language's phonemes — its smallest meaning-distinguishing sound units — are arranged to cover the available acoustic space with maximum distinctiveness for minimum articulatory effort.

Consider vowels. The acoustic properties of a vowel are determined primarily by two formant frequencies, F1 and F2, which correspond roughly to tongue height (inversely) and tongue advancement. These two parameters define a roughly triangular space. Languages with small vowel inventories place them at the corners. Arabic, with three phonemic vowel qualities, has /a/ (low, central), /i/ (high, front), and /u/ (high, back). Maximum separation. Languages with larger inventories fill in the interior, spacing vowels as evenly as the perceptual geometry allows. French, with twelve or more vowel phonemes, packs them tightly enough that adjacent vowels are distinguished by small acoustic differences, the kind of distinction that, in Arabic, would fall below the threshold of significance.

The parallel to sphere packing is exact. Each phoneme occupies a region of acoustic space. Articulatory variation and listener uncertainty create a noise sphere around each phoneme's target. For communication to succeed, the noise spheres must not overlap — listeners must be able to determine which phoneme was intended despite the variation. The language's phonemic inventory is a packing of non-overlapping spheres in acoustic space, constrained by the same geometry that constrains cannonballs on a shelf and codewords in a binary string.

Martinet's insight was that this packing is not static. When one phoneme shifts (through sound change, dialectal variation, or contact with another language), neighboring phonemes shift to maintain separation. The Great Vowel Shift in English, which transformed the pronunciation of every long vowel between roughly 1400 and 1700, proceeded as a chain reaction. The vowel in "bite", originally pronounced like the vowel in modern "beet", was already as high as a vowel can go, so it broke into the diphthong it is today. That movement vacated the top of the vowel space, and the vowel in "beet" rose to fill it. Below that, "bait" rose into the vacated territory. Each displacement triggered the next. The system reorganized as a system, not one sound at a time, because the constraint is on the distances between sounds, not on the positions of individual sounds.


In 1957, G. Evelyn Hutchinson formalized the ecological niche as a hypervolume. Each axis represents an environmental variable: temperature range, humidity, food particle size, nesting height, activity period. A species' fundamental niche is the region of this space within which it can survive and reproduce. Its realized niche, the portion it actually occupies, is smaller, constrained by competition with other species that claim overlapping regions.

The competitive exclusion principle, demonstrated experimentally by Georgii Gause in 1934 with Paramecium cultures, states that two species cannot coexist on the same limiting resource in the same space indefinitely. One will displace the other. Coexistence requires niche differentiation — displacement along at least one axis of the hypervolume.

This is a packing problem. Each species occupies a region of niche space. Competition creates effective exclusion zones — the ecological equivalent of noise spheres. The community assembles by packing species into the available niche hypervolume with sufficient separation that competitive exclusion does not collapse any pair. Robert MacArthur's work on warbler foraging zones in the 1950s provided the first detailed measurements: five species of Dendroica warbler coexisting in the same spruce trees by partitioning their foraging into different height zones, different branch positions, and different movement patterns. The separation was not in geography. It was in the behavioral geometry of resource use.

When a species goes extinct, the niche space it occupied does not remain empty for long. Ecological release — the expansion of remaining species into vacated niche regions — proceeds rapidly, analogous to the annealing of a crystal lattice after a vacancy is created. The packing reorganizes. When an invasive species arrives, it must either carve out space by displacing residents or exploit an unoccupied region that the existing community had not packed. Either way, the community restructures around the new geometry of exclusion, just as vowels restructure around a shifted neighbor.


The connection across these systems is not analogy. It is identity.

Kepler's cannonballs, Shannon's codewords, Martinet's phonemes, and Hutchinson's niches are all instances of the same geometric problem: given a space with a metric and a minimum separation requirement, how many objects can you fit? The space differs — Euclidean for cannonballs, Hamming for codes, acoustic for phonemes, ecological for niches. The metric differs. The minimum separation requirement differs — physical radius, error tolerance, perceptual distinctiveness, competitive exclusion. But the mathematics is identical. The maximum packing density is determined by the geometry of the space and the size of the exclusion zones. The system cannot choose its capacity. The capacity is a property of the container.
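The question "how many objects can you fit?" can be asked directly of a small space. The sketch below packs 7-bit words greedily under a minimum Hamming distance and compares the count with the sphere-packing (Hamming) bound. The greedy strategy and the function names are illustrative assumptions; greedy selection is not guaranteed to reach the bound in general, though it does in this case.

```python
from math import comb

def hamming_distance(a, b):
    """Number of bit positions in which the integers a and b differ."""
    return bin(a ^ b).count("1")

def greedy_packing(n_bits, min_dist):
    """Scan all n-bit words in order, keeping each one far enough from those kept."""
    chosen = []
    for candidate in range(2 ** n_bits):
        if all(hamming_distance(candidate, c) >= min_dist for c in chosen):
            chosen.append(candidate)
    return chosen

def sphere_bound(n_bits, min_dist):
    """Upper limit on the packing: total volume divided by one exclusion ball."""
    radius = (min_dist - 1) // 2
    ball_volume = sum(comb(n_bits, i) for i in range(radius + 1))
    return 2 ** n_bits // ball_volume

if __name__ == "__main__":
    print(len(greedy_packing(7, 3)), sphere_bound(7, 3))
```

For seven bits and a minimum distance of three, both numbers are 16: the exclusion balls of radius one (eight words each) tile all 128 words with nothing left over. The capacity comes from the space and the separation alone; nothing about the meaning of the codewords enters the calculation.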

This is why chain shifts occur in language, why adaptive radiation occurs in ecology, why lattice reorganization occurs in crystals, and why code design must be done globally rather than codeword by codeword. In a saturated packing, every element's position is constrained by every other element's position. Moving one requires moving others. The system's rigidity — its resistance to local change — is a direct measure of how fully it has packed the available space. A loosely packed system tolerates local perturbation. A densely packed one propagates it.

The answer is never a property of what is being packed. It is always a property of the space and the minimum distance that separates one thing from another. Kepler knew the answer. It took four centuries to understand why the answer was the answer.

