The Log-normal Distribution: Galton and McAlister's 1879 Law of the Geometric Mean, and Why the Long-Term Investor Must Compound Rather Than Average

Afternoon Edition · Mental Models · No. 8

The model: the shape of things that multiply

Francis Galton spent the 1870s measuring everything that would hold still long enough to be measured — the heights of men, the sweetness of peas, the keenness of eyesight, the size of fortunes. The bell curve was his religion, and most of what he measured obeyed it. But a stubborn minority did not. Incomes, the loudness a nerve could register, the weights of organs — these came out lopsided, bunched up against a low wall on the left and trailing away into a long tail on the right, with the average sitting well above the most common value. Galton suspected the asymmetry was not a flaw in the data but a different law, and being a gentleman of means rather than a mathematician, he posed the problem to a Cambridge friend, the physician and mathematician Donald McAlister. McAlister’s answer appeared in 1879 in the Proceedings of the Royal Society, in a short memoir called “The Law of the Geometric Mean” (vol. 29, pp. 367–376), printed immediately after Galton’s own note introducing it (pp. 365–367). The answer was as simple as it was consequential: for these quantities, it is not the value that is normally distributed but its logarithm. Take the logarithm of each income, and the bell curve reappears. The quantity itself is what we now call log-normal — and for a century it has occasionally been called the Galton, or McAlister, distribution in their honour.

The definition is exactly that compact: a positive quantity is log-normally distributed when its logarithm follows the normal distribution. What makes the model one of the load-bearing beams of clear thinking is the reason such quantities are common. The normal distribution — the bell curve examined in an earlier letter in this series — is what emerges when a great many small independent influences add together; the central limit theorem guarantees the sum converges to that symmetric shape. The log-normal is its multiplicative twin. When influences multiply rather than add — when each year’s wealth is last year’s wealth times a growth factor, when a city’s population is last decade’s times a factor, when a price is yesterday’s price times a return — then the logarithm of the outcome is a sum of the logarithms of the factors, and that sum obeys the same central limit theorem. Products of independent shocks converge to the log-normal exactly as sums converge to the normal. The French economist Robert Gibrat made the point the organising principle of his 1931 treatise Les Inégalités Économiques, where he called it the law of proportionate effect: if a thing grows by random percentages that do not depend on its current size, its eventual distribution is log-normal. The definitive mathematical treatment followed in 1957, in J. Aitchison and J. A. C. Brown’s Cambridge monograph The Lognormal Distribution, with Special Reference to its Uses in Economics, still the standard reference.
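The multiplicative version of the central limit theorem can be checked with a small simulation. The sketch below is illustrative only: the shock distribution (a growth factor drawn uniformly between 0.8 and 1.25), the number of steps, the seed and the function names are all assumptions made here, not anything taken from Gibrat or from Aitchison and Brown.

```python
import math
import random
import statistics

def simulate_products(n_paths=10000, n_steps=50, seed=7):
    """Multiply n_steps independent growth factors together for each path."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_paths):
        value = 1.0
        for _ in range(n_steps):
            # Hypothetical shock: a growth factor uniform between 0.8 and 1.25
            value *= rng.uniform(0.8, 1.25)
        outcomes.append(value)
    return outcomes

def skewness(xs):
    """Population skewness: the average cubed standardised deviation."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

outcomes = simulate_products()
logs = [math.log(x) for x in outcomes]

print("mean of outcomes:  ", statistics.fmean(outcomes))
print("median of outcomes:", statistics.median(outcomes))
print("skew of outcomes:  ", skewness(outcomes))
print("skew of logarithms:", skewness(logs))
```

Under these assumptions the outcomes should come out strongly right-skewed with the mean above the median, while their logarithms should show almost no skew, which is the log-normal pattern the paragraph describes.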

The one-sentence form of the model worth carrying into every valuation: when effects multiply instead of add, outcomes pile up below a modest median and trail away into a long right tail, so that the average is dragged far above the typical — and anything that compounds is governed by this shape, not by the bell curve.

The mechanism: why the average tells the truth about no one

The signature of the log-normal is an asymmetry with three landmarks that the bell curve collapses into one. In a normal distribution the mode (the most common value), the median (the middle value) and the mean (the arithmetic average) all sit on top of each other at the centre. In a log-normal distribution they separate and line up in a fixed order: the mode sits lowest, the median above it, and the mean highest of all, pulled up by the long right tail where a few enormous outcomes live. The arithmetic is precise. If the logarithm of a quantity has average μ and variance σ², the median of the quantity is e raised to μ, while its mean is e raised to μ plus half of σ². The mean exceeds the median by a factor that depends only on the variance — on the volatility. The more dispersed the multiplicative shocks, the further the average floats above the experience of the typical member, and the more the average becomes a figure that describes almost nobody.
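The three landmarks can be computed directly from μ and σ. In the sketch below the median and mean follow the formulas just given; the mode, e raised to μ minus σ², is the standard textbook result for the log-normal and is added here for completeness. The function name and the sample values of σ are illustrative choices.

```python
import math

def lognormal_landmarks(mu, sigma):
    """Mode, median and mean of a quantity whose logarithm is N(mu, sigma^2)."""
    mode = math.exp(mu - sigma ** 2)        # standard result, not stated in the text
    median = math.exp(mu)
    mean = math.exp(mu + 0.5 * sigma ** 2)
    return mode, median, mean

for sigma in (0.2, 0.5, 1.0):
    mode, median, mean = lognormal_landmarks(mu=0.0, sigma=sigma)
    print(f"sigma={sigma}: mode={mode:.3f} median={median:.3f} "
          f"mean={mean:.3f} mean/median={mean / median:.3f}")
```

The ratio of mean to median is e raised to half of σ², so it depends on the dispersion alone and not on μ.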

For investors this is not an ornament of distribution theory; it is the central fact of compounding, and it runs in the opposite direction to intuition. Because wealth grows multiplicatively, the rate an owner actually compounds (the geometric mean of the period returns) is never higher than the arithmetic average of those returns, and it is strictly lower whenever the returns vary at all, by an amount governed by volatility. To a good approximation the compound growth rate equals the arithmetic mean minus half the variance. Mark Spitznagel, in Safe Haven: Investing for Financial Storms (2021), names the gap a “volatility tax … extracted by the multiplicative dynamics of compounding,” and the metaphor is exact: the more a return series swings, the larger the toll levied on the wealth that actually accumulates. The cleanest demonstration needs no algebra. Begin with 100, gain 50 per cent, then lose 50 per cent: one and a half times one half is three quarters, and the account holds 75. The arithmetic average of plus fifty and minus fifty is zero; the lived result is a loss of a quarter. The same asymmetry explains why a 50 per cent loss requires not a 50 but a 100 per cent gain to repair: multiplication does not forgive the way addition does. A bell-curve habit of mind, reasoning about returns as if they were symmetric wobbles around an average, misses all of this. The distribution of where wealth actually ends up is right-skewed even when each single period looks tame, because the periods compound rather than accumulate.
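The arithmetic of the fifty-up, fifty-down path, and of the gain needed to repair a loss, can be written out in a few lines. This is a minimal sketch; the helper names are invented here for illustration.

```python
import math

def compound(start, returns):
    """Apply each period return multiplicatively to the starting value."""
    value = start
    for r in returns:
        value *= 1.0 + r
    return value

def geometric_rate(returns):
    """Per-period compound rate implied by a sequence of returns."""
    growth = compound(1.0, returns)
    return growth ** (1.0 / len(returns)) - 1.0

def gain_to_recover(loss):
    """Fractional gain needed to return to the starting value after a loss."""
    return 1.0 / (1.0 - loss) - 1.0

path = [0.50, -0.50]
print(compound(100.0, path))      # 75.0
print(sum(path) / len(path))      # 0.0, the arithmetic average
print(geometric_rate(path))       # negative: about -13.4 per cent per period
print(gain_to_recover(0.50))      # 1.0, that is a 100 per cent gain
```

The same `gain_to_recover` helper shows how quickly the repair bill grows: the deeper the loss, the more than proportionate the gain required to undo it.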

Figure 1. The volatility tax. A path that gains 50 per cent and then loses 50 per cent has an arithmetic average return of zero and a compounded return of minus 25 per cent. The gap, roughly half the variance, is the toll the multiplicative process levies on every volatile series. After Spitznagel (2021).

The empirical record: incomes, cities and share prices

The reach of the model is the best argument for keeping it close. Gibrat’s law of proportionate effect was not a conjecture in search of data; it was a description of data already in hand. Across the twentieth century the body of the income distribution, the distribution of firm sizes, and the distribution of city populations have all repeatedly been found to be approximately log-normal, precisely because each grows by proportionate random increments — a point Aitchison and Brown documented at length for economic variables in 1957. The same multiplicative logic governs the quantity nearest to this letter’s concern. In 1959, in a paper called “Brownian Motion in the Stock Market” (Operations Research, vol. 7, no. 2, pp. 145–173), the astrophysicist M. F. M. Osborne observed that it is not the change in a share price but the change in its logarithm that behaves like a random walk, which makes the price itself log-normal — the process now taught as geometric Brownian motion. Fourteen years later that assumption became the foundation of modern finance: the option-pricing formula of Fischer Black, Myron Scholes and Robert Merton (1973) is built on the premise that prices are log-normally distributed, and on that premise an entire industry of risk transfer was erected.
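Osborne's observation can be stated compactly in the notation of the mechanism section. The symbols S_t (the price at time t) and S_0 (the starting price) are introduced here for illustration; μ and σ² are taken to be the mean and variance of the change in the logarithm of the price per unit of time. This is a sketch of the standard result under those assumptions, not a reproduction of Osborne's own derivation.

```latex
\ln S_t - \ln S_0 \sim \mathcal{N}\!\left(\mu t,\ \sigma^2 t\right)
\quad\Longrightarrow\quad
\operatorname{median}(S_t) = S_0\, e^{\mu t},
\qquad
\mathbb{E}[S_t] = S_0\, e^{\left(\mu + \sigma^2/2\right) t}
```

The gap between the expected price and the median price therefore widens with both volatility and horizon, which is the same mean-above-median ordering described earlier, now stretched through time.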

The premise has a consequence that investors meet whether or not they have heard of Gibrat. If individual share prices compound multiplicatively, then the distribution of long-run outcomes for single stocks must be sharply right-skewed: most cluster below the average, while a few in the long tail account for the bulk of the wealth. The data bear this out. J. P. Morgan’s Michael Cembalest, in the recurring study “The Agony and the Ecstasy: The Risks and Rewards of a Concentrated Stock Position” (first published 2014 and updated since), found that more than 40 per cent of all the companies that have been members of the Russell 3000 index since 1980 eventually suffered what he calls a catastrophic loss, a fall of 70 per cent or more from peak that was never recovered, and that around two thirds of all stocks underperformed the index itself over their lifetimes. Hendrik Bessembinder, in “Do Stocks Outperform Treasury Bills?” (Journal of Financial Economics, vol. 129, 2018, pp. 440–457), reached the same shape from the opposite side: a majority of US common stocks since 1926 returned less than one-month Treasury bills over their entire lives, while the best-performing 4 per cent of companies account for the entire net dollar wealth the market has created in excess of Treasury bills. Bessembinder named the cause directly: positive skewness “attributable to … the effects of compounding.” That the index rises handsomely while the typical stock in it disappoints is not a paradox to be explained away; it is the log-normal signature, observed in the wild.
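A toy simulation shows how compounding alone can produce this shape. It does not replicate either study: the number of stocks, the horizon, the yearly log-return mean and volatility, the seed and the equal-weight buy-and-hold "index" are all hypothetical inputs chosen here for illustration.

```python
import math
import random
import statistics

def simulate_lifetimes(n_stocks=5000, years=20, mu_log=0.02, sigma_log=0.40, seed=11):
    """Terminal value of 1 unit in each stock after compounding yearly log-normal shocks."""
    rng = random.Random(seed)
    terminal = []
    for _ in range(n_stocks):
        log_growth = sum(rng.gauss(mu_log, sigma_log) for _ in range(years))
        terminal.append(math.exp(log_growth))
    return terminal

terminal = simulate_lifetimes()

# Equal-weight buy-and-hold basket: its terminal value is the average stock's.
index_value = statistics.fmean(terminal)
below_index = sum(v < index_value for v in terminal) / len(terminal)

# Share of the total net gain contributed by the best 4 per cent of names.
ranked = sorted(terminal, reverse=True)
top = ranked[: len(ranked) * 4 // 100]
net_gain = sum(v - 1.0 for v in terminal)
top_share = sum(v - 1.0 for v in top) / net_gain

print(f"median stock: {statistics.median(terminal):.2f}  index: {index_value:.2f}")
print(f"fraction of stocks below the index: {below_index:.1%}")
print(f"share of net gain from the top 4 per cent: {top_share:.1%}")
```

Even though every simulated stock draws from the same distribution, most finish below the basket and a thin tail supplies a disproportionate share of the gain. The real-world figures are more extreme than this sketch, which is consistent with the observation later in the letter that actual tails are fatter than the log-normal allows.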

Figure 2. The right tail carries the market. The lifetime outcomes of individual stocks are log-normally skewed: most disappoint, a large minority are destroyed, and a thin tail of winners creates the aggregate gain. Sources: J. P. Morgan / Cembalest, “The Agony and the Ecstasy”; Bessembinder (2018).

Two episodes: when the model spoke, and when it broke

The first episode is the model’s coronation and the day its limits were exposed. Through the 1970s and early 1980s the Black–Scholes formula, resting on the log-normal-price assumption, spread through every trading floor; the proof that traders trusted it is that the implied volatilities they paid across different option strikes were nearly flat, exactly as a true log-normal world requires. Then came Monday, 19 October 1987. The two-month S&P 500 futures contract fell about 29 per cent in a single session. Under the log-normal model that underpinned the era’s risk management, a move of that size was a minus-twenty-seven-standard-deviation event, with a probability of roughly ten to the power of minus one hundred and sixty — a number so small, as Jens Jackwerth and Mark Rubinstein noted in “Recovering Probability Distributions from Option Prices” (Journal of Finance, vol. 51, 1996, pp. 1611–1632), that it should not be expected to occur once in many billions of lifetimes of the universe. It occurred anyway. Ever since, the implied-volatility curve for index options has worn a permanent downward skew: the market now charges far more for protection against large declines than any log-normal model would price, because it has learned that the real tail is fatter than the bell-curve-of-logarithms allows. Benoît Mandelbrot had warned of precisely this in 1963, observing that speculative prices move in wilder jumps than the Gaussian framework admits. The lesson is not that the model is worthless but that it is a first picture, not a complete one: log-normal describes the body of the distribution well and its extremes badly, and an investor who forgets the second half will be ambushed by the events that matter most.
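The order of magnitude of that standard-deviation figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes an annualised volatility of 20 per cent, 252 trading days in a year and zero drift; these inputs are assumptions made here for illustration, and the function names are invented.

```python
import math

def crash_z_score(drop, annual_vol, trading_days=252):
    """Size of a one-day log move, in daily standard deviations, with zero drift."""
    daily_vol = annual_vol / math.sqrt(trading_days)
    return math.log(1.0 - drop) / daily_vol

def normal_lower_tail(z):
    """P(Z <= z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

z = crash_z_score(drop=0.29, annual_vol=0.20)
p = normal_lower_tail(z)
print(f"z = {z:.1f}")
print(f"log10(probability) = {math.log10(p):.0f}")
```

Under these assumptions the move comes out at roughly minus 27 daily standard deviations, with a tail probability on the order of ten to the power of minus one hundred and sixty. The exact exponent shifts with the volatility and day-count chosen; the conclusion that the model treats the event as effectively impossible does not.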

The second episode is quieter and, for the long-term owner, more instructive. Between the end of 1999 and the end of 2009 the S&P 500 delivered an annualised total return of roughly minus 0.9 per cent, a cumulative loss of about 9 per cent across ten years including dividends. It was only the second negative-total-return decade in the index’s history, the first having been the 1930s. The period was not short of good years; several individual calendar years showed cheerful double-digit gains. But two deep drawdowns, the collapse of the technology bubble and the financial crisis, taxed the compounding base so heavily that the capital which survived them compounded to almost nothing. An investor who had read only the arithmetic of the up-years would have badly mistaken the experience; the geometric truth was a lost decade. This is volatility drag made flesh: not an exotic tail event but the ordinary working of the multiplicative process, grinding a respectable-looking average down to a compounded standstill.
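The gap between the average year and the compounded decade can be made concrete. The yearly figures below are approximate, rounded S&P 500 calendar-year total returns for 2000 to 2009 as commonly reported; they are illustrative inputs supplied here, not data from this letter, and should be checked against a primary source before being relied on.

```python
import math

# Approximate calendar-year total returns, 2000 to 2009 (rounded, illustrative).
annual_returns = [-0.091, -0.119, -0.221, 0.287, 0.109,
                  0.049, 0.158, 0.055, -0.370, 0.265]

growth = math.prod(1.0 + r for r in annual_returns)
arithmetic_mean = sum(annual_returns) / len(annual_returns)
geometric_rate = growth ** (1.0 / len(annual_returns)) - 1.0

print(f"arithmetic average of the years: {arithmetic_mean:+.2%}")
print(f"cumulative return over the decade: {growth - 1.0:+.2%}")
print(f"compound annual rate: {geometric_rate:+.2%}")
```

With these inputs the simple average of the ten years is positive while the compounded result is a loss, which is the lost decade in miniature: the sign of the answer depends on whether the years are averaged or multiplied.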

Application: three operating disciplines

01 · Compound, do not average — read the geometric mean

Whenever a return is summarised as an “average,” ask whether it is the arithmetic average, which describes a single typical year, or the compound rate, which describes the wealth an owner actually keeps — and treat the gap between them as a measure of how much volatility is quietly costing. A practical rule of thumb falls straight out of the mathematics: the rate you compound is approximately the advertised average minus half the variance of returns, so a strategy that posts a high mean by way of violent swings may compound far less than a duller one with the same mean. The discipline applies to every claimed track record, every back-test, every “we have averaged X per cent” pitch. The arithmetic mean of a volatile series is not a lie, but it answers a question no long-term owner is asking. Convert it, mentally, to the geometric rate before you let it impress you.
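The rule of thumb can be tested against an exact calculation. The two strategies below are hypothetical: each alternates between a good and a bad period around the same 10 per cent arithmetic mean, one with small swings and one with violent ones. The function names are illustrative.

```python
import math

def two_state_strategy(mean, swing):
    """Hypothetical strategy alternating between mean + swing and mean - swing."""
    return [mean + swing, mean - swing]

def exact_compound_rate(returns):
    """Per-period geometric rate of the return sequence."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0

def rule_of_thumb(returns):
    """Arithmetic mean minus half the population variance of the returns."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / len(returns)
    return mean - 0.5 * variance

for label, swing in (("dull", 0.05), ("violent", 0.40)):
    returns = two_state_strategy(mean=0.10, swing=swing)
    print(f"{label:8s} exact={exact_compound_rate(returns):.4f} "
          f"approx={rule_of_thumb(returns):.4f}")
```

Both strategies advertise the same average, yet the violent one compounds at a small fraction of the dull one's rate. The approximation is close for the dull series and looser for the violent one, so for large swings it is worth computing the exact geometric rate.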

02 · Expect the median below the mean — judge the ensemble, not the typical holding

Because the outcomes of individual investments are right-skewed, a portfolio’s result will usually be carried by a minority of its holdings while the median position disappoints — and this is the design working, not failing. Two consequences follow. First, hold enough names that you are statistically present for the thin tail of large winners; a book concentrated in a handful of positions is a bet that you can identify the tail in advance, which the historical base rate counsels against. Second, do not judge a sound process by the experience of its median holding, which the distribution guarantees will underwhelm; judge it by the whole ensemble across a full cycle. The investor who culls every position that has merely lagged is pruning the very distribution from which the occasional hundred-bagger must come.
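The cost of concentration can be put in rough numbers. The sketch below assumes names are picked at random and independently from a large universe in which 4 per cent turn out to be tail winners, a share borrowed loosely from the Bessembinder figure cited above. Real selection is neither random nor independent, so this is a base rate and not a forecast.

```python
def chance_of_holding_a_winner(n_names, winner_share=0.04):
    """Probability that n randomly chosen names include at least one tail winner."""
    return 1.0 - (1.0 - winner_share) ** n_names

for n in (5, 20, 50):
    print(f"{n:3d} names: {chance_of_holding_a_winner(n):.1%}")
```

On these assumptions a five-name book has less than a one-in-five chance of owning even one of the eventual winners, while a fifty-name book owns at least one nearly nine times in ten.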

03 · Protect the compounding base — the left tail is the enemy of geometric growth

Since a loss and a gain of equal percentage do not cancel, the single most valuable act in a multiplicative world is to avoid the large permanent loss that resets the base. This is why margin of safety, position limits and a refusal of the kind of leverage that can convert a drawdown into a wipeout are not timidity but arithmetic: each is a way of keeping the variance term (the σ², half of which is subtracted from your compound rate) under control, and of staying far from the zero that ends compounding altogether. Spitznagel’s formal result is worth holding onto here: a position that lowers your arithmetic average but cuts the left tail can raise the geometric mean, leaving more money in the account, because it reduces the volatility tax by more than it reduces the headline return. The goal is not to maximise the average year. It is to maximise the rate at which capital survives and multiplies.
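A toy example shows how giving up arithmetic return can raise the compound rate. It is not Spitznagel's own construction: the asset (equally likely to gain 50 per cent or lose 40 per cent), the zero-return cash holding and the function names are hypothetical choices made here, and the left tail is cut simply by holding less of the asset.

```python
import math

def scaled_returns(asset_returns, fraction):
    """Portfolio returns with `fraction` in the asset and the rest in zero-return cash."""
    return [fraction * r for r in asset_returns]

def arithmetic_mean(returns):
    return sum(returns) / len(returns)

def geometric_mean(returns):
    """Per-period compound rate when each outcome occurs equally often."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0

asset = [0.50, -0.40]  # hypothetical: equally likely +50 per cent and -40 per cent
for fraction in (1.0, 0.5, 0.25):
    r = scaled_returns(asset, fraction)
    print(f"fraction={fraction:.2f} arithmetic={arithmetic_mean(r):+.4f} "
          f"geometric={geometric_mean(r):+.4f}")
```

Fully invested, the asset has a positive average year and a negative compound rate. Halving the position halves the average and lifts the compound rate to about zero, and a quarter position turns it positive: the headline return falls at each step while the wealth that actually accumulates rises.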

Figure 3. Three disciplines that convert the log-normal distribution from a curiosity of statistics into daily practice: read the geometric mean, judge the ensemble, and protect the base. After Aitchison and Brown (1957), Buffett, and Spitznagel (2021).

How the long-term tradition has used it

The long-horizon equity tradition absorbed the mathematics of the log-normal long before it borrowed the vocabulary, and it shows up first in Warren Buffett’s almost monotonous insistence on not losing. His most-quoted maxim — “Rule No. 1: never lose money; Rule No. 2: never forget Rule No. 1,” collected in Janet Lowe’s Warren Buffett Speaks (1997) and across Lawrence Cunningham’s edition of The Essays of Warren Buffett — is usually read as folksy caution. It is in fact a statement about multiplicative dynamics: a permanent loss does not subtract from the compounding base, it divides it, and the division is unforgiving. The architecture of Berkshire Hathaway, built to compound per-share value across decades without the interruption of forced selling, is the same idea expressed as a corporate form.

Charlie Munger reduced it to an aphorism that has become the tradition’s shorthand — “the first rule of compounding: never interrupt it unnecessarily” — and the patience he preached throughout Poor Charlie’s Almanack (2005) is precisely the refusal to let avoidable losses or needless turnover levy the volatility tax. Mark Spitznagel made the point quantitative in Safe Haven (2021), demonstrating that a sliver of cost-effective protection which lowers the arithmetic mean can nonetheless raise the geometric mean of a portfolio over a long history, because cutting the left tail of the multiplicative process is worth more than the premium it costs. Howard Marks has written the same asymmetry from the asset-management side for four decades, arguing in his Oaktree memos and in The Most Important Thing (2011) that avoiding the losers matters more than picking the winners — that in a world where outcomes are skewed and the downside is permanent, the surest route to a good compound result is to make sure you are still in the game to enjoy the right tail. Four practitioners, one underlying equation: the rate at which money grows is the arithmetic of how rarely it is allowed to shrink.

Key takeaways

The log-normal is the bell curve’s multiplicative twin. Where independent shocks that add produce the normal distribution, shocks that multiply — wealth, prices, firm and city sizes — produce the log-normal: bunched low, with a long right tail. Discovered by Galton and McAlister (1879); formalised by Gibrat (1931) and Aitchison and Brown (1957).
The average describes no one. In a log-normal world the mean sits above the median by a factor that grows with volatility, so the arithmetic average of returns systematically overstates what an owner actually compounds.
Volatility is a tax on compounding. The compound rate is roughly the arithmetic mean minus half the variance; a 50 per cent gain followed by a 50 per cent loss leaves you down 25 per cent, and a 50 per cent loss needs a 100 per cent gain to repair (Spitznagel, 2021).
The right tail carries the market. More than 40 per cent of the companies ever in the Russell 3000 suffered an unrecovered loss of 70 per cent or more, two thirds underperformed the index, and roughly 4 per cent of firms produced all net wealth creation since 1926 (Cembalest; Bessembinder, 2018): the index rises while the typical stock disappoints.
The disciplines are arithmetic, not temperament. Read the geometric mean, not the advertised average; judge the ensemble, not the median holding; and above all protect the compounding base, because in a multiplicative world the permanent loss is the only truly expensive mistake.

— Manish Goel, FCA / NorthPath Advisory OÜ / Tallinn, Estonia

Important.
All content on this site and in this email is journalism and education for a general audience. Nothing here constitutes investment advice or a recommendation in respect of any specific financial instrument, nor an offer or solicitation to buy or sell any security. Readers should consult an authorised financial adviser regulated in their own jurisdiction before making any investment decision.
