
Distribution of the day: the log-normal

February 2013

Fully defined by mean and variance, asymmetrically bounded - the charms of the log-normal distribution are quite clear. One of the go-to distributions for curve fitting, it's regularly used to model common phenomena. Google 'log-normal electricity', for example, and the first hit will explain that 'the log normal distribution was found to fit the historical [electricity market price] best'.
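To make 'fully defined by mean and variance' concrete, here is a minimal sketch that converts the mean and variance of a log-normal variable into the μ and σ of the underlying normal. The function name lognormal_params is my own invention for illustration; the formulas are the standard moment relations.

```python
import math


def lognormal_params(mean, variance):
    """Convert the mean and variance of a log-normal variable X into
    (mu, sigma), the mean and standard deviation of ln X.

    Hypothetical helper for illustration, based on the standard relations
        mean     = exp(mu + sigma^2 / 2)
        variance = (exp(sigma^2) - 1) * exp(2 * mu + sigma^2)
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("a log-normal needs a positive mean and variance")
    sigma_sq = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


# Moments of the log-normal with mu = 0 and sigma = 1.
mu, sigma = lognormal_params(math.exp(0.5), (math.e - 1.0) * math.e)
print(mu, sigma)
```

Feeding in the moments of the standard log-normal recovers μ ≈ 0 and σ ≈ 1, up to floating-point error.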

However, until now I'd somehow missed the obvious: the distribution's prevalence is easily explained by extending the Central Limit Theorem to the log domain. Whereas normals arise from the sum of many independent and identically distributed random variables [1], log-normals arise from the product of many such variables, provided they are positive [2].
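For the record, the argument can be sketched in a few lines. The symbols m and s are my own notation for the mean and standard deviation of ln X_i, both assumed finite, and the X_i are assumed positive so that the logarithm exists.

```latex
\begin{align*}
  Y_n &= \prod_{i=1}^{n} X_i,
      \qquad X_i > 0 \text{ i.i.d.},\quad
      \mathbb{E}[\ln X_i] = m,\quad \operatorname{Var}[\ln X_i] = s^2 < \infty \\
  \ln Y_n &= \sum_{i=1}^{n} \ln X_i \\
  \frac{\ln Y_n - n m}{s \sqrt{n}} &\xrightarrow{\;d\;} \mathcal{N}(0, 1)
      \qquad \text{(Central Limit Theorem applied to the } \ln X_i\text{)}
\end{align*}
```

So for large n the product Y_n is approximately log-normal, with parameters μ = nm and σ² = ns².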

μ = 0. Guess σ².

It's easy to imagine all kinds of processes that are determined by the product of random variates, not just electricity prices. Financial quants love this distribution because of their infatuation with compounded returns. Apparently the latency periods of diseases also follow a log-normal (obvious), as do the number of crystals in ice cream (plausible) and the length of sentences in the works of George Bernard Shaw (incredible!).

Below is a brilliant physical demonstration of the process that leads to the log-normal distribution, contrasted with the normal [3]. On the left, the positions of the bearings are determined by a sum of random displacements and collectively form a normal distribution. Conversely, on the right the positions are determined by a product of random events and collectively form a log-normal.
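The same contrast can be imitated numerically. This is only a rough sketch, not a model of the actual apparatus: I assume each pin either adds or subtracts a fixed step (left board) or multiplies or divides by a fixed factor (right board). The step of 1, the factor of 1.1, the 100 rows and the names drop_ball and skewness are all arbitrary choices of mine. Skewness is used as a crude check of symmetry: it should be near zero for the sums and for the logs of the products, but clearly positive for the products themselves.

```python
import math
import random
import statistics


def drop_ball(rows, rng, multiplicative):
    """Return the final position of one ball after `rows` random deflections.

    Assumed rules (illustrative only): the additive board moves the ball
    by +1 or -1 per row; the multiplicative board scales the position by
    1.1 or 1/1.1 per row. Both outcomes are equally likely.
    """
    if multiplicative:
        x = 1.0
        for _ in range(rows):
            x = x * 1.1 if rng.random() < 0.5 else x / 1.1
    else:
        x = 0.0
        for _ in range(rows):
            x += 1.0 if rng.random() < 0.5 else -1.0
    return x


def skewness(values):
    """Population skewness: the mean cubed z-score of the sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(2013)  # fixed seed so the run is repeatable
sums = [drop_ball(100, rng, multiplicative=False) for _ in range(10000)]
products = [drop_ball(100, rng, multiplicative=True) for _ in range(10000)]

print("skewness of sums:            ", skewness(sums))
print("skewness of products:        ", skewness(products))
print("skewness of log of products: ", skewness([math.log(p) for p in products]))
```

Taking the log of the right-hand positions turns the product back into a sum, which is why the third figure behaves like the first.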

P.S. If you've forgotten the proof of the Central Limit Theorem, a nice 'elementary, but slightly cumbersome proof' is provided by Wolfram.

  1. http://en.wikipedia.org/wiki/Central_limit_theorem
  2. Evidently log-normals will also arise in variables that depend exponentially on a normally distributed variate (just covering my back...) 
  3. Log-normal Distributions across the Sciences: Keys and Clues