© Vassilis Hajivassiliou 1998-2021

Illustrating Central Limit Theorems

The Behaviour of the T-Ratio under I.i.d. Sampling

Select Four Sample Sizes:

Nobs1:

Nobs2:

Nobs3:

Nobs4:

Select One Distribution:

Normal

Exponential

Bernoulli

Cauchy

Pareto

Introductory Remarks About Distributions

The Normal Distribution
This is the classic "bell-shaped" curve, named after Gauss (hence, it is sometimes called the Gaussian distribution). It is symmetric about its mean and has points of inflexion at mean + standard_deviation and mean - standard_deviation.
This is the easiest case for the LLN: the sample mean of N observations from such a distribution is itself Normally distributed, with the same true mean and a variance N times smaller than that of a single observation. This case illustrates very clearly the concept of convergence in mean square.
In this case, the T-statistic (the sample mean, centred at the true mean and divided by its estimated standard error) is precisely what its name suggests: it is distributed according to Student's T distribution with N-1 degrees of freedom, where N is the number of observations in the experiment. This gives a trivial illustration of the CLT: since a T distribution approaches the standard Normal as the number of degrees of freedom grows, the distribution of the T-statistic approaches the standard Normal as N grows to infinity.
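The T-statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind this page: the function names, the seed, the 2000 replications and the 1.96 cutoff are all hypothetical choices. It compares how often |T| exceeds 1.96 for a small and a large sample size.

```python
import math
import random

def t_statistic(sample, true_mean):
    """(sample mean - true mean) / estimated standard error of the mean."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)  # N-1 divisor
    return (mean - true_mean) / math.sqrt(variance / n)

def simulate_normal_t(n_obs, n_reps=2000, mu=0.0, sigma=1.0, seed=0):
    """Draw n_reps Normal samples of size n_obs and return their T-statistics."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [
        t_statistic([rng.gauss(mu, sigma) for _ in range(n_obs)], mu)
        for _ in range(n_reps)
    ]

def tail_share(t_values, cutoff=1.96):
    """Fraction of T-statistics larger than cutoff in absolute value."""
    return sum(abs(t) > cutoff for t in t_values) / len(t_values)

share_small = tail_share(simulate_normal_t(n_obs=5))
share_large = tail_share(simulate_normal_t(n_obs=200))
print(f"N=5:   share of |T| > 1.96 = {share_small:.3f}")
print(f"N=200: share of |T| > 1.96 = {share_large:.3f}")
```

With N=5 the T-statistic has 4 degrees of freedom, so its tails should be visibly heavier than the standard Normal's 5% beyond 1.96; with N=200 the share should be close to 5%.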

The Bernoulli Distribution
This discrete distribution describes a random variable that takes two possible values, 1 (success) with probability p, and 0 (failure) with probability 1-p.
The LLN implies that the sample mean (the proportion of successes) converges to the true population proportion p, and this happens even though the underlying distribution we are drawing from is discrete.
According to the CLT, the T-statistic (the sample proportion, centred at p and divided by its estimated standard error) should converge in distribution, as the number of observations grows, to a standard Normal (bell-shaped) curve. This version of the CLT is known as the De Moivre-Laplace Theorem.
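As a rough check of the De Moivre-Laplace result, the following sketch estimates how often the T-statistic of a Bernoulli sample falls within plus or minus 1.96. The names `bernoulli_t` and `coverage`, the value p = 0.3, the seed and the use of p_hat(1 - p_hat)/N as the variance estimate are all assumptions made for the illustration.

```python
import math
import random

def bernoulli_t(n_obs, p, rng):
    """T-ratio of the sample proportion, or None when it is undefined."""
    successes = sum(rng.random() < p for _ in range(n_obs))
    p_hat = successes / n_obs
    if p_hat in (0.0, 1.0):
        # All draws identical: the estimated variance is zero.
        return None
    std_error = math.sqrt(p_hat * (1.0 - p_hat) / n_obs)
    return (p_hat - p) / std_error

def coverage(n_obs, p=0.3, n_reps=2000, cutoff=1.96, seed=1):
    """Share of defined T-ratios with |T| <= cutoff."""
    rng = random.Random(seed)
    ts = [bernoulli_t(n_obs, p, rng) for _ in range(n_reps)]
    ts = [t for t in ts if t is not None]  # drop degenerate samples
    return sum(abs(t) <= cutoff for t in ts) / len(ts)

cov_small = coverage(10)
cov_large = coverage(400)
print(f"N=10:  share of |T| <= 1.96 = {cov_small:.3f}")
print(f"N=400: share of |T| <= 1.96 = {cov_large:.3f}")
```

If the Normal approximation is good, the share should be near 95%. For small N the discreteness of the proportion keeps it noticeably below that.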

The Exponential Distribution
This is the non-negative distribution that is found to be a good model for things like the lifetime of light bulbs. It has a single parameter: its mean. Its density is highest at zero and declines monotonically (exponentially) as the value of the random variable grows to infinity.
The sample mean of observations drawn from such a distribution follows a (rescaled) Gamma distribution. Watch for the LLN taking place, whereby the sample mean converges to the population mean, despite the fact that the underlying distribution is skewed (to the right).
According to the CLT, the T-statistic (the sample mean, centred at the true mean and divided by its estimated standard error) should converge in distribution, as the number of observations grows, to a standard Normal (bell-shaped) curve, despite the skewness in the underlying distribution from which the observations are generated.
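One way to see the skewness wash out is to compare the two tails of the simulated T-statistic. The sketch below is a minimal illustration, not the page's own code: `tail_shares`, the seed, the mean of 1 and the replication count are hypothetical choices.

```python
import math
import random

def t_statistic(sample, true_mean):
    """(sample mean - true mean) / estimated standard error of the mean."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - true_mean) / math.sqrt(variance / n)

def tail_shares(n_obs, mean=1.0, n_reps=2000, cutoff=1.96, seed=2):
    """Return (share of T < -cutoff, share of T > cutoff)."""
    rng = random.Random(seed)
    lower = upper = 0
    for _ in range(n_reps):
        # expovariate takes the rate, i.e. 1 / mean
        sample = [rng.expovariate(1.0 / mean) for _ in range(n_obs)]
        t = t_statistic(sample, mean)
        lower += t < -cutoff
        upper += t > cutoff
    return lower / n_reps, upper / n_reps

low_small, high_small = tail_shares(10)
low_large, high_large = tail_shares(500)
print(f"N=10:  lower tail {low_small:.3f}, upper tail {high_small:.3f}")
print(f"N=500: lower tail {low_large:.3f}, upper tail {high_large:.3f}")
```

Under the standard Normal each tail would hold about 2.5%. For small N the lower tail should be much heavier than the upper one, and the gap should narrow as N grows.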

The Cauchy Distribution
This is the pathological distribution that, although it resembles the Normal curve in being symmetric and bell-shaped, does not have ANY finite moments, not even a mean. This is caused by its tails being "too fat". As a result, its two parameters are its median and its scale, rather than a mean and a standard deviation.
What this implies for the LLN experiments is that the sample average does not converge to the true location, because the standard LLNs require the mean of the underlying distribution to be finite. This is violated here and hence convergence will not take place. Indeed, a surprising fact is that the sample mean has exactly the same Cauchy distribution as the underlying observations, irrespective of the sample size! Watch out for this fact.
The conditions of the CLT fail as well, so the T-statistic (the sample mean divided by its estimated standard error) will not converge, as the number of observations grows, to a standard Normal (bell-shaped) curve. In fact, a result by Phillips and Hajivassiliou shows that the T-statistic in this case will have a bi-modal distribution.
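The claim that the sample mean of Cauchy draws is no more concentrated than a single draw can be checked with a small simulation. This is a sketch under assumptions: the draws use the inverse-CDF formula for a Cauchy with median 0 and scale 1, and the names `cauchy_draw` and `iqr_of_sample_means`, the seed and the replication count are hypothetical. The interquartile range is used as the measure of spread because the variance does not exist.

```python
import math
import random
import statistics

def cauchy_draw(rng, median=0.0, scale=1.0):
    """Inverse-CDF draw: median + scale * tan(pi * (U - 1/2))."""
    return median + scale * math.tan(math.pi * (rng.random() - 0.5))

def iqr_of_sample_means(n_obs, n_reps=4000, seed=3):
    """Interquartile range of n_reps sample means, each over n_obs draws."""
    rng = random.Random(seed)
    means = [
        sum(cauchy_draw(rng) for _ in range(n_obs)) / n_obs
        for _ in range(n_reps)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)  # the three quartiles
    return q3 - q1

iqr_one = iqr_of_sample_means(1)
iqr_many = iqr_of_sample_means(200)
print(f"IQR of single draws:        {iqr_one:.2f}")
print(f"IQR of means of 200 draws:  {iqr_many:.2f}")
```

A standard Cauchy has quartiles at -1 and +1, so both numbers should be near 2. For a Normal sample, by contrast, averaging 200 observations would shrink the spread by a factor of about 14.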

The Pareto Distribution
This case is very interesting because it allows one to make the LLN, the CLT, or both fail at will, by choosing the single parameter of this distribution, theta, appropriately.
This is because the Pareto distribution does not have a finite variance if theta is 2 or less, and it does not have a finite mean if theta is 1 or less.
Hence, choosing a theta that exceeds 2 should show both the LLN and the CLT holding; a value above 3, for which the third moment is also finite, makes the convergence easier to see. A theta smaller than 1 should exhibit failure of both the sample mean and the sample T-statistic to converge, whereas a theta in between 1 and 2 will allow one to obtain convergence in probability of the sample mean (the LLN), but not convergence of the T-statistic to the Normal curve (the CLT).
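The role of theta can be sketched as follows. The parametrisation is an assumption, since this page does not state which one it uses: the sketch takes a Pareto with minimum value `x_min = 1` and shape theta, drawn by the inverse-CDF method. The helper names `pareto_draw`, `finite_moments` and `pareto_mean` are hypothetical.

```python
import random

def pareto_draw(rng, theta, x_min=1.0):
    """Inverse-CDF draw: x_min * U**(-1/theta), with U in (0, 1]."""
    u = 1.0 - rng.random()  # random() is in [0, 1), so u is never zero
    return x_min * u ** (-1.0 / theta)

def finite_moments(theta):
    """Return (mean is finite, variance is finite) for shape theta."""
    return theta > 1.0, theta > 2.0

def pareto_mean(theta, x_min=1.0):
    """Population mean, defined only when theta > 1."""
    if theta <= 1.0:
        raise ValueError("the mean is infinite when theta <= 1")
    return theta * x_min / (theta - 1.0)

for theta_value in (0.5, 1.5, 4.0):
    has_mean, has_variance = finite_moments(theta_value)
    print(f"theta={theta_value}: finite mean={has_mean}, "
          f"finite variance={has_variance}")

# With theta = 4 both moments are finite, so the LLN should be visible.
rng = random.Random(4)
theta = 4.0
draws = [pareto_draw(rng, theta) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean {sample_mean:.4f} vs population mean "
      f"{pareto_mean(theta):.4f}")
```

A finite mean is what the standard LLNs require, and a finite variance is what the classical CLT assumes, so the two flags correspond to the three regimes described above. Rerunning the last part with theta between 1 and 2, or below 1, should show the sample mean settling slowly or not at all.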