Central Limit Theorem (CLT)
The Central Limit Theorem is a fundamental concept in statistics that describes the behavior of sample means. In simple terms, it states that if you take many independent random samples of the same size from a population with finite variance, the distribution of the sample means will tend to follow a normal (bell-shaped) curve, regardless of the shape of the original population's distribution, as long as the sample size is large enough.
Key Points
Population vs. Sample: Suppose you have a population with any shape (skewed, uniform, etc.) with mean μ and standard deviation σ. If you repeatedly draw random samples of size n and calculate their means, these sample means will form a distribution.
Mean of Sample Means: The expected value of the sample mean equals the population mean μ, so the average of many sample means will be very close to μ.
Standard Deviation of Sample Means (Standard Error): The standard deviation of the sample means is σ / √n, so it gets smaller as n increases. For example, quadrupling n halves the standard error.
Normality: As n grows, the shape of the distribution of sample means becomes increasingly normal. A common rule of thumb is that n ≥ 30 is sufficient, but this depends on how much the population deviates from normality.
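The key points above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with rate 1 (a strongly skewed distribution with μ = 1 and σ = 1), and the function name sample_mean_stats, the number of samples, and the seed are arbitrary choices made for this example.

```python
import math
import random
import statistics


def sample_mean_stats(n, num_samples=5000, seed=0):
    """Draw num_samples samples of size n from an exponential(1) population
    and return (mean of the sample means, standard deviation of the sample means)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.fmean(means), statistics.stdev(means)


# Assumed population parameters for exponential(1): mu = 1, sigma = 1.
for n in (5, 30, 100):
    mean_of_means, observed_se = sample_mean_stats(n)
    predicted_se = 1 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1
    print(f"n={n:3d}  mean of means={mean_of_means:.3f}  "
          f"observed SE={observed_se:.3f}  predicted SE={predicted_se:.3f}")
```

For each n, the mean of the sample means should land near 1, and the observed spread should track σ / √n, shrinking as n grows.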
Why It Matters
The CLT allows us to make inferences about population parameters using sample statistics, even when we don't know the population's distribution. It underpins many statistical methods, like confidence intervals and hypothesis testing.
Example
Imagine rolling a fair six-sided die. The population distribution of a single roll is uniform (each number 1–6 equally likely). Now, take many samples of 30 rolls each, compute the average of each sample, and plot those averages. The CLT says that this collection of averages will look approximately like a normal distribution centered around 3.5 (the true mean), with a standard deviation of about σ/√30, where σ is the standard deviation of a single die roll. Here σ = √(35/12) ≈ 1.71, so the standard error is about 0.31.
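The die example can be tried out directly. This is a minimal sketch rather than a definitive implementation: the number of repeated samples (10,000) and the random seed are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable (arbitrary choice)
faces = [1, 2, 3, 4, 5, 6]

mu = statistics.fmean(faces)      # true mean of one roll: 3.5
sigma = statistics.pstdev(faces)  # true standard deviation of one roll: sqrt(35/12)

sample_size = 30
num_samples = 10_000  # assumed number of repeated samples

# Each entry is the average of one sample of 30 rolls.
averages = [
    statistics.fmean([rng.choice(faces) for _ in range(sample_size)])
    for _ in range(num_samples)
]

predicted_se = sigma / math.sqrt(sample_size)
observed_mean = statistics.fmean(averages)
observed_sd = statistics.stdev(averages)

print(f"true mean {mu:.3f}, mean of averages {observed_mean:.3f}")
print(f"predicted SE {predicted_se:.3f}, observed SD of averages {observed_sd:.3f}")
```

Plotting a histogram of averages (for instance with a plotting library of your choice) should show the bell shape described above, even though a single roll is uniformly distributed.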