Wednesday, July 25, 2012

VISUALIZING THE CENTRAL LIMIT THEOREM



I will not go into the inner workings of the central limit theorem (CLT) here. Instead, I will show how I present it graphically to my students.

Every decent textbook has some graphs that show how the distribution of the sample means approaches a normal distribution as n increases and how the variance of the means decreases. However, I believe that students pay much more attention to this topic if they see it in action and can play with it themselves!

I created three Excel files that allow showing the CLT in class:
1) for a normal distribution
2) for a uniform distribution
3) for a wild and random distribution

In all three files, the distribution, means, etc. are on the first sheet. In the next two sheets, I present the distribution of 10000 sample means for n = 1, 2, 5, 10, and 30 as well as the original distribution (also based on 9999 or 10000 observations, since I had to make the continuous distributions discrete). Plotting 10000 means obviously doesn't give us the true distribution, only an approximation. Nevertheless, the students see that the distributions come from real numbers and real draws, which I believe makes the point of the CLT stick more clearly.
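For readers who would rather see the experiment in code than in Excel, here is a minimal Python sketch of the same idea. It is an illustration only, not a description of how the spreadsheets work internally: the uniform population on 0 to 10, the function names, and the seed are all assumptions made for this example.

```python
import random
import statistics

def sample_means(draw, n, reps=10_000, rng=None):
    """Return `reps` sample means, each computed from `n` draws of `draw(rng)`."""
    rng = rng or random.Random()
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

def uniform_0_10(rng):
    # Assumed population: continuous uniform on [0, 10].
    return rng.uniform(0, 10)

rng = random.Random(2012)  # fixed seed (arbitrary) so a run can be repeated
results = {}
for n in (1, 2, 5, 10, 30):
    means = sample_means(uniform_0_10, n, rng=rng)
    # Store the mean and the variance of the 10000 sample means.
    results[n] = (statistics.fmean(means), statistics.pvariance(means))
    print(n, round(results[n][0], 2), round(results[n][1], 3))
```

For this assumed population the variance is 100/12, so the variance of the sample means should come out close to 100/12 divided by n, while their average stays near 5. Removing the seed gives slightly different numbers on every run, just like the active randomizer in the files.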


In one sheet, I kept the vertical scales identical, which guarantees that the areas under the curves look equal. In the other, I allowed Excel to pick the maximum value for the vertical axis that best shows the shape. Both have their pros and cons. Which one to use probably depends on the students' understanding of continuous distributions.

I kept the sample randomizer active so you can see how the sample mean distributions change slightly each time. Also, if you want to change the population distribution in the wild distribution file, feel free to change the numbers in the yellow cells on the first sheet. In order to be able to count and display the means (rounded to one decimal place), the numbers need to be kept between 0 and 10!
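The counting step can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the formulas used in the Excel file: the population values 0 to 10, the weights (standing in for the yellow cells), and the function name `binned_means` are all hypothetical.

```python
import random

# Hypothetical "wild" population: the values 0..10 with arbitrary weights.
values = list(range(11))
weights = [5, 1, 0, 8, 2, 0, 0, 3, 9, 1, 6]

def binned_means(n, reps=10_000, rng=None):
    """Count sample means in 101 bins: 0.0, 0.1, ..., 10.0."""
    rng = rng or random.Random()
    bins = [0] * 101
    for _ in range(reps):
        draws = rng.choices(values, weights=weights, k=n)
        # Bin index = mean rounded to the nearest tenth, expressed in tenths.
        bins[round(10 * sum(draws) / n)] += 1
    return bins

bins = binned_means(5, rng=random.Random(1))
print(sum(bins), len(bins))  # all 10000 means land in one of the 101 bins
```

Because every draw lies between 0 and 10, every mean does too, so the bin index always falls between 0 and 100. A value outside that range would produce a mean with no bin to count it in, which is why the restriction matters.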


You might notice that I "cheated" when showing the original uniform distribution and the means for n = 1. Because the distribution is generated in steps of 1/1000 while the graph uses steps of 1/10, the vertical lines at the minimum and maximum would be only half the "correct" height. Thus I doubled the observations for those two points to keep the picture in line with the pure continuous distribution.
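The half-height effect is easy to verify by counting. The short Python sketch below assumes the uniform population is the grid 0.000, 0.001, ..., 10.000 and that values are rounded half up to the nearest tenth. Both are assumptions for illustration, and the Excel file may differ in the details.

```python
from collections import Counter

# Assumption: population grid k/1000 for k = 0..10000, rounded half up
# to tenths. Integer arithmetic avoids floating-point rounding surprises.
counts = Counter((k + 50) // 100 for k in range(10_001))

edge_low, interior, edge_high = counts[0], counts[50], counts[100]
print(edge_low, interior, edge_high)  # prints: 50 100 51
```

Each interior bin collects 100 grid values, but the bins at 0.0 and 10.0 collect only about half as many because values can reach them from one side only. Doubling the two edge counts brings them back in line with the flat shape of the continuous uniform distribution.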

Finally, I created three handouts with the six graphs on each page (two pages per handout, one per type of axis). Here is the handout for the wild distribution. The links for all the files are below the graphs.




The handout links:

The Excel file links:



