diff --git a/data-transform.qmd b/data-transform.qmd
index f6fb73b..c279617 100644
--- a/data-transform.qmd
+++ b/data-transform.qmd
@@ -699,9 +699,9 @@ ggplot(delays, aes(x = n, y = delay)) +
 ```
 
 Not surprisingly, there is much greater variation in the average delay when there are few flights for a given plane.
-The shape of this plot is very characteristic: whenever you plot a mean (or other summary) vs. group size, you'll see that the variation decreases as the sample size increases[^data-transform-4].
+The shape of this plot is very characteristic: whenever you plot a mean (or other summary statistics) vs. group size, you'll see that the variation decreases as the sample size increases[^data-transform-4].
 
-[^data-transform-4]: \*cough\* the central limit theorem \*cough\*.
+[^data-transform-4]: \*cough\* the law of large numbers \*cough\*.
 
 When looking at this sort of plot, it's often useful to filter out the groups with the smallest numbers of observations, so you can see more of the pattern and less of the extreme variation in the smallest groups: