From a43a2e82dc598f158b4857bc26e3abd0874934d4 Mon Sep 17 00:00:00 2001
From: jjchern
Date: Mon, 23 May 2016 18:33:30 -0500
Subject: [PATCH] "vs" should be "vs."

---
 transform.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/transform.Rmd b/transform.Rmd
index 2441de5..d77fbbc 100644
--- a/transform.Rmd
+++ b/transform.Rmd
@@ -681,7 +681,7 @@ ggplot(delays, aes(n, delay)) +
   geom_point()
 ```
 
-Not suprisingly, there is much more variation in the average delay when there are few flights. The shape of this plot is very characteristic: whenever you plot a mean (or many other summaries) vs number of observations, you'll see that the variation decreases as the sample size increases.
+Not suprisingly, there is much more variation in the average delay when there are few flights. The shape of this plot is very characteristic: whenever you plot a mean (or many other summaries) vs. number of observations, you'll see that the variation decreases as the sample size increases.
 
 When looking at this sort of plot, it's often useful to filter out the groups with the smallest numbers of observations, so you can see more of the pattern and less of the extreme variation in the smallest groups. This is what the following code does, and also shows you a handy pattern for integrating ggplot2 into dplyr flows. It's a bit painful that you have to switch from `%>%` to `+`, but once you get the hang of it, it's quite convenient.