Fix typos

2016-07-08 17:49:51 -07:00 · 2016-07-08 17:49:51 -07:00 · aa25bd56f4
parent 37979814ca
commit aa25bd56f4
1 changed file with 5 additions and 5 deletions
--- a/transform.Rmd
+++ b/transform.Rmd
@@ -142,7 +142,7 @@ near(1 / 49 * 49, 1)

 ### Logical operators

-Multiple arguments to `filter()` are combined with "and". To get more complicated expressions, you can use boolean operators yourself:
+Multiple arguments to `filter()` are combined with "and". To get more complicated expressions, you can use Boolean operators yourself:

 ```{r, eval = FALSE}
 filter(flights, month == 11 | month == 12)
@@ -160,7 +160,7 @@ Instead you can use the helpful `%in%` shortcut:
 filter(flights, month %in% c(11, 12))
 ```

-The following figure shows the complete set of boolean operations:
+The following figure shows the complete set of Boolean operations:

 ```{r bool-ops, echo = FALSE, fig.cap = "Complete set of boolean operations"}
 knitr::include_graphics("diagrams/transform-logical.png")
@@ -247,7 +247,7 @@ filter(df, is.na(x) | x > 1)

 ## Arrange rows with `arrange()`

-`arrange()` works similarly to `filter()` except that instead of filtering or selecting rows, it reorders them. It takes a data frame, and a set of column names (or more complicated expressions) to order by. If you provide more than one column name, each additional column will be used to break ties in the values of preceding columns:
+`arrange()` works similarly to `filter()` except that instead of filtering or selecting rows, it reorders them. It takes a data frame and a set of column names (or more complicated expressions) to order by. If you provide more than one column name, each additional column will be used to break ties in the values of preceding columns:

 ```{r}
 arrange(flights, year, month, day)
@@ -281,7 +281,7 @@ flights[order(flights$year, flights$month, flights$day), , drop = FALSE]

 ### Exercises

-1.  How could use `arrange()` to sort all missing values to the start?
+1.  How could you use `arrange()` to sort all missing values to the start?
    (Hint: use `is.na()`).
    
 1.  Sort `flights` to find the most delayed flights. Find the flights that
@@ -629,7 +629,7 @@ ggplot(delays, aes(n, delay)) +
  geom_point()
 ```

-Not suprisingly, there is much more variation in the average delay when there are few flights. The shape of this plot is very characteristic: whenever you plot a mean (or many other summaries) vs. number of observations, you'll see that the variation decreases as the sample size increases.
+Not surprisingly, there is much more variation in the average delay when there are few flights. The shape of this plot is very characteristic: whenever you plot a mean (or many other summaries) vs. number of observations, you'll see that the variation decreases as the sample size increases.

 When looking at this sort of plot, it's often useful to filter out the groups with the smallest numbers of observations, so you can see more of the pattern and less of the extreme variation in the smallest groups. This is what the following code does, and also shows you a handy pattern for integrating ggplot2 into dplyr flows. It's a bit painful that you have to switch from `%>%` to `+`, but once you get the hang of it, it's quite convenient.