diff --git a/data-transform.qmd b/data-transform.qmd
index 897ef31..9cc4d7d 100644
--- a/data-transform.qmd
+++ b/data-transform.qmd
@@ -91,7 +91,7 @@ The code starts with the `flights` dataset, then filters it, then groups it, the
 We'll come back to the pipe and its alternatives in @sec-pipes.
 
 dplyr's verbs are organised into four groups based on what they operate on: **rows**, **columns**, **groups**, or **tables**.
-In the following sections you'll learn the most important verbs for rows, columns, and groups, then we'll come back to verb that work on tables in @sec-joins.
+In the following sections you'll learn the most important verbs for rows, columns, and groups, then we'll come back to verbs that work on tables in @sec-joins.
 Let's dive in!
 
 ## Rows
@@ -238,7 +238,7 @@ Note that if you want to find the number of duplicates, or rows that weren't dup
 5. Which flights traveled the farthest distance?
 Which traveled the least distance?
 
-6. Does it matter what order you used `filter()` and `arrange()` in if you're using both?
+6. Does it matter what order you used `filter()` and `arrange()` if you're using both?
 Why/why not?
 Think about the results and how much work the functions would have to do.
 
@@ -447,7 +447,7 @@ This means subsequent operations will now work "by month".
 
 ### `summarize()` {#sec-summarize}
 
-The most important grouped operation is a summary, which each collapses each group to a single row.
+The most important grouped operation is a summary, which collapses each group to a single row.
 In dplyr, this is operation is performed by `summarize()`[^data-transform-3], as shown by the following example, which computes the average departure delay by month:
 
 [^data-transform-3]: Or `summarise()`, if you prefer British English.