There's something about `.by` (#1351)

Fixes #1242
Hadley Wickham 2023-03-09 15:07:18 -06:00 committed by GitHub
parent 64841ccd3e
commit 8c03ddc730
1 changed files with 33 additions and 0 deletions


@ -699,6 +699,39 @@ daily |>
You get a single row back because dplyr treats all the rows in an ungrouped data frame as belonging to one group.
### `.by`
dplyr 1.1.0 includes a new, experimental syntax for per-operation grouping: the `.by` argument.
`group_by()` and `ungroup()` aren't going away, but you can now also use the `.by` argument to group within a single operation:
```{r}
#| results: false
flights |>
  summarize(
    delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .by = month
  )
```
Or if you want to group by multiple variables:
```{r}
#| results: false
flights |>
  summarize(
    delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .by = c(origin, dest)
  )
```
`.by` works with all verbs and has the advantage that you don't need to use the `.groups` argument to suppress the grouping message or `ungroup()` when you're done.
We didn't focus on this syntax in this chapter because it was very new when we wrote the book.
We did want to mention it because we think it has a lot of promise and it's likely to be quite popular.
You can learn more about it in the [dplyr 1.1.0 blog post](https://www.tidyverse.org/blog/2023/02/dplyr-1-1-0-per-operation-grouping/).
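To make the contrast with `group_by()` concrete, here is a minimal sketch. It uses a small made-up data frame, `toy`, standing in for `flights`; the data and the object names (`default_grouped`, `dropped`, `with_by`) are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy data, standing in for `flights`
toy <- tibble(
  origin    = c("JFK", "JFK", "LGA", "LGA"),
  dest      = c("LAX", "LAX", "ORD", "BOS"),
  dep_delay = c(10, NA, 5, 15)
)

# Persistent grouping with the default `.groups`: summarize() peels off
# only the last grouping variable, so the result is still grouped by `origin`
default_grouped <- toy |>
  group_by(origin, dest) |>
  summarize(delay = mean(dep_delay, na.rm = TRUE), n = n())

# Persistent grouping, dropping all groups explicitly
dropped <- toy |>
  group_by(origin, dest) |>
  summarize(delay = mean(dep_delay, na.rm = TRUE), n = n(), .groups = "drop")

# Per-operation grouping: no `.groups`, no ungroup(), result is ungrouped
with_by <- toy |>
  summarize(delay = mean(dep_delay, na.rm = TRUE), n = n(), .by = c(origin, dest))

is_grouped_df(default_grouped)
is_grouped_df(dropped)
is_grouped_df(with_by)
```

One difference you might notice if you run this: `group_by()` sorts the output by the grouping keys, while `.by` keeps groups in the order they first appear in the data.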
### Exercises
1. Which carrier has the worst average delays?