Minor corrections (#1244)

Stephen Balogun 2023-01-23 23:23:26 +01:00 committed by GitHub
parent 01b8566680
commit 5d912aaed8
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 5 additions and 5 deletions


@@ -35,7 +35,7 @@ library(nycflights13)
## Keys
-To understand joins, you need to first understand how two tables can be connected through a pair of keys, with on each table.
+To understand joins, you need to first understand how two tables can be connected through a pair of keys, within each table.
In this section, you'll learn about the two types of key and see examples of both in the datasets of the nycflights13 package.
You'll also learn how to check that your keys are valid, and what to do if your table lacks a key.
@@ -138,7 +138,7 @@ weather |>
### Surrogate keys
So far we haven't talked about the primary key for `flights`.
-It's not super important here, because there are no data frames that use it as a foreign key, but it's still useful to consider because it's easier to work with observations if have some way to describe them to others.
+It's not super important here, because there are no data frames that use it as a foreign key, but it's still useful to consider because it's easier to work with observations if we have some way to describe them to others.
After a little thinking and experimentation, we determined that there are three variables that together uniquely identify each flight:
@@ -194,7 +194,7 @@ Surrogate keys can be particular useful when communicating to other humans: it's
## Basic joins {#sec-mutating-joins}
Now that you understand how data frames are connected via keys, we can start using joins to better understand the `flights` dataset.
-dplyr provides six join functions: `left_join()`, `inner_join()`, `right_join()`, `semi_join()`, and `anti_join()`.
+dplyr provides six join functions: `left_join()`, `inner_join()`, `right_join()`, `semi_join()`, `anti_join(), and full_join()`.
They all have the same interface: they take a pair of data frames (`x` and `y`) and return a data frame.
The order of the rows and columns in the output is primarily determined by `x`.
@@ -321,7 +321,7 @@ airports |>
**Anti-joins** are the opposite: they return all rows in `x` that don't have a match in `y`.
They're useful for finding missing values that are **implicit** in the data, the topic of @sec-missing-implicit.
Implicitly missing values don't show up as `NA`s but instead only exist as an absence.
-For example, we can find rows that as missing from `airports` by looking for flights that don't have a matching destination airport:
+For example, we can find rows that are missing from `airports` by looking for flights that don't have a matching destination airport:
```{r}
flights2 |>


@@ -404,7 +404,7 @@ read_excel("data/bake-sale.xlsx")
### Formatted output
-The readxl package is a light-weight solution for writing a simple Excel spreadsheet, but if you're interested in additional features like writing to sheets within a spreadsheet and styling, you will want to use the **openxlsx** package.
+The writexl package is a light-weight solution for writing a simple Excel spreadsheet, but if you're interested in additional features like writing to sheets within a spreadsheet and styling, you will want to use the **openxlsx** package.
Note that this package is not part of the tidyverse so the functions and workflows may feel unfamiliar.
For example, function names are camelCase, multiple functions can't be composed in pipelines, and arguments are in a different order than they tend to be in the tidyverse.
However, this is ok.