typos in relational-data.Rmd

TJ Mahr 2016-01-21 07:57:25 -06:00
parent a61f166012
commit 9c0bd89f99
1 changed file with 4 additions and 4 deletions

@@ -316,7 +316,7 @@ So far, the pairs of tables have always been joined by a single variable, and th
a suffix.
* A named character vector: `by = c("a" = "b")`. This will
-match variable `a` in table `x` to variable `y` in table `b`. The
+match variable `a` in table `x` to variable `b` in table `y`. The
variables from `x` will be used in the output.
For example, if we want to draw a map we need to combine the flights data
@@ -429,7 +429,7 @@ Graphically, a semi-join looks like this:
knitr::include_graphics("diagrams/join-semi.png")
```
-Only the existence of a match is important; it doesn't match what observation is matched. This means that filtering joins never duplicate rows like mutating joins do:
+Only the existence of a match is important; it doesn't matter which observation is matched. This means that filtering joins never duplicate rows like mutating joins do:
```{r, echo = FALSE, out.width = "50%"}
knitr::include_graphics("diagrams/join-semi-many.png")
@@ -467,7 +467,7 @@ flights %>%
The data you've been working with in this chapter has been cleaned up so that you'll have as few problems as possible. Your own data is unlikely to be so nice, so there are a few things that you should do with your own data to make your joins go smoothly.
1. Start by identifying the variables that form the primary key in each table.
-You should usually do this based on your understand of the data, not
+You should usually do this based on your understanding of the data, not
empirically by looking for a combination of variables that give a
unique identifier. If you just look for variables without thinking about
what they mean, you might get (un)lucky and find a combination that's
@@ -490,7 +490,7 @@ The data you've been working with in this chapter has been cleaned up so that yo
use of inner vs. outer joins, carefully considering whether or not you
want to drop rows that don't have a match.
-Be aware that simply checking the number of rows before and after the join is not sufficient to ensure that your join has gone smoothly. If you have an inner join with duplicate keys in both tables, you might get unlikely at the number of dropped rows might exactly equal the number of duplicated rows!
+Be aware that simply checking the number of rows before and after the join is not sufficient to ensure that your join has gone smoothly. If you have an inner join with duplicate keys in both tables, you might get unlucky as the number of dropped rows might exactly equal the number of duplicated rows!
## Set operations {#set-operations}