@@ -388,7 +388,25 @@ Instead you can use a semi-join, which connects the two tables like a mutating j
flights %>% semi_join(top_dest)
```

The inverse of a semi-join is an anti-join. An anti-join keeps the rows that _don't_ have a match, and is useful for diagnosing join mismatches. For example, when connecting `flights` and `planes`, you might be interested to know that there are many `flights` that don't have a match in `planes`:
Graphically, a semi-join looks like this:

```{r, echo = FALSE, out.width = "50%"}
knitr::include_graphics("diagrams/join-semi.png")
```

Only the existence of a match is important; it doesn't matter which observation is matched. This means that filtering joins never duplicate rows like mutating joins do:

```{r, echo = FALSE, out.width = "50%"}
knitr::include_graphics("diagrams/join-semi-many.png")
```
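
To make the no-duplication point concrete, here is a minimal sketch using two made-up tibbles, `x` and `y`. They are assumptions for illustration and are not part of the flights data. Because key `1` appears twice in `y`, the mutating join returns that row of `x` twice, while the semi-join returns it once and adds no columns from `y`:

```{r}
library(dplyr)

# Hypothetical toy tables (not from the original text); key 1 occurs twice in y.
x <- tibble(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- tibble(key = c(1, 1, 2), val_y = c("y1", "y2", "y3"))

# Mutating join: one output row per matching pair, so key 1 is duplicated.
inner <- x %>% inner_join(y, by = "key")
nrow(inner)
#> [1] 3

# Filtering join: only checks whether a match exists, so each row of x
# appears at most once and the columns of x are returned unchanged.
semi <- x %>% semi_join(y, by = "key")
nrow(semi)
#> [1] 2
names(semi)
#> [1] "key"   "val_x"
```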
The inverse of a semi-join is an anti-join. An anti-join keeps the rows that _don't_ have a match:

```{r, echo = FALSE, out.width = "50%"}
knitr::include_graphics("diagrams/join-anti.png")
```
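
As a small illustration, the sketch below uses two made-up tibbles (`x` and `y` are hypothetical, not from the flights data). The anti-join returns only the row of `x` whose key has no partner in `y`, and together with the semi-join it splits `x` into two non-overlapping pieces:

```{r}
library(dplyr)

# Hypothetical toy tables (not from the original text); key 3 has no match in y.
x <- tibble(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- tibble(key = c(1, 1, 2), val_y = c("y1", "y2", "y3"))

# Anti-join: keep the rows of x that do NOT have a match in y.
unmatched <- x %>% anti_join(y, by = "key")
unmatched$key
#> [1] 3

# Semi-join: keep the rows of x that DO have a match in y.
matched <- x %>% semi_join(y, by = "key")

# Every row of x lands in exactly one of the two results.
nrow(matched) + nrow(unmatched) == nrow(x)
#> [1] TRUE
```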
Anti-joins are useful for diagnosing join mismatches. For example, when connecting `flights` and `planes`, you might be interested to know that there are many `flights` that don't have a match in `planes`:

```{r}
flights %>%