|
@@ -388,7 +388,25 @@ Instead you can use a semi-join, which connects the two tables like a mutating j
|
||||||
flights %>% semi_join(top_dest)
|
flights %>% semi_join(top_dest)
|
||||||
```
|
```
|
||||||
|
|
||||||
The inverse of a semi-join is an anti-join. An anti-join keeps the rows that _don't_ have a match, and is useful for diagnosing join mismatches. For example, when connecting `flights` and `planes`, you might be interested to know that there are many `flights` that don't have a match in `planes`:
|
Graphically, a semi-join looks like this:
|
||||||
|
|
||||||
|
```{r, echo = FALSE, out.width = "50%"}
|
||||||
|
knitr::include_graphics("diagrams/join-semi.png")
|
||||||
|
```
|
||||||
|
|
||||||
|
Only the existence of a match is important; it doesn't matter which observation is matched. This means that filtering joins never duplicate rows like mutating joins do:
|
||||||
|
|
||||||
|
```{r, echo = FALSE, out.width = "50%"}
|
||||||
|
knitr::include_graphics("diagrams/join-semi-many.png")
|
||||||
|
```
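The same point can be checked on a small example. This is only a sketch: the tables `x` and `y` and their columns are made-up toy data, not part of `nycflights13`. Key `2` occurs twice in `y`, so a mutating join repeats the matching row of `x`, while a semi-join keeps it once.

```{r}
library(dplyr)

# Hypothetical toy tables: key 2 appears twice in y
x <- tibble(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- tibble(key = c(2, 2, 4), val_y = c("y1", "y2", "y3"))

# Mutating join: the key-2 row of x is repeated once per match in y
inner <- inner_join(x, y, by = "key")
nrow(inner)  # 2

# Filtering join: each row of x is kept at most once,
# and no columns from y are added
semi <- semi_join(x, y, by = "key")
nrow(semi)   # 1
names(semi)  # "key" "val_x"
```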
|
||||||
|
|
||||||
|
The inverse of a semi-join is an anti-join. An anti-join keeps the rows that _don't_ have a match:
|
||||||
|
|
||||||
|
```{r, echo = FALSE, out.width = "50%"}
|
||||||
|
knitr::include_graphics("diagrams/join-anti.png")
|
||||||
|
```
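As a rough illustration, again using made-up toy tables `x` and `y` rather than the flights data, an anti-join returns exactly the rows of `x` that a semi-join drops, so the two results together account for every row of `x`:

```{r}
library(dplyr)

# Hypothetical toy tables: keys 1 and 3 have no match in y
x <- tibble(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- tibble(key = c(2, 2, 4), val_y = c("y1", "y2", "y3"))

# Rows of x with at least one match in y
matched <- semi_join(x, y, by = "key")

# Rows of x with no match in y
unmatched <- anti_join(x, y, by = "key")
unmatched$key  # 1 3

# Together the two filtering joins split x without overlap
nrow(matched) + nrow(unmatched) == nrow(x)  # TRUE
```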
|
||||||
|
|
||||||
|
Anti-joins are useful for diagnosing join mismatches. For example, when connecting `flights` and `planes`, you might be interested to know that there are many `flights` that don't have a match in `planes`:
|
||||||
|
|
||||||
```{r}
|
```{r}
|
||||||
flights %>%
|
flights %>%
|
||||||
|
|