Consistently use `smaller` dataset in section

Fixes #1252
Hadley Wickham 2023-02-07 10:37:50 -06:00
parent e51ead6c9e
commit 03f1c6c6f4
1 changed file with 5 additions and 3 deletions

@@ -560,7 +560,7 @@ For larger plots, you might want to try the heatmaply package, which creates int
You've already seen one great way to visualize the covariation between two numerical variables: draw a scatterplot with `geom_point()`.
You can see covariation as a pattern in the points.
-For example, you can see an exponential relationship between the carat size and price of a diamond.
+For example, you can see an exponential relationship between the carat size and price of a diamond:
```{r}
#| dev: "png"
@@ -568,10 +568,12 @@ For example, you can see an exponential relationship between the carat size and
#| A scatterplot of price vs. carat. The relationship is positive, somewhat
#| strong, and exponential.
-ggplot(diamonds, aes(x = carat, y = price)) +
+ggplot(smaller, aes(x = carat, y = price)) +
  geom_point()
```
+(In this section we'll use the `smaller` dataset to stay focused on the bulk of the diamonds that are smaller than 3 carats)
Scatterplots become less useful as the size of your dataset grows, because points begin to overplot, and pile up into areas of uniform black (as above).
You've already seen one way to fix the problem: using the `alpha` aesthetic to add transparency.
@@ -583,7 +585,7 @@ You've already seen one way to fix the problem: using the `alpha` aesthetic to a
#| the number of points is higher than other areas, The most obvious clusters
#| are for diamonds with 1, 1.5, and 2 carats.
-ggplot(diamonds, aes(x = carat, y = price)) +
+ggplot(smaller, aes(x = carat, y = price)) +
  geom_point(alpha = 1 / 100)
```