Minor updates to reduce page count

Hadley Wickham 2023-01-24 09:33:18 -06:00
parent 2316168cd4
commit c9d9426f67
8 changed files with 25 additions and 37 deletions

View File

@@ -14,6 +14,7 @@ options(
dplyr.print_min = 6,
dplyr.print_max = 6,
pillar.max_footer_lines = 2,
pillar.min_chars = 15,
stringr.view_n = 6,
# Activate crayon output - temporarily disabled for quarto
# crayon.enabled = TRUE,

View File

@@ -41,7 +41,6 @@ We will also need nycflights13 for practice data.
```{r}
#| message: false
library(tidyverse)
library(nycflights13)
```

View File

@@ -140,17 +140,6 @@ gss_cat |>
count(race)
```
Or with a bar chart:
```{r}
#| fig-alt: >
#| A bar chart showing the distribution of race. There are ~2000
#| records with race "Other", 3000 with race "Black", and other
#| 15,000 with race "White".
ggplot(gss_cat, aes(x = race)) +
geom_bar()
```
When working with factors, the two most common operations are changing the order of the levels, and changing the values of the levels.
Those operations are described in the sections below.
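As a rough sketch of the first kind of operation, the chunk below reorders the levels of a small made-up factor with `fct_relevel()` from forcats. The factor `f` and its levels are illustrative assumptions and are not taken from `gss_cat`.

```r
library(forcats)

# A made-up factor with explicitly ordered levels (not taken from gss_cat).
f <- factor(
  c("low", "high", "Not applicable", "medium"),
  levels = c("high", "low", "medium", "Not applicable")
)

# Move "Not applicable" to the front; the other levels keep their order.
f2 <- fct_relevel(f, "Not applicable")
levels(f2)
```

Only the order of the levels changes; the values stored in the factor stay the same.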
@@ -254,7 +243,7 @@ It takes a factor, `f`, and then any number of levels that you want to move to t
#| fig-alt: >
#| The same scatterplot but now "Not Applicable" is displayed at the
#| bottom of the y-axis. Generally there is a positive association
#| between income and age, and the income band with the highest average
#| between income and age, and the income band with the highest average
#| age is "Not applicable".
ggplot(rincome_summary, aes(x = age, y = fct_relevel(rincome, "Not applicable"))) +
@@ -276,8 +265,8 @@ This makes the plot easier to read because the colors of the line at the far rig
#| There is one line for each category of marital status: no answer,
#| never married, separated, divorced, widowed, and married. It is
#| a little hard to read the plot because the order of the legend is
#| unrelated to the lines on the plot.
#|
#| unrelated to the lines on the plot.
#|
#| Rearranging the legend makes the plot easier to read because the
#| legend colors now match the order of the lines on the far right
#| of the plot. You can see some unsurprising patterns: the proportion

View File

@@ -56,6 +56,8 @@ When more than one variable is needed, the key is called a **compound key.** For
You can identify each airport by its three-letter airport code, making `faa` the primary key.
```{r}
#| R.options:
#| width: 67
airports
```
@@ -63,6 +65,8 @@ When more than one variable is needed, the key is called a **compound key.** For
You can identify a plane by its tail number, making `tailnum` the primary key.
```{r}
#| R.options:
#| width: 67
planes
```
@@ -70,6 +74,8 @@ When more than one variable is needed, the key is called a **compound key.** For
You can identify each observation by the combination of location and time, making `origin` and `time_hour` the compound primary key.
```{r}
#| R.options:
#| width: 67
weather
```
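One way to check that a set of variables really is a primary key is to count each combination and look for any that occur more than once. The sketch below does this on `mini_weather`, a hypothetical four-row stand-in for `weather` with made-up values; the same pipeline should apply to the real table.

```r
library(dplyr)

# Hypothetical miniature of the weather table; the values are made up.
mini_weather <- tibble(
  origin = c("EWR", "EWR", "JFK", "JFK"),
  time_hour = as.POSIXct(
    c("2013-01-01 01:00", "2013-01-01 02:00",
      "2013-01-01 01:00", "2013-01-01 02:00"),
    tz = "UTC"
  ),
  temp = c(39.0, 39.0, 39.0, 39.9)
)

# A candidate key is valid only if no combination appears more than once.
duplicates <- mini_weather |>
  count(origin, time_hour) |>
  filter(n > 1)

nrow(duplicates)
```

If `duplicates` has zero rows, each combination of `origin` and `time_hour` identifies exactly one observation.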

View File

@@ -421,13 +421,14 @@ repos |>
```
This has worked but the result is a little overwhelming: there are so many columns that tibble doesn't even print all of them!
We can see them all with `names()`:
We can see them all with `names()`; here we look at the first 10:
```{r}
repos |>
unnest_longer(json) |>
unnest_wider(json) |>
names()
names() |>
head(10)
```
Let's select a few that look interesting:
@@ -439,7 +440,7 @@ repos |>
select(id, full_name, owner, description)
```
You can use this to work back to understand how `gh_repos` was strucured: each child was a GitHub user containing a list of up to 30 GitHub repositories that they created.
You can use this to work back to understand how `gh_repos` was structured: each child was a GitHub user containing a list of up to 30 GitHub repositories that they created.
`owner` is another list-column, and since it contains a named list, we can use `unnest_wider()` to get at the values:
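As a minimal sketch of that idea, the chunk below applies `unnest_wider()` to a tiny hypothetical tibble, `repos_small`, whose `owner` column is a list of named lists. The data is invented for illustration and is far simpler than the real `gh_repos` structure.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for the repos data: one named list per row.
repos_small <- tibble(
  id = c(1L, 2L),
  owner = list(
    list(login = "alice", site_admin = FALSE),
    list(login = "bob", site_admin = TRUE)
  )
)

# Each name inside `owner` becomes its own column.
owners <- repos_small |>
  unnest_wider(owner)
owners
```

In this made-up data the names inside `owner` don't clash with any existing column, so no extra arguments are needed.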

View File

@@ -123,8 +123,6 @@ Regular expressions are very compact and use a lot of punctuation characters, so
Don't worry; you'll get better with practice, and simple patterns will soon become second nature.
Let's kick off that process by practicing with some useful stringr functions.
### Exercises
## Key functions {#sec-stringr-regex-funs}
Now that you've got the basics of regular expressions under your belt, let's use them with some stringr and tidyr functions.

View File

@@ -516,7 +516,12 @@ stringr provides two useful tools for cases where your string is too long:
The following code shows these functions in action with a made-up string:
```{r}
x <- "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."
x <- paste0(
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod ",
"tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim ",
"veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea ",
"commodo consequat."
)
str_view(str_trunc(x, 30))
str_view(str_wrap(x, 30))

View File

@@ -326,22 +326,10 @@ Here's a simple HTML table with two columns and three rows:
```{r}
html <- minimal_html("
<table class='mytable'>
<tr>
<th>x</th>
<th>y</th>
</tr>
<tr>
<td>1.5</td>
<td>2.7</td>
</tr>
<tr>
<td>4.9</td>
<td>1.3</td>
</tr>
<tr>
<td>7.2</td>
<td>8.1</td>
</tr>
<tr><th>x</th> <th>y</th></tr>
<tr><td>1.5</td> <td>2.7</td></tr>
<tr><td>4.9</td> <td>1.3</td></tr>
<tr><td>7.2</td> <td>8.1</td></tr>
</table>
")
```
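To see what the compact table yields, here is a small sketch that selects the table with `html_element()` and converts it to a tibble with `html_table()`, both from rvest. The chunk repeats the table definition so it runs on its own; the name `tbl` is just an illustrative choice.

```r
library(rvest)

# Same table as above, repeated so this chunk is self-contained.
html <- minimal_html("
  <table class='mytable'>
    <tr><th>x</th> <th>y</th></tr>
    <tr><td>1.5</td> <td>2.7</td></tr>
    <tr><td>4.9</td> <td>1.3</td></tr>
    <tr><td>7.2</td> <td>8.1</td></tr>
  </table>
")

# Select the table by its class, then convert it to a tibble.
tbl <- html |>
  html_element(".mytable") |>
  html_table()
tbl
```

The header cells become the column names, and each remaining row of the table becomes one row of the result.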
@@ -455,6 +443,7 @@ At the time we wrote this chapter, the page looked like @fig-scraping-imdb.
```{r}
#| label: fig-scraping-imdb
#| echo: false
#| fig-cap: >
#| Screenshot of the IMDb top movies web page taken on 2022-12-05.
#| fig-alt: >