This commit is contained in:
Mitsuo Shiota 2023-04-08 23:36:31 +09:00 committed by GitHub
parent 9483bef03f
commit e00ecd35ba
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 2 additions and 2 deletions

View File

@ -291,7 +291,7 @@ df |>
How does the reshaping work? How does the reshaping work?
It's easier to see if we think about it column by column. It's easier to see if we think about it column by column.
As shown in @fig-pivot-variables, the values in column that was already a variable in the original dataset (`var`) need to be repeated, once for each column that is pivoted. As shown in @fig-pivot-variables, the values in column that was already a variable in the original dataset (`id`) need to be repeated, once for each column that is pivoted.
```{r} ```{r}
#| label: fig-pivot-variables #| label: fig-pivot-variables
@ -360,7 +360,7 @@ There are two columns that are already variables and are easy to interpret: `cou
They are followed by 56 columns like `sp_m_014`, `ep_m_4554`, and `rel_m_3544`. They are followed by 56 columns like `sp_m_014`, `ep_m_4554`, and `rel_m_3544`.
If you stare at these columns for long enough, you'll notice there's a pattern. If you stare at these columns for long enough, you'll notice there's a pattern.
Each column name is made up of three pieces separated by `_`. Each column name is made up of three pieces separated by `_`.
The first piece, `sp`/`rel`/`ep`, describes the method used for the diagnosis, the second piece, `m`/`f` is the `gender` (coded as a binary variable in this dataset), and the third piece, `014`/`1524`/`2535`/`3544`/`4554`/`65` is the `age` range (`014` represents 0-14, for example). The first piece, `sp`/`rel`/`ep`, describes the method used for the diagnosis, the second piece, `m`/`f` is the `gender` (coded as a binary variable in this dataset), and the third piece, `014`/`1524`/`2534`/`3544`/`4554`/`5564/``65` is the `age` range (`014` represents 0-14, for example).
So in this case we have six pieces of information recorded in `who2`: the country and the year (already columns); the method of diagnosis, the gender category, and the age range category (contained in the other column names); and the count of patients in that category (cell values). So in this case we have six pieces of information recorded in `who2`: the country and the year (already columns); the method of diagnosis, the gender category, and the age range category (contained in the other column names); and the count of patients in that category (cell values).
To organize these six pieces of information in six separate columns, we use `pivot_longer()` with a vector of column names for `names_to` and instructors for splitting the original variable names into pieces for `names_sep` as well as a column name for `values_to`: To organize these six pieces of information in six separate columns, we use `pivot_longer()` with a vector of column names for `names_to` and instructors for splitting the original variable names into pieces for `names_sep` as well as a column name for `values_to`: