Apply sizing to tidy-data diagrams

Hadley Wickham 2022-03-21 09:27:14 -05:00
parent 090333be5b
commit 678930b99b
5 changed files with 14 additions and 13 deletions

@@ -25,7 +25,6 @@ devtools::install_github("hadley/r4ds")
 ```
 #| echo: FALSE
 #| out.width: NULL
-#| fig.retina: 1.5
 knitr::include_graphics("diagrams/transform.png", dpi = 270)
 ## Code of Conduct

@@ -50,7 +50,9 @@ There are three interrelated rules that make a dataset tidy:
 Figure \@ref(fig:tidy-structure) shows the rules visually.
-```{r tidy-structure, echo = FALSE, out.width = "100%"}
+```{r tidy-structure}
+#| echo: FALSE
+#| out.width: NULL
 #| fig.cap: >
 #|   Following three rules makes a dataset tidy: variables are columns,
 #|   observations are rows, and values are cells.
@@ -59,7 +61,7 @@ Figure \@ref(fig:tidy-structure) shows the rules visually.
 #|   shows that each variable is column. The second panel shows that each
 #|   observation is a row. The third panel shows that each value is
 #|   a cell.
-knitr::include_graphics("images/tidy-1.png")
+knitr::include_graphics("images/tidy-1.png", dpi = 270)
 ```
 Why ensure that your data is tidy?
@@ -259,23 +261,23 @@ It's easier to see if we take it component by component.
 Columns that are already variables need to be repeated, once for each column in `cols`, as shown in Figure \@ref(fig:pivot-variables).
 ```{r pivot-variables}
-#| echo: false
-#| out.width: ~
+#| echo: FALSE
+#| out.width: NULL
 #| fig.cap: >
 #|   Columns that are already variables need to be repeated, once for
 #|   each column that is pivotted.
-knitr::include_graphics("diagrams/tidy-data/variables.png", dpi = 144)
+knitr::include_graphics("diagrams/tidy-data/variables.png", dpi = 270)
 ```
 The column names become values in a new variable, whose name is given by `names_to`, as shown in Figure \@ref(fig:pivot-names).
 They need to be repeated for each row in the original dataset.
 ```{r pivot-names}
-#| echo: false
-#| out.width: ~
+#| echo: FALSE
+#| out.width: NULL
 #| fig.cap: >
 #|   The column names of pivoted columns become a new column.
-knitr::include_graphics("diagrams/tidy-data/column-names.png", dpi = 144)
+knitr::include_graphics("diagrams/tidy-data/column-names.png", dpi = 270)
 ```
 The cell values also become values in a new variable, with name given by `values_to`.
@@ -283,12 +285,12 @@ The are unwound row by row.
 Figure \@ref(fig:pivot-values) illustrates the process.
 ```{r pivot-values}
-#| echo: false
-#| out.width: ~
+#| echo: FALSE
+#| out.width: NULL
 #| fig.cap: >
 #|   The number of values are preserved (not repeated), but unwound
 #|   row-by-row.
-knitr::include_graphics("diagrams/tidy-data/cell-values.png", dpi = 144)
+knitr::include_graphics("diagrams/tidy-data/cell-values.png", dpi = 270)
 ```
 ### Many variables in column names
@@ -328,7 +330,7 @@ The next step up in complexity is when the column names include a mix of variabl
 For example, take the `household` dataset:
 ```{r}
-family
+household
 ```
 This dataset contains data about five families, with the names and dates of birth of up to two children.

Binary file not shown. Size before: 23 KiB; after: 56 KiB.

Binary file not shown. Size before: 25 KiB; after: 60 KiB.

Binary file not shown. Size before: 23 KiB; after: 56 KiB.