Spell check suggestions (#259)

harrismcgehee 2016-08-15 08:33:05 -04:00 committed by Hadley Wickham
parent f9901e3e54
commit 2c0c6a8be5
1 changed file with 7 additions and 7 deletions

@ -119,7 +119,7 @@ The second step is to resolve one of two common problems:
1. One variable might be spread across multiple columns. 1. One variable might be spread across multiple columns.
1. One observation might be scattered across mutliple rows. 1. One observation might be scattered across multiple rows.
Typically a dataset will only suffer from one of these problems; it'll only suffer from both if you're really unlucky! To fix these problems, you'll need the two most important functions in tidyr: `gather()` and `spread()`. Typically a dataset will only suffer from one of these problems; it'll only suffer from both if you're really unlucky! To fix these problems, you'll need the two most important functions in tidyr: `gather()` and `spread()`.
@ -185,10 +185,10 @@ To tidy this up, we first analyse the representation in similar way to `gather()
* The column that contains variable names, the `key` column. Here, it's * The column that contains variable names, the `key` column. Here, it's
`type`. `type`.
* The column that contains values froms multiple variables, the `value` * The column that contains values forms multiple variables, the `value`
column. Here it's `count`. column. Here it's `count`.
Once we've figured that out, we can use `spread()`, as shown progammatically below, and visually in Figure \@ref(fig:tidy-spread). Once we've figured that out, we can use `spread()`, as shown programmatically below, and visually in Figure \@ref(fig:tidy-spread).
```{r} ```{r}
spread(table2, key = type, value = count) spread(table2, key = type, value = count)
@ -317,7 +317,7 @@ table5 %>%
unite(new, century, year) unite(new, century, year)
``` ```
In this case we also need to use the `sep` arguent. The default will place an underscore (`_`) between the values from different columns. Here we don't want any separator so we use `""`: In this case we also need to use the `sep` argument. The default will place an underscore (`_`) between the values from different columns. Here we don't want any separator so we use `""`:
```{r} ```{r}
table5 %>% table5 %>%
@ -345,7 +345,7 @@ table5 %>%
## Missing values ## Missing values
Changing the representation of a dataset brings up an important subtlety of missing values. Suprisingly, a value can be missing in one of two possible ways: Changing the representation of a dataset brings up an important subtlety of missing values. Surprisingly, a value can be missing in one of two possible ways:
* __Explicitly__, i.e. flagged with `NA`. * __Explicitly__, i.e. flagged with `NA`.
* __Implicitly__, i.e. simply not present in the data. * __Implicitly__, i.e. simply not present in the data.
@ -442,7 +442,7 @@ The best place to start is almost always to gathering together the columns that
in the variable names (e.g. `new_sp_m014`, `new_ep_m014`, `new_ep_f014`) in the variable names (e.g. `new_sp_m014`, `new_ep_m014`, `new_ep_f014`)
these are likely to be values, not variables. these are likely to be values, not variables.
So we need to gather together all the columns from `new_sp_m3544` to `newrel_f65`. We don't know what those values represent yet, so we'll give them the generic name `"key"`. We know the cells repesent the count of cases, so we'll use the variable `cases`. There are a lot of missing values in the current representation, so for now we'll use `na.rm` just so we can focus on the values that are present. So we need to gather together all the columns from `new_sp_m3544` to `newrel_f65`. We don't know what those values represent yet, so we'll give them the generic name `"key"`. We know the cells represent the count of cases, so we'll use the variable `cases`. There are a lot of missing values in the current representation, so for now we'll use `na.rm` just so we can focus on the values that are present.
```{r} ```{r}
who1 <- who %>% who1 <- who %>%
@ -550,7 +550,7 @@ who %>%
## Non-tidy data ## Non-tidy data
Before we continue on to other topics, it's worth talking briefly about non-tidy data. Earlier in the chapter, I used the perjorative term "messy" to refer to non-tidy data. That's an oversimplification: there are lots of useful and well founded data structures that are not tidy data. There are two mains reasons to use other data structures: Before we continue on to other topics, it's worth talking briefly about non-tidy data. Earlier in the chapter, I used the pejorative term "messy" to refer to non-tidy data. That's an oversimplification: there are lots of useful and well founded data structures that are not tidy data. There are two mains reasons to use other data structures:
* Alternative representations may have substantial performance or space * Alternative representations may have substantial performance or space
advantages. advantages.