diff --git a/tidy.Rmd b/tidy.Rmd
index f436d17..38db179 100644
--- a/tidy.Rmd
+++ b/tidy.Rmd
@@ -171,7 +171,7 @@ Spreading is the opposite of gathering. You use it when an observation is scatte
 table2
 ```

-To tidy this up, we first analysis the representation in similar way to `gather()`. This time, however, we only need two parameters:
+To tidy this up, we first analyse the representation in similar way to `gather()`. This time, however, we only need two parameters:

 * The column that contains variable names, the `key` column. Here, it's `type`.

@@ -380,7 +380,7 @@ stocks %>%

 `complete()` takes a set of columns, and finds all unique combinations. It then ensures the original dataset contains all those values, filling in explicit `NA`s where necessary.

-There's one other important tool that you should know for working with missing values. Sometimes when a data source has primarily been used for data entry, missing values indicate the the previous value should be carried forward:
+There's one other important tool that you should know for working with missing values. Sometimes when a data source has primarily been used for data entry, missing values indicate that the previous value should be carried forward:

 ```{r}
 treatment <- frame_data(
@@ -407,7 +407,7 @@ treatment %>%

 ## Case Study

-To finish off the chapter, let's pull together everything you've learned to tackle a realistic data tidying problem. The `tidyr::who` dataset contains reporter tuberculosis (TB) cases broken down by year, country, age, gender, and diagnosis method. The data comes from the *2014 World Health Organization Global Tuberculosis Report*, available for download at .
+To finish off the chapter, let's pull together everything you've learned to tackle a realistic data tidying problem. The `tidyr::who` dataset contains reporter tuberculosis (TB) cases broken down by year, country, age, gender, and diagnosis method. The data comes from the *2014 World Health Organization Global Tuberculosis Report*, available for download at .

 There's a wealth of epidemiological information in this dataset, but it's challenging to work with the data in the form that it's provided: