diff --git a/_includes/package-nav.html b/_includes/package-nav.html
index c2bd686..1cd532a 100644
--- a/_includes/package-nav.html
+++ b/_includes/package-nav.html
@@ -1,14 +1,15 @@
  • Introduction
  • -
  • Tidy data
  • +
  • Transform
  • + +
  • Tidy
  • +
  • Import
  • +
diff --git a/import.Rmd b/import.Rmd
new file mode 100644
index 0000000..1cd2772
--- /dev/null
+++ b/import.Rmd
@@ -0,0 +1,24 @@
+---
+layout: default
+title: Data import
+output: bookdown::html_chapter
+---
+
+## Overview
+
+You can't apply any of the tools you've applied so far to your own work, unless you can get your own data into R. In this chapter, you'll learn how to import:
+
+* Flat files (like csv) with readr.
+* Database queries with DBI.
+* Data from web APIs with httr.
+* Binary file formats (like excel or sas), with haven and readxl.
+
+## Flat files
+
+## Databases
+
+## Web APIs
+
+## Binary files
+
+Needs to discuss how data types in different languages are converted to R. Similarly for missing values.
diff --git a/transform.Rmd b/transform.Rmd
new file mode 100644
index 0000000..66ef687
--- /dev/null
+++ b/transform.Rmd
@@ -0,0 +1,27 @@
+---
+layout: default
+title: Data transformation
+output: bookdown::html_chapter
+---
+
+
+## Missing values
+
+* Why `NA == NA` is not `TRUE`
+* Why default is `na.rm = FALSE`.
+
+## Data types
+
+Overview of different data types and useful summary functions for working with them. Strings and dates covered in more detail in future chapters.
+
+Need to mention `typeof()` vs. `class()` mostly in context of how date/times and factors are built on top of simpler structures.
+
+### Logical
+
+When used with numeric functions, `TRUE` is converted to 1 and `FALSE` to 0. This makes `sum()` and `mean()` particularly useful: `sum(x)` gives the number of `TRUE`s in `x`, and `mean(x)` gives the proportion.
+
+### Numeric (integer and double)
+
+### Strings (and factors)
+
+### Date/times
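
The `### Logical` paragraph and the `## Missing values` bullets in `transform.Rmd` could be illustrated with a short base R sketch along these lines. The vectors `x` and `y` and the variable names are hypothetical and are not part of the commit. Only base R functions are used.

```r
# Hypothetical example vector; not taken from the chapter itself.
x <- c(TRUE, FALSE, TRUE, TRUE)

# Numeric functions coerce logicals: TRUE becomes 1 and FALSE becomes 0.
n_true <- sum(x)      # number of TRUE values in x
prop_true <- mean(x)  # proportion of TRUE values in x

# Comparing a missing value with a missing value yields NA, not TRUE.
na_cmp <- NA == NA

# With the default na.rm = FALSE, a single NA makes the summary NA.
y <- c(TRUE, NA, FALSE)
sum_default <- sum(y)
sum_na_rm <- sum(y, na.rm = TRUE)  # drops the NA before summing
```

Here `na.rm = TRUE` has to be requested explicitly, which is the behaviour the second bullet under `## Missing values` plans to explain.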