diff --git a/DESCRIPTION b/DESCRIPTION
index 515c245..d66bbd1 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -12,7 +12,6 @@ Imports:
     broom,
     dplyr,
     DSR,
-    feather,
     gapminder,
     ggplot2,
     hexbin,
diff --git a/import.Rmd b/import.Rmd
index c8201df..0d291f7 100644
--- a/import.Rmd
+++ b/import.Rmd
@@ -567,10 +567,20 @@ This makes csvs a little unreliable for caching interim results - you need to re
 1.  The feather package implements a fast binary file format that can
     be shared across programming languages:

-    ```{r}
+    ```{r, eval = FALSE}
     library(feather)
     write_feather(challenge, "challenge.feather")
     read_feather("challenge.feather")
+    #> # A tibble: 2,000 x 2
+    #> x y
+    #>
+    #> 1 404
+    #> 2 4172
+    #> 3 3004
+    #> 4 787
+    #> 5 37
+    #> 6 2332
+    #> # ... with 1,994 more rows
     ```

 feather tends to be faster than rds and is usable outside of R. `rds` supports list-columns (which you'll learn about in [[Many models]]), which feather does not yet.
@@ -578,19 +588,24 @@ feather tends to be faster than rds and is usable outside of R. `rds` supports l
 ```{r, include = FALSE}
 file.remove("challenge-2.csv")
 file.remove("challenge.rds")
-file.remove("challenge.feather")
 ```

 ## Other types of data

-We have worked on a number of packages to make importing data into R as easy as possible. These packages are certainly not perfect, but they are the best place to start because they behave as similar as possible to readr.
+To get other types of data into R, we recommend starting with the packages listed below. They're certainly not perfect, but they are a good place to start as they are fully fledged members of the tidyverse.

-Two packages helper
+For rectanuglar data:

-* haven reads files from other SPSS, Stata, and SAS files.
+* haven reads SPSS, Stata, and SAS files.
 * readxl reads excel files (both `.xls` and `.xlsx`).

-There are two common forms of hierarchical data: XML and json. We recommend using xml2 and jsonlite respectively. These packages are performant, safe, and (relatively) easy to use. To work with these effectively in R, you'll need to x
+* DBI, along with a database specific backend (e.g. RMySQL, RSQLite,
+  RPostgreSQL etc) allows you to run SQL queries against a database
+  and return a data frame.

-If your data lives in a database, you'll need to use the DBI package. DBI provides a common interface that works with many different types of database. R's support is particularly good for open source databases (e.g. RPostgres, RMySQL, RSQLite, MonetDBLite).
+For hierarchical data:
+
+* jsonlite (by Jeroen Ooms) reads json
+
+* xml2 reads XML.