From 55caa63edd538a04b7184bc57323476642ab1694 Mon Sep 17 00:00:00 2001
From: hadley
Date: Tue, 4 Oct 2016 08:37:11 -0500
Subject: [PATCH] Tibble tweaks

---
 tibble.Rmd | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/tibble.Rmd b/tibble.Rmd
index 656cc35..83ddd2a 100644
--- a/tibble.Rmd
+++ b/tibble.Rmd
@@ -2,7 +2,7 @@
 
 ## Introduction
 
-Throughout this book we work with "tibbles" instead of R's traditional data.frame. Tibbles _are_ data frames, but they tweak some older behaviours to make life a little easier. R is an old language, and some things that were useful 10 or 20 years ago now get in your way. It's difficult to change base R without breaking existing code, so most innovation occurs in packages. Here we will describe the __tibble__ package, which provides opinionated data frames that make working in the tidyverse a little easier.
+Throughout this book we work with "tibbles" instead of R's traditional `data.frame`. Tibbles _are_ data frames, but they tweak some older behaviours to make life a little easier. R is an old language, and some things that were useful 10 or 20 years ago now get in your way. It's difficult to change base R without breaking existing code, so most innovation occurs in packages. Here we will describe the __tibble__ package, which provides opinionated data frames that make working in the tidyverse a little easier.
 In most places, I'll use the term tibble and data frame interchangeably; when I want to draw particular attention to R's build-in data frame, I'll call them `data.frame`s.
 
 If this chapter leaves you wanting to learn more about tibbles, you might enjoy `vignette("tibble")`.
@@ -60,9 +60,9 @@ tribble(
 
 I often add a comment (the line starting with `#`), to make it really clear where the header is.
 
-## Tibbles vs. data frames
+## Tibbles vs. data.frame
 
-There are two main differences in the usage of a data frame vs a tibble: printing and subsetting.
+There are two main differences in the usage of a tibble vs. a classic `data.frame`: printing and subsetting.
 
 ### Printing
 
@@ -125,18 +125,16 @@ df[[1]]
 
 To use these in a pipe, you'll need to use the special placeholder `.`:
 
-```{r, include = FALSE}
-library(magrittr)
-```
-
 ```{r}
 df %>% .$x
 df %>% .[["x"]]
 ```
 
+Compared to a `data.frame`, tibbles are more strict: they never do partial matching, and they will generate a warning if the column you are trying to access does not exist.
+
 ## Interacting with older code
 
-Some older functions don't work with tibbles. If you encounter one of these functions, use `as.data.frame()` to turn a tibble back to a data frame:
+Some older functions don't work with tibbles. If you encounter one of these functions, use `as.data.frame()` to turn a tibble back to a `data.frame`:
 
 ```{r}
 class(as.data.frame(tb))
@@ -149,6 +147,17 @@ The main reason that some older functions don't work with tibble is the `[` func
 1. How can you tell if an object is a tibble? (Hint: try printing `mtcars`,
     which is a regular data frame).
 
+1. Compare and contrast the following operations on a `data.frame` and
+    equivalent tibble. What is different? Why might the default data frame
+    behaviours cause you frustration?
+
+    ```{r, eval = FALSE}
+    df <- data.frame(abc = 1, xyz = "a")
+    df$x
+    df[, "xyz"]
+    df[, c("abc", "xyz")]
+    ```
+
 1. Practice referring to non-syntactic names by:
 
     1. Plotting a scatterplot of `1` vs `2`.