Tibble is always attached now
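
A minimal sketch of why the `tibble::` prefix can be dropped, assuming each chapter runs `library(tidyverse)` in its setup and that this attaches tibble (as the commit title states). The objects `a` and `b` are illustrative only and do not appear in the book:

```r
library(tidyverse)

# Assumption: tibble is attached by library(tidyverse),
# so the unqualified call resolves to tibble::tibble()
a <- tibble(x = 1:3, y = c("a", "b", "c"))
b <- tibble::tibble(x = 1:3, y = c("a", "b", "c"))

# Both spellings build the same object
identical(a, b)
```

The namespaced form remains valid; the change only removes redundancy in chapters where the package is already attached.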

hadley 2016-10-03 14:10:05 -05:00
parent c989bae1a6
commit 30509621cf
10 changed files with 32 additions and 32 deletions

View File

@ -548,7 +548,7 @@ There are a few other general strategies to help you parse files:
frame.
```{r}
df <- tibble::tibble(
df <- tibble(
x = c("1", "2", "3"),
y = c("1.21", "2.32", "4.56")
)

View File

@ -31,7 +31,7 @@ library(tidyverse)
Imagine we have this simple tibble:
```{r}
df <- tibble::tibble(
df <- tibble(
a = rnorm(10),
b = rnorm(10),
c = rnorm(10),
@ -172,7 +172,7 @@ There are four variations on the basic theme of the for loop:
Sometimes you want to use a for loop to modify an existing object. For example, remember our challenge from [functions]. We wanted to rescale every column in a data frame:
```{r}
df <- tibble::tibble(
df <- tibble(
a = rnorm(10),
b = rnorm(10),
c = rnorm(10),
@ -373,7 +373,7 @@ For loops are not as important in R as they are in other languages because R is
To see why this is important, consider (again) this simple data frame:
```{r}
df <- tibble::tibble(
df <- tibble(
a = rnorm(10),
b = rnorm(10),
c = rnorm(10),
@ -781,7 +781,7 @@ knitr::include_graphics("diagrams/lists-pmap-named.png")
Since the arguments are all the same length, it makes sense to store them in a data frame:
```{r}
params <- tibble::tribble(
params <- tribble(
~mean, ~sd, ~n,
5, 1, 1,
10, 5, 3,
@ -818,7 +818,7 @@ knitr::include_graphics("diagrams/lists-invoke.png")
The first argument is a list of functions or character vector of function names. The second argument is a list of lists giving the arguments that vary for each function. The subsequent arguments are passed on to every function.
And again, you can use `tibble::tribble()` to make creating these matching pairs a little easier:
And again, you can use `tribble()` to make creating these matching pairs a little easier:
```{r, eval = FALSE}
sim <- tribble(
@ -918,12 +918,12 @@ Sometimes you have a complex list that you want to reduce to a simple list by re
```{r}
dfs <- list(
age = tibble::tibble(name = "John", age = 30),
sex = tibble::tibble(name = c("John", "Mary"), sex = c("M", "F")),
trt = tibble::tibble(name = "Mary", treatment = "A")
age = tibble(name = "John", age = 30),
sex = tibble(name = c("John", "Mary"), sex = c("M", "F")),
trt = tibble(name = "Mary", treatment = "A")
)
dfs %>% reduce(dplyr::full_join)
dfs %>% reduce(full_join)
```
Or maybe you have a list of vectors, and want to find the intersection:
@ -970,7 +970,7 @@ x %>% accumulate(`+`)
But it has a number of bugs as illustrated with the following inputs:
```{r, eval = FALSE}
df <- tibble::tibble(
df <- tibble(
x = 1:3,
y = 3:1,
z = c("a", "b", "c")

View File

@ -225,7 +225,7 @@ Both the bootstrap and cross-validation are built on top of a "resample" object.
These functions return an object of class "resample", which represents the resample in a memory-efficient way. Instead of storing the resampled dataset itself, it stores the integer indices and a "pointer" to the original dataset. This makes resamples take up much less memory.
```{r}
x <- resample_bootstrap(tibble::as_tibble(mtcars))
x <- resample_bootstrap(as_tibble(mtcars))
class(x)
x
@ -268,7 +268,7 @@ When you start dealing with many models, it's helpful to have some rough way of
One way to capture the quality of the model is to summarise the distribution of the residuals. For example, you could look at the quantiles of the absolute residuals. For this dataset, 25% of predictions are less than \$7,400 away, and 75% are less than \$25,800 away. That seems like quite a bit of error when predicting someone's income!
```{r}
heights <- tibble::as_tibble(readRDS("data/heights.RDS"))
heights <- as_tibble(readRDS("data/heights.RDS"))
h <- lm(income ~ height, data = heights)
h

View File

@ -324,7 +324,7 @@ You've seen formulas before when using `facet_wrap()` and `facet_grid()`. In R,
The majority of modelling functions in R use a standard conversion from formulas to functions. You've seen one simple conversion already: `y ~ x` is translated to `y = a_1 + a_2 * x`. If you want to see what R actually does, you can use the `model_matrix()` function. It takes a data frame and a formula and returns a tibble that defines the model equation: each column in the output is associated with one coefficient in the model, and the function is always `y = a_1 * out_1 + a_2 * out_2`. For the simplest case of `y ~ x1` this shows us something interesting:
```{r}
df <- tibble::tribble(
df <- tribble(
~y, ~x1, ~x2,
4, 2, 5,
5, 1, 6
@ -353,7 +353,7 @@ The following sections expand on how this formula notation works for categorical
Generating a function from a formula is straightforward when the predictor is continuous, but things get a bit more complicated when the predictor is categorical. Imagine you have a formula like `y ~ sex`, where sex could either be male or female. It doesn't make sense to convert that to a formula like `y = x_0 + x_1 * sex` because `sex` isn't a number - you can't multiply it! Instead, R converts it to `y = x_0 + x_1 * sex_male`, where `sex_male` is one if `sex` is male and zero otherwise:
```{r, echo = FALSE}
df <- tibble::tribble(
df <- tribble(
~ sex, ~ response,
"male", 1,
"female", 2,
@ -665,7 +665,7 @@ sim6 %>%
Missing values obviously cannot convey any information about the relationship between the variables, so modelling functions will drop any rows that contain missing values. R's default behaviour is to silently drop them, but `options(na.action = na.warn)` (run in the prerequisites) makes sure you get a warning.
```{r}
df <- tibble::frame_data(
df <- tribble(
~x, ~y,
1, 2.2,
2, NA,

View File

@ -368,7 +368,7 @@ df %>%
Another example of this pattern is using `map()`, `map2()`, and `pmap()` from purrr. For example, we could take the final example from [Invoking different functions] and rewrite it to use `mutate()`:
```{r}
sim <- tibble::tribble(
sim <- tribble(
~f, ~params,
"runif", list(min = -1, max = 1),
"rnorm", list(sd = 5),
@ -420,7 +420,7 @@ x <- list(
c = 5:6
)
df <- tibble::enframe(x)
df <- enframe(x)
df
```

View File

@ -352,7 +352,7 @@ rmarkdown::render("fuel-economy.Rmd", params = list(my_class = "suv"))
This is particularly powerful in conjunction with `purrr::pwalk()`. The following example creates a report for each value of `class` found in `mpg`.
```{r, eval = FALSE}
reports <- tibble::tibble(
reports <- tibble(
class = unique(mpg$class),
filename = stringr::str_c("fuel-economy-", class, ".html"),
params = purrr::map(class, ~ list(my_class = .))

View File

@ -205,7 +205,7 @@ As you might have guessed from the common `key` and `value` arguments, `spread()
Carefully consider the following example:
```{r, eval = FALSE}
stocks <- tibble::tibble(
stocks <- tibble(
year = c(2015, 2015, 2016, 2016),
half = c( 1, 2, 1, 2),
return = c(1.88, 0.59, 0.92, 0.17)
@ -231,7 +231,7 @@ As you might have guessed from the common `key` and `value` arguments, `spread()
the problem?
```{r}
people <- tibble::tribble(
people <- tribble(
~name, ~key, ~value,
#-----------------|--------|------
"Phillip Woods", "age", 45,
@ -246,7 +246,7 @@ As you might have guessed from the common `key` and `value` arguments, `spread()
What are the variables?
```{r}
preg <- tibble::tribble(
preg <- tribble(
~pregnant, ~male, ~female,
"yes", NA, 10,
"no", 20, 12
@ -329,10 +329,10 @@ table5 %>%
Experiment with the various options for the following two toy datasets.
```{r, eval = FALSE}
tibble::tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
separate(x, c("one", "two", "three"))
tibble::tibble(x = c("a,b,c", "d,e", "f,g,i")) %>%
tibble(x = c("a,b,c", "d,e", "f,g,i")) %>%
separate(x, c("one", "two", "three"))
```
@ -352,7 +352,7 @@ Changing the representation of a dataset brings up an important subtlety of miss
Let's illustrate this idea with a very simple data set:
```{r}
stocks <- tibble::tibble(
stocks <- tibble(
year = c(2015, 2015, 2015, 2015, 2016, 2016, 2016),
qtr = c( 1, 2, 3, 4, 2, 3, 4),
return = c(1.88, 0.59, 0.35, NA, 0.92, 0.17, 2.66)
@ -396,7 +396,7 @@ stocks %>%
There's one other important tool that you should know for working with missing values. Sometimes when a data source has primarily been used for data entry, missing values indicate that the previous value should be carried forward:
```{r}
treatment <- tibble::tribble(
treatment <- tribble(
~ person, ~ treatment, ~response,
"Derrick Whitmore", 1, 7,
NA, 2, 10,

View File

@ -625,7 +625,7 @@ When I plot the skill of the batter (measured by the batting average, `ba`) agai
```{r}
# Convert to a tibble so it prints nicely
batting <- tibble::as_tibble(Lahman::Batting)
batting <- as_tibble(Lahman::Batting)
batters <- batting %>%
group_by(playerID) %>%

View File

@ -268,11 +268,11 @@ Here, R will expand the shortest vector to the same length as the longest, so ca
While vector recycling can be used to create very succinct, clever code, it can also silently conceal problems. For this reason, the vectorised functions in tidyverse will throw errors when you recycle anything other than a scalar. If you do want to recycle, you'll need to do it yourself with `rep()`:
```{r, error = TRUE}
tibble::tibble(x = 1:4, y = 1:2)
tibble(x = 1:4, y = 1:2)
tibble::tibble(x = 1:4, y = rep(1:2, 2))
tibble(x = 1:4, y = rep(1:2, 2))
tibble::tibble(x = 1:4, y = rep(1:2, each = 2))
tibble(x = 1:4, y = rep(1:2, each = 2))
```
### Naming vectors
@ -286,7 +286,7 @@ c(x = 1, y = 2, z = 4)
Or after the fact with `purrr::set_names()`:
```{r}
purrr::set_names(1:3, c("a", "b", "c"))
set_names(1:3, c("a", "b", "c"))
```
Named vectors are most useful for subsetting, described next.

View File

@ -498,7 +498,7 @@ Stats are the most subtle part of plotting because you can't see them directly.
me map the height of the bars to the raw values of a $y$ variable.
```{r}
demo <- tibble::tibble(
demo <- tibble(
a = c("bar_1", "bar_2", "bar_3"),
b = c(20, 30, 40)
)