Add exercises. Update adverbs

This commit is contained in:
hadley 2015-11-20 06:18:36 +13:00
parent 269867d60c
commit 445b1a0748
1 changed files with 35 additions and 6 deletions

@ -228,6 +228,24 @@ compute_summary(x, mean)
Instead of hardcoding the summary function, we allow it to vary by adding an additional argument that is a function. It can take a while to wrap your head around this, but it's a very powerful technique. This is one of the reasons that R is known as a "functional" programming language.
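As a rough sketch of the idea, here is one way `compute_summary()` might be written. Its definition is not shown in this excerpt, so the body below (and the toy data frame `df`) is an assumption made for illustration only:

```{r}
# Hypothetical reconstruction: the real definition is not part of this excerpt.
# `fun` is any function that reduces a vector to a single number.
compute_summary <- function(df, fun) {
  out <- vector("numeric", length(df))  # preallocate one slot per column
  for (i in seq_along(df)) {
    out[[i]] <- fun(df[[i]])
  }
  out
}

# Toy data, invented for this example
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 60))

compute_summary(df, mean)    # the same loop, summarised with mean()
compute_summary(df, median)  # ... and with median(), with no new loop to write
```

The loop is written once; only the function passed as `fun` changes.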
### Exercises
1. Read the documentation for `apply()`. In the 2d case, what two for loops
   does it generalise?
1. It's common to see for loops that don't preallocate the output and instead
   increase the length of a vector at each step:
   ```{r}
   results <- vector("integer", 0)
   for (i in seq_along(x)) {
     # Grow the vector by one element on every iteration
     results <- c(results, length(x[[i]]))
   }
   results
   ```
   How does this impact performance?
## The map functions
This pattern of looping over a list and doing something to each element is so common that the purrr package provides a family of functions to do it for you. Each function always returns the same type of output, so there are six variations based on what sort of result you want:
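A minimal sketch of what "always returns the same type of output" means in practice, using four members of the family. The list `x` is invented for this example, and the code assumes purrr is installed:

```{r}
library(purrr)

# Toy input, invented for this example
x <- list(a = 1:3, b = c(2.5, 3.5), c = 10)

map(x, length)          # always a list
map_int(x, length)      # always an integer vector
map_dbl(x, mean)        # always a double vector
map_lgl(x, is.integer)  # always a logical vector
```

If a result cannot be stored in the promised type, the typed variants throw an error instead of silently returning something else.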
@ -304,14 +322,25 @@ If you're familiar with the apply family of functions in base R, you might have
`map_lgl(df, is.numeric)`. One advantage to `vapply()` over the map
functions is that it can also produce matrices.
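
A small base R sketch of that last point, using a toy data frame invented for this example. When the function returns more than one value per element, `vapply()` binds the results into a matrix:

```{r}
# Toy data, invented for this example
df <- data.frame(a = c(1, 5, 3), b = c(10, 20, 60))

# FUN.VALUE = numeric(2) declares that each call returns two doubles,
# so the result is a matrix with one column per column of df.
ranges <- vapply(df, range, numeric(2))
ranges
```

Here each column of `ranges` holds the minimum and maximum of the matching column of `df`.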
### Exercises
1. How can you determine which columns in a data frame are factors?
   (Hint: data frames are lists.)
1. What happens when you use the map functions on vectors that aren't lists?
   What does `map(1:5, runif)` do? Why?
1. What does `map(-2:2, rnorm, n = 5)` do? Why?
## Pipelines
`map()` is particularly useful when constructing more complex transformations because it both inputs and outputs a list. That makes it well suited for solving a problem a piece at a time. For example, imagine you want to fit a linear model to each individual in a dataset.
`map()` is particularly useful when constructing more complex transformations because it both inputs and outputs a list. That makes it well suited for solving a problem a piece at a time.
Let's start by working through the whole process on the complete dataset. It's always a good idea to start simple (with a single object), and figure out the basic workflow. Then you can generalise up to the harder problem of applying the same steps to multiple models.
TODO: find interesting dataset
For example, imagine you want to fit a linear model to each individual in a dataset. Let's start by working through the whole process on the complete dataset. It's always a good idea to start simple (with a single object), and figure out the basic workflow. Then you can generalise up to the harder problem of applying the same steps to multiple models.
You could start by creating a list where each element is a data frame for a different person:
```{r}
@ -407,12 +436,12 @@ Other predicate functionals: `head_while()`, `tail_while()`, `some()`, `every()`
When you start doing many operations with purrr, you'll soon discover that not everything always succeeds. For example, you might be fitting a bunch of more complicated models, and not every model will converge. How do you ensure that one bad apple doesn't ruin the whole barrel?
Dealing with errors is fundamentally painful because errors are sort of a side-channel to the way that functions usually return values. The best way to handle them is to turn them into a regular output with the `safe()` function. This function is similar to the `try()` function in base R, but instead of sometimes returning the original output and sometimes returning a error, `safe()` always returns the same type of object: a list with elements `result` and `error`. For any given run, one will always be `NULL`, but because the structure is always the same its easier to deal with.
Dealing with errors is fundamentally painful because errors are sort of a side-channel to the way that functions usually return values. The best way to handle them is to turn them into a regular output with the `safely()` function. This function is similar to the `try()` function in base R, but instead of sometimes returning the original output and sometimes returning an error, `safely()` always returns the same type of object: a list with elements `result` and `error`. For any given run, one will always be `NULL`, but because the structure is always the same it's easier to deal with.
Let's illustrate this with a simple example, `log()`:
```{r}
safe_log <- safe(log)
safe_log <- safely(log)
str(safe_log(10))
str(safe_log("a"))
```
@ -459,10 +488,10 @@ dplyr::filter(all, is_ok)
Other related functions:
* `maybe()`: if you don't care about the error message, and instead
* `possibly()`: if you don't care about the error message, and instead
just want a default value on failure.
* `outputs()`: does a similar job but for other outputs like printed
* `quietly()`: does a similar job but for other outputs like printed
output, messages, and warnings.
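
A brief sketch of `possibly()` and `quietly()` in use, assuming purrr is installed. The wrapper names `possible_log` and `quiet_sqrt` are made up for this example:

```{r}
library(purrr)

# possibly() replaces any error with the default given in `otherwise`
possible_log <- possibly(log, otherwise = NA_real_)
possible_log(10)
possible_log("a")   # log("a") would throw an error; here we get NA instead

# quietly() captures printed output, messages, and warnings
# alongside the result instead of emitting them
quiet_sqrt <- quietly(sqrt)
str(quiet_sqrt(-1))
```

Like `safely()`, both wrap a function and return a modified version of it, so they can be passed straight to the map functions.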
Challenge: read all the csv files in this directory. Which ones failed