Add exercises. Update adverbs
parent
269867d60c
commit
445b1a0748
41
lists.Rmd
@ -228,6 +228,24 @@ compute_summary(x, mean)

Instead of hardcoding the summary function, we allow it to vary by adding an additional argument that is a function. It can take a while to wrap your head around this, but it's a very powerful technique. This is one of the reasons that R is known as a "functional" programming language.
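
As a minimal sketch of the idea: the definition of `compute_summary()` lies outside this hunk, so the body below is a hypothetical reconstruction, and the data frame `x` is made up for illustration. The point is only that the summary function arrives as an ordinary argument:

```r
# Hypothetical reconstruction: apply `fun` to every column of `df`
# and collect the results in a preallocated numeric vector.
compute_summary <- function(df, fun) {
  out <- vector("numeric", length(df))
  for (i in seq_along(df)) {
    out[[i]] <- fun(df[[i]])
  }
  out
}

# Illustrative data, not from the original document
x <- data.frame(a = c(1, 2, 3), b = c(10, 20, 60))

# The same loop, two different summaries
means <- compute_summary(x, mean)
medians <- compute_summary(x, median)
```

Swapping `mean` for `median` changes the computation without touching the loop, which is the part that would otherwise be copied and pasted.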

### Exercises

1. Read the documentation for `apply()`. In the 2d case, what two for loops
   does it generalise?

1. It's common to see for loops that don't preallocate the output and instead
   increase the length of a vector at each step:

   ```{r}
   results <- vector("integer", 0)
   for (i in seq_along(x)) {
     # Append the current element; assumes each x[[i]] is an integer vector
     results <- c(results, x[[i]])
   }
   results
   ```

   How does this impact performance?

## The map functions

This pattern of looping over a list and doing something to each element is so common that the purrr package provides a family of functions to do it for you. Each function always returns the same type of output so there are six variations based on what sort of result you want:

@ -304,14 +322,25 @@ If you're familiar with the apply family of functions in base R, you might have
`map_lgl(df, is.numeric)`. One advantage to `vapply()` over the map
functions is that it can also produce matrices.
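
To make the matrix point concrete, here is a small sketch using only base R. The data frame `df` and its columns are invented for illustration. When the function returns a fixed-length vector longer than one, `vapply()` binds the results into a matrix with one column per input element:

```r
# Illustrative data frame, not from the original text
df <- data.frame(a = c(1, 5, 3), b = c(10, 2, 7))

# Length-1 results: a named logical vector, much like `map_lgl(df, is.numeric)`
is_num <- vapply(df, is.numeric, logical(1))

# Length-2 results: a 2 x 2 matrix, one column per column of `df`
ranges <- vapply(df, range, numeric(2))
```

The third argument (`logical(1)`, `numeric(2)`) is a template for each result; `vapply()` raises an error if any result does not match its type and length.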

### Exercises

1. How can you determine which columns in a data frame are factors?
   (Hint: data frames are lists.)

1. What happens when you use the map functions on vectors that aren't lists?
   What does `map(1:5, runif)` do? Why?

1. What does `map(-2:2, rnorm, n = 5)` do? Why?

## Pipelines

-`map()` is particularly useful when constructing more complex transformations because it both inputs and outputs a list. That makes it well suited for solving a problem a piece at a time. For example, imagine you want to fit a linear model to each individual in a dataset.
+`map()` is particularly useful when constructing more complex transformations because it both inputs and outputs a list. That makes it well suited for solving a problem a piece at a time.

-Let's start by working through the whole process on the complete dataset. It's always a good idea to start simple (with a single object), and figure out the basic workflow. Then you can generalise up to the harder problem of applying the same steps to multiple models.

TODO: find interesting dataset

+For example, imagine you want to fit a linear model to each individual in a dataset. Let's start by working through the whole process on the complete dataset. It's always a good idea to start simple (with a single object), and figure out the basic workflow. Then you can generalise up to the harder problem of applying the same steps to multiple models.

You could start by creating a list where each element is a data frame for a different person:

```{r}
@ -407,12 +436,12 @@ Other predicate functionals: `head_while()`, `tail_while()`, `some()`, `every()`

When you start doing many operations with purrr, you'll soon discover that not everything always succeeds. For example, you might be fitting a bunch of more complicated models, and not every model will converge. How do you ensure that one bad apple doesn't ruin the whole barrel?

-Dealing with errors is fundamentally painful because errors are sort of a side-channel to the way that functions usually return values. The best way to handle them is to turn them into a regular output with the `safe()` function. This function is similar to the `try()` function in base R, but instead of sometimes returning the original output and sometimes returning a error, `safe()` always returns the same type of object: a list with elements `result` and `error`. For any given run, one will always be `NULL`, but because the structure is always the same its easier to deal with.
+Dealing with errors is fundamentally painful because errors are sort of a side-channel to the way that functions usually return values. The best way to handle them is to turn them into a regular output with the `safely()` function. This function is similar to the `try()` function in base R, but instead of sometimes returning the original output and sometimes returning an error, a function wrapped with `safely()` always returns the same type of object: a list with elements `result` and `error`. For any given run, one of the two will always be `NULL`, but because the structure is always the same it's easier to deal with.

Let's illustrate this with a simple example, `log()`:

```{r}
-safe_log <- safe(log)
+safe_log <- safely(log)
str(safe_log(10))
str(safe_log("a"))
```
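
The "always the same structure" idea can be sketched in base R with `tryCatch()`. This is not purrr's implementation; `capture_errors()` is a hypothetical helper written only to show the shape of the returned list:

```r
# Hypothetical helper (an illustration, not purrr's code): wrap `f` so that
# it always returns list(result = ..., error = ...) and never throws.
capture_errors <- function(f) {
  function(...) {
    tryCatch(
      list(result = f(...), error = NULL),
      error = function(e) list(result = NULL, error = e)
    )
  }
}

wrapped_log <- capture_errors(log)

ok <- wrapped_log(10)    # success: `result` is filled, `error` is NULL
bad <- wrapped_log("a")  # failure: `result` is NULL, `error` holds the condition
```

Because both calls return a two-element list, downstream code can check `is.null(x$error)` instead of branching on the type of the return value.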
@ -459,10 +488,10 @@ dplyr::filter(all, is_ok)

Other related functions:

-* `maybe()`: if you don't care about the error message, and instead
+* `possibly()`: if you don't care about the error message, and instead
  just want a default value on failure.

-* `outputs()`: does a similar job but for other outputs like printed
+* `quietly()`: does a similar job but for other outputs like printed
  output, messages, and warnings.

Challenge: read all the csv files in this directory. Which ones failed