Function polishing
parent
765d1c8191
commit
8078a9c0f7
|
@ -4,7 +4,7 @@
|
||||||
#| results: "asis"
|
#| results: "asis"
|
||||||
#| echo: false
|
#| echo: false
|
||||||
source("_common.R")
|
source("_common.R")
|
||||||
status("drafting")
|
status("polishing")
|
||||||
```
|
```
|
||||||
|
|
||||||
## Introduction
|
## Introduction
|
||||||
|
@ -597,8 +597,6 @@ diamonds |> count_wide(c(clarity, color), cut)
|
||||||
|
|
||||||
While our examples have mostly focused on dplyr, the tidy evaluation also underpins tidyr, and if you look at the `pivot_wider()` docs you can see that `names_from` uses tidy-selection.
|
While our examples have mostly focused on dplyr, the tidy evaluation also underpins tidyr, and if you look at the `pivot_wider()` docs you can see that `names_from` uses tidy-selection.
|
||||||
|
|
||||||
### Learning more
|
|
||||||
|
|
||||||
### Exercises
|
### Exercises
|
||||||
|
|
||||||
## Plot functions
|
## Plot functions
|
||||||
|
@ -752,36 +750,32 @@ The only advantage of this syntax is that `vars()` uses tidy evaluation so you c
|
||||||
```{r}
|
```{r}
|
||||||
# https://twitter.com/sharoz/status/1574376332821204999
|
# https://twitter.com/sharoz/status/1574376332821204999
|
||||||
|
|
||||||
# Facetting is fiddly - have to use special vars syntax.
|
|
||||||
foo <- function(x) {
|
foo <- function(x) {
|
||||||
ggplot(mtcars) +
|
ggplot(mtcars, aes(mpg, disp)) +
|
||||||
aes(x = mpg, y = disp) +
|
|
||||||
geom_point() +
|
geom_point() +
|
||||||
facet_wrap(vars({{ x }}))
|
facet_wrap(vars({{ x }}))
|
||||||
}
|
}
|
||||||
foo(cyl)
|
foo(cyl)
|
||||||
```
|
```
|
||||||
|
|
||||||
As with data frame functions, it can also be useful to make your plotting functions tightly coupled to a specific dataset, or even a specific variable.
|
As with data frame functions, it can be useful to make your plotting functions tightly coupled to a specific dataset, or even a specific variable.
|
||||||
The following function makes it particularly easy to interactively explore the conditional distribution `bill_length_mm` from palmerpenguins dataset.
|
For example, the following function makes it particularly easy to interactively explore the conditional distribution `bill_length_mm` from palmerpenguins dataset.
|
||||||
|
|
||||||
```{r}
|
```{r}
|
||||||
# https://twitter.com/yutannihilat_en/status/1574387230025875457
|
# https://twitter.com/yutannihilat_en/status/1574387230025875457
|
||||||
density <- function(fill, facets) {
|
density <- function(colour, facets, binwidth = 0.1) {
|
||||||
palmerpenguins::penguins |>
|
diamonds |>
|
||||||
ggplot(aes(bill_length_mm, fill = {{ fill }})) +
|
ggplot(aes(carat, after_stat(density), colour = {{ colour }})) +
|
||||||
geom_density(alpha = 0.5) +
|
geom_freqpoly(binwidth = binwidth) +
|
||||||
facet_wrap(vars({{ facets }}))
|
facet_wrap(vars({{ facets }}))
|
||||||
}
|
}
|
||||||
|
|
||||||
density()
|
density()
|
||||||
density(species)
|
density(cut)
|
||||||
density(island, sex)
|
density(cut, clarity)
|
||||||
```
|
```
|
||||||
|
|
||||||
Also note that we hardcoded the `x` variable but allowed the fill to vary.
|
### Labeling
|
||||||
|
|
||||||
### Labelling
|
|
||||||
|
|
||||||
Remember the histogram function we showed you earlier?
|
Remember the histogram function we showed you earlier?
|
||||||
|
|
||||||
|
@ -794,12 +788,12 @@ histogram <- function(df, var, binwidth = NULL) {
|
||||||
```
|
```
|
||||||
|
|
||||||
Wouldn't it be nice if we could label the output with the variable and the bin width that was used?
|
Wouldn't it be nice if we could label the output with the variable and the bin width that was used?
|
||||||
To do so, we're going to have to go under the covers of tidy evaluation and use a function from a new package: rlang.
|
To do so, we're going to have to go under the covers of tidy evaluation and use a function from package we haven't talked about before: rlang.
|
||||||
rlang is a low-level package that's used by just about every other package in the tidyverse because it implements tidy evaluation (and provided many other useful tools).
|
rlang is a low-level package that's used by just about every other package in the tidyverse because it implements tidy evaluation (as well as many other useful tools).
|
||||||
|
|
||||||
To solve the labelling problem we can use `rlang::englue()`.
|
To solve the labeling problem we can use `rlang::englue()`.
|
||||||
This works similarly to `str_glue()`, so any value wrapped in `{ }` will be inserted into the string.
|
This works similarly to `str_glue()`, so any value wrapped in `{ }` will be inserted into the string.
|
||||||
But unlike `str_glue()`, it also understands `{{ }}`, which automatically insert the appropriate variable name.
|
But it also understands `{{ }}`, which automatically insert the appropriate variable name:
|
||||||
|
|
||||||
```{r}
|
```{r}
|
||||||
histogram <- function(df, var, binwidth) {
|
histogram <- function(df, var, binwidth) {
|
||||||
|
@ -814,27 +808,19 @@ histogram <- function(df, var, binwidth) {
|
||||||
diamonds |> histogram(carat, 0.1)
|
diamonds |> histogram(carat, 0.1)
|
||||||
```
|
```
|
||||||
|
|
||||||
(Note that if you omit the `binwidth` the function fails with a weird error. That appears to be a bug in `englue()`: https://github.com/r-lib/rlang/issues/1492.
|
|
||||||
Hopefully it'll be fixed soon!)
|
|
||||||
|
|
||||||
You can use the same approach any other place that you might supply a string in a ggplot2 plot.
|
You can use the same approach any other place that you might supply a string in a ggplot2 plot.
|
||||||
|
|
||||||
### Exercises
|
### Exercises
|
||||||
|
|
||||||
## Style
|
## Style
|
||||||
|
|
||||||
It's important to remember that functions are not just for the computer, but are also for humans.
|
R doesn't care what your function or arguments are called but the names make a big difference for humans.
|
||||||
R doesn't care what your function is called, or what comments it contains, but these are important for human readers.
|
|
||||||
This section discusses some things that you should bear in mind when writing functions that humans can understand.
|
|
||||||
|
|
||||||
The name of a function is important.
|
|
||||||
Ideally, the name of your function will be short, but clearly evoke what the function does.
|
Ideally, the name of your function will be short, but clearly evoke what the function does.
|
||||||
That's hard!
|
That's hard!
|
||||||
But it's better to be clear than short, as RStudio's autocomplete makes it easy to type long names.
|
But it's better to be clear than short, as RStudio's autocomplete makes it easy to type long names.
|
||||||
|
|
||||||
Generally, function names should be verbs, and arguments should be nouns.
|
Generally, function names should be verbs, and arguments should be nouns.
|
||||||
There are some exceptions: nouns are ok if the function computes a very well known noun (i.e. `mean()` is better than `compute_mean()`), or accessing some property of an object (i.e. `coef()` is better than `get_coefficients()`).
|
There are some exceptions: nouns are ok if the function computes a very well known noun (i.e. `mean()` is better than `compute_mean()`), or accessing some property of an object (i.e. `coef()` is better than `get_coefficients()`).
|
||||||
A good sign that a noun might be a better choice is if you're using a very broad verb like "get", "compute", "calculate", or "determine".
|
|
||||||
Use your best judgement and don't be afraid to rename a function if you figure out a better name later.
|
Use your best judgement and don't be afraid to rename a function if you figure out a better name later.
|
||||||
|
|
||||||
```{r}
|
```{r}
|
||||||
|
@ -851,8 +837,9 @@ impute_missing()
|
||||||
collapse_years()
|
collapse_years()
|
||||||
```
|
```
|
||||||
|
|
||||||
In terms of white space, continue to follow the rules from @sec-workflow-style.
|
R also doesn't care about how you use white space in your functions but future readers will.
|
||||||
Additionally, `function` should always be followed by squiggly brackets (`{}`), and the contents should be indented by an additional two spaces.
|
Continue to follow the rules from @sec-workflow-style.
|
||||||
|
Additionally, `function()` should always be followed by squiggly brackets (`{}`), and the contents should be indented by an additional two spaces.
|
||||||
This makes it easier to see the hierarchy in your code by skimming the left-hand margin.
|
This makes it easier to see the hierarchy in your code by skimming the left-hand margin.
|
||||||
|
|
||||||
```{r}
|
```{r}
|
||||||
|
@ -874,10 +861,8 @@ pull_unique <- function(df, var) {
|
||||||
pull_unique <- function(df, var) df |> distinct({{ var }}) |> pull({{ var }})
|
pull_unique <- function(df, var) df |> distinct({{ var }}) |> pull({{ var }})
|
||||||
```
|
```
|
||||||
|
|
||||||
As you can see from the example we recommend putting extra spaces inside of `{{ }}`.
|
As you can see we recommend putting extra spaces inside of `{{ }}`.
|
||||||
This makes it super obvious that something unusual is happening.
|
This makes it very obvious that something unusual is happening.
|
||||||
|
|
||||||
Learn more at <https://style.tidyverse.org/functions.html>
|
|
||||||
|
|
||||||
### Exercises
|
### Exercises
|
||||||
|
|
||||||
|
@ -902,14 +887,12 @@ Learn more at <https://style.tidyverse.org/functions.html>
|
||||||
In this chapter you learned how to write functions for three useful scenarios: creating a vector, creating a data frames, or creating a plot.
|
In this chapter you learned how to write functions for three useful scenarios: creating a vector, creating a data frames, or creating a plot.
|
||||||
Along the way your saw many examples, which hopefully started to get your creative juices flowing, and gave you some ideas for where functions might help your analysis code.
|
Along the way your saw many examples, which hopefully started to get your creative juices flowing, and gave you some ideas for where functions might help your analysis code.
|
||||||
|
|
||||||
You also learned a little about tidy evaluation so you could wrap functions from dplyr, tidyr, and ggplot2.
|
We have only shown you the bare minimum to get started with functions and there's much more to learn.
|
||||||
Tidy evaluation is a key component of the tidyverse because it allows you to write `diamonds |> filter(x == y)` and `filter()` knows to use `x` and `y` from the diamonds dataset.
|
A few places to learn more are:
|
||||||
The downside of tidy evaluation is that you need to learn a new technique for programming: embracing, `{{ x }}`.
|
|
||||||
Embracing already gives you considerable power to reduce duplication in your data analyses, but there are many more advanced techniques available, which you can learn more about it `vignette("programming", package = "dplyr")` and `vignette("programming", package = "tidyr")`.
|
|
||||||
|
|
||||||
Here we've focused on very simple plotting functions, the sort of functions that you might naturally extract from repeated code in your analyses.
|
- To learn more about programming with tidy evaluation, see useful recipes in `vignette("programming", package = "dplyr")` and `vignette("programming", package = "tidyr")` and learn more about the theory in <https://rlang.r-lib.org/reference/topic-data-mask.html>.
|
||||||
As you get better at programming and learn more about ggplot2, you'll be able create richer functions with greater flexibility.
|
- To learn more about reducing duplication in your ggplot2 code, read the [Programming with ggplot2](https://ggplot2-book.org/programming.html){.uri} chapter of the ggplot2 book.
|
||||||
The next place you might stop on your journey is the [Programming with ggplot2](https://ggplot2-book.org/programming.html){.uri} chapter of the ggplot2 book, where you'll learn other ways to reduce duplication in your plotting code.
|
- To learn more about good function style, read <https://style.tidyverse.org/functions.html>.
|
||||||
|
|
||||||
In the next chapter, we'll dive into some of the details of R's vector data structures that we've omitted so far.
|
In the next chapter, we'll dive into some of the details of R's vector data structures that we've omitted so far.
|
||||||
These are immediately useful by themselves, but are a necessary foundation for the following chapter on iteration that provides some amazingly powerful tools.
|
These are not immediately useful by themselves, but are a necessary foundation for the following chapter on iteration which gives you further tools for reducing code duplication.
|
||||||
|
|
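Note on the labeling hunk (@ -794): the diff describes `rlang::englue()` but elides the body of `histogram()`. The following is a minimal sketch of how such a function might be written, assuming ggplot2 and rlang (1.0.0 or later) are installed. The label wording is illustrative and is not taken from this commit.

```r
library(ggplot2)

# Sketch only: the label text below is an assumption, not the commit's wording.
histogram <- function(df, var, binwidth) {
  # {{ var }} inserts the name of the variable the caller supplied;
  # {binwidth} inserts the value, as it would with str_glue().
  label <- rlang::englue("A histogram of {{ var }} with binwidth {binwidth}")

  df |>
    ggplot(aes(x = {{ var }})) +
    geom_histogram(binwidth = binwidth) +
    labs(title = label)
}

p <- diamonds |> histogram(carat, 0.1)
```

Building the label once and passing it to `labs()` keeps the string handling separate from the plot layers, so the same pattern could be reused for axis labels or captions.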