Few more style notes

2022-02-22 17:49:31 -06:00 · 2022-02-22 17:49:31 -06:00 · 6560aa1d56
parent 1029045076
commit 6560aa1d56
1 changed files with 41 additions and 7 deletions
--- a/workflow-style.Rmd
+++ b/workflow-style.Rmd
@ -28,6 +28,11 @@ Figure \@ref(fig:styler) shows the results.
 knitr::include_graphics("screenshots/rstudio-palette.png")
 ```

+```{r setup}
+library(tidyverse)
+library(nycflights13)
+```
+
 ## Names

 Variable names (those created by `<-` and those created by `mutate()`) should use only lowercase letters, numbers, and `_`.
@ -35,7 +40,7 @@ Use underscores (`_`) to separate words within a name.

 ```{r, eval = FALSE}
 # Strive for:
-short_flights <- flights |> filter(airtime < 60)
+short_flights <- flights |> filter(air_time < 60)

 # Avoid:

@ -44,9 +49,13 @@ short_flights <- flights |> filter(airtime < 60)
 As a general rule of thumb, it's better to prefer long, descriptive names that are easy to understand, rather than concise names that are fast to type.
 Short names save relatively little time when writing code (especially since autocomplete will help you finish typing them), but can be expensive when you come back to old code and need to puzzle out what a cryptic abbreviation means.

+Strive for consistency with your names.
+
+If several names are variations on a theme, give them a common prefix rather than a common suffix, because autocomplete works best on the start of a name.
+
 ## Spaces

-Put spaces on either side of mathematical operators (e.g `+`, `-`, `==`, `<` ; but not `^`) and the assignment operator (`<-`).
+Put spaces on either side of mathematical operators apart from `^` (e.g. `+`, `-`, `==`, `<`), and around the assignment operator (`<-`).
 Don't put spaces inside or outside parentheses for regular function calls.
 Always put a space after a comma, just like in regular English.

@ -60,21 +69,24 @@ mean(x, na.rm = TRUE)
 mean (x ,na.rm=TRUE)
 ```

-It's OK to add extra spaces if it improves alignment of `=:`
+It's OK to add extra spaces if it improves alignment.
+For example, if you're creating multiple variables in `mutate()`, you might want to add spaces so that all the `=` line up.
+This is particularly useful in `case_when()` as it makes it easier to skim the conditions and the values.

 ```{r, eval = FALSE}
 flights |> 
  mutate(
    speed      = distance / air_time,
    dep_hour   = dep_time %/% 100,
-    dep_minute = dep_time %% 100
+    dep_minute = dep_time %%  100
  )
 ```

 ## Pipes

-`|>` should always have a space after it and should usually be followed by a new line.
+`|>` should always have a space before it and should always be followed by a new line or space (usually a new line).
 After the first step, each line should be indented by two spaces.
+
 If the function has named arguments (like `mutate()` or `summarise()`) then put each argument on a new line, indented by another two spaces.
 Make sure the closing parentheses start a new line and are lined up with the start of the function name.

@ -116,12 +128,34 @@ The same basic rules apply to ggplot2, just treat `+` the same way as `|>`.
 ```{r, eval = FALSE}
 flights |> 
  group_by(month) |> 
-  summarise(delay = mean(arr_delay, na.rm = TRUE)) |> 
+  summarise(
+    delay = mean(arr_delay, na.rm = TRUE)
+  ) |> 
  ggplot(aes(month, delay)) +
  geom_point() + 
  geom_line()
 ```

+If you can't fit all of the arguments to a function on a single line, put each argument on its own line.
+
+```{r, eval = FALSE}
+flights |> 
+  group_by(dest) |> 
+  summarise(
+    distance = mean(distance),
+    speed = mean(distance / air_time, na.rm = TRUE)
+  ) |> 
+  ggplot(aes(distance, speed)) +
+  geom_smooth(
+    method = "loess",
+    span = 0.5,
+    se = FALSE, 
+    colour = "white", 
+    size = 4
+  ) +
+  geom_point()
+```
+
 Be wary of writing very long pipes, say longer than 10-15 lines.
 Try to break them up into smaller sub-tasks, giving each task an informative name.
 The names will help cue the reader in to what's happening and make it easier to check that intermediate results are as expected.
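
As a rough illustration of the advice on long pipes, the sketch below splits one analysis into three named steps. The object names (`completed_flights`, `delay_by_dest`, `busy_dest_delays`) and the cutoff of 100 flights are hypothetical choices made for this example and are not part of the commit.

```r
library(dplyr)
library(nycflights13)

# Step 1: keep only flights with recorded departure and arrival delays.
completed_flights <- flights |>
  filter(!is.na(dep_delay), !is.na(arr_delay))

# Step 2: summarise arrival delay per destination.
delay_by_dest <- completed_flights |>
  group_by(dest) |>
  summarise(
    n_flights = n(),
    avg_delay = mean(arr_delay)
  )

# Step 3: keep destinations with enough flights for a stable average
# (the cutoff of 100 is an arbitrary, illustrative threshold).
busy_dest_delays <- delay_by_dest |>
  filter(n_flights >= 100) |>
  arrange(desc(avg_delay))
```

Each intermediate object can be printed and checked on its own before the next step runs, which is harder to do in the middle of a single long pipe.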
@ -131,7 +165,7 @@ This means breaking up long pipelines if there are intermediate states that can

 ## Organisation

-Where possible, use comments to explain the "why" of your code, not the "how" or the "what".
+Use comments to explain the "why" of your code, not the "how" or the "what".
 If you simply describe what your code is doing in prose, you'll have to be careful to update the comment and code in tandem: if you change the code and forget to update the comment, they'll be inconsistent, which will lead to confusion when you come back to your code in the future.
 For data analysis code, use comments to explain your overall plan of attack and record important insights as you encounter them.
 There's no way to re-capture this knowledge from the code itself.
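
A small sketch of the difference between a "what" comment and a "why" comment. The name `flown_flights` is hypothetical, and the reading of a missing `air_time` as a flight that was cancelled or otherwise never completed is an assumption made for this example.

```r
library(dplyr)
library(nycflights13)

# Avoid: the comment only restates what the code already says.
# Filter to rows where air_time is not missing.
flights |>
  filter(!is.na(air_time))

# Strive for: the comment records the reasoning behind the step.
# Rows with a missing air_time are assumed to be flights that never
# completed; drop them so they don't turn downstream averages into NA.
flown_flights <- flights |>
  filter(!is.na(air_time))
```

The second comment stays accurate even if the filtering code is later rewritten, because it describes the intent and not the mechanics.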