Style brain dump

Hadley Wickham 2022-02-16 15:51:33 -06:00
parent 1b2a1b4b35
commit fb11736b8c
1 changed files with 30 additions and 30 deletions


@ -5,25 +5,35 @@ status("drafting")
```
Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread.
Even as a very new programmer it's a good idea to work on your code style.
Using a consistent style makes it easier for others (including future-you!) to read your work, and is particularly important if you need to get help from someone else.
<https://style.tidyverse.org>, <https://style.tidyverse.org/pipes.html>
Styling your code will feel a bit tedious at the start, but if you practice it, it will soon become second nature.
Additionally, there are some great tools available like the [styler](http://styler.r-lib.org) package which can get you 90% of the way there with a touch of a button.
It's highly recommended to regularly spend some time just working on the clarity of your code.
The results might be exactly the same, but it's not wasted effort: when you come back to the code in the future, you'll find it easier to remember what you did and easier to adapt to new demands.
- Variables should use only lowercase letters, numbers, and `_`.
Here I'll introduce you to the high points of the [tidyverse style guide](https://style.tidyverse.org).
I highly recommend you consult the full style guide if you have more questions as it goes into much more detail.
- Variable names should use only lowercase letters, numbers, and `_`.
Use underscores (`_`), so-called snake case, to separate words within a name.
Better to use overly long names than overly short.
Autocomplete means that
- Do not put spaces inside or outside parentheses for regular function calls.
Always put a space after a comma, never before, just like in regular English.
As a general rule of thumb, it's better to err on the side of long, descriptive names than concise names that are fast to type.
Short names save relatively little time when writing code (especially since autocomplete will often help you finish a long variable name), but will suck up time when you re-read code in the future and have to rack your memory for what that now cryptic abbreviation means.
- Most operators ([`==`](https://rdrr.io/r/base/Comparison.html), [`+`](https://rdrr.io/r/base/Arithmetic.html), [`-`](https://rdrr.io/r/base/Arithmetic.html), [`<-`](https://rdrr.io/r/base/assignOps.html), etc.) should be surrounded by spaces; the chief exception is `^`.
- Put spaces on either side of mathematical operators (e.g. `+`, `-`, `==`, `<`; but not `^`) and the assignment operator (`<-`).
Don't put spaces inside or outside parentheses for regular function calls.
Always put a space after a comma, just like in regular English.
- `%>%` should always have a space before it, and should usually be followed by a new line.
It's ok to add extra spaces if it improves alignment of [`=`](https://rdrr.io/r/base/assignOps.html).
- `|>` should always have a space before it and should usually be followed by a new line.
After the first step, each line should be indented by two spaces.
This structure makes it easier to add new steps (or rearrange existing steps) and harder to overlook a step.
- In a pipeline, put each function on its own line.
And if the function has named arguments (`=`) then put each of those on a single line.
If the function has named arguments (like `mutate()` or `summarise()`) then put each argument on a new line, indented by another two spaces.
Make sure the closing parenthesis starts a new line and is lined up with the start of the function name.
```{r, eval = FALSE}
df |> mutate(y = x + 1)
@ -34,34 +44,24 @@ Good coding style is like correct punctuation: you can manage without it, butits
)
```
Line up the opening and closing parens.
The same basic rules apply to ggplot2, just treat `+` the same way as `|>`.
- It's ok to add extra spaces if it improves alignment of [`=`](https://rdrr.io/r/base/assignOps.html).
```{r}
df |>
  ggplot(aes())
```
- For very short snippets that fit on one line, it's ok to write (e.g.) `mutate(df, y = x + 1)` vs `df %>% mutate(y = x + 1)`.
But in my experience, short snippets often grow longer, so you'll save time in the long run by starting out how you wish to continue.
- It's ok to skip these rules if your snippet fits easily on one line, e.g. `mutate(df, y = x + 1)` or `df %>% mutate(y = x + 1)`.
But it's pretty common for short snippets to grow longer, so you'll save time in the long run by starting out as you wish to continue.
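The pipeline layout rules above can be sketched roughly as follows. This is only an illustration: `flights_small` and its columns are made up for the example, and it assumes the dplyr package is installed.

```{r}
library(dplyr)

# Hypothetical data, invented purely for illustration.
flights_small <- tibble(
  carrier = c("AA", "AA", "UA"),
  distance = c(100, 300, 200),
  air_time = c(30, 60, 50)
)

# One function per line, each step indented by two spaces.
# Named arguments go on their own lines, indented by another two spaces,
# and each closing parenthesis lines up with the start of the function name.
speed_by_carrier <- flights_small |>
  mutate(
    speed = distance / air_time
  ) |>
  group_by(carrier) |>
  summarise(
    mean_speed = mean(speed),
    n = n()
  )
```

Because every step sits on its own line, adding, removing, or reordering a step only touches whole lines.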
- Use empty lines to organize your code into "paragraphs" of related thoughts.
- with ggplot2
```{r, eval = FALSE}
df |>
  ggplot(aes())
```
Don't forget to switch to plus!
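A slightly fuller sketch of the same idea, using the built-in `mtcars` data and assuming dplyr and ggplot2 are installed. The filter step, the chosen variables, and the name `weight_vs_mpg` are arbitrary choices for illustration. Note where `|>` gives way to `+` once `ggplot()` has been called.

```{r}
library(dplyr)
library(ggplot2)

weight_vs_mpg <- mtcars |>
  filter(cyl != 8) |>                 # data steps are joined with |>
  ggplot(aes(x = wt, y = mpg)) +      # plot layers are joined with +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )
```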
- How long should your pipes be?
Too long vs too short.
If your pipes are longer than (say) ten steps, create intermediate objects with meaningful names.
That will make debugging easier, because you can more easily check the intermediate results, and it makes it easier to understand your code, because the variable names can help communicate intent.
- Be wary of writing very long pipes, say longer than 10-15 lines.
Try to break them up into logical subtasks, giving each part an informative name.
The names will help cue the reader in to what's happening and give convenient places to check that intermediate results are as expected.
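As a small sketch of splitting a pipe into named subtasks, the following uses the built-in `mtcars` data and assumes dplyr is installed. The object names `manual_cars` and `manual_mpg_by_cyl` are invented for the example, and a real pipeline would of course be longer before it is worth splitting.

```{r}
library(dplyr)

# Subtask 1: keep only cars with a manual transmission (am == 1).
manual_cars <- mtcars |>
  filter(am == 1)

# Subtask 2: summarise fuel economy by number of cylinders.
manual_mpg_by_cyl <- manual_cars |>
  group_by(cyl) |>
  summarise(
    mean_mpg = mean(mpg),
    n = n()
  )
```

Each intermediate object can be inspected on its own, for example with `nrow(manual_cars)`, before moving on to the next step.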
- Whenever you can give something an informative name, you should give it an informative name.
Don't expect to get it right the first time!
It's highly recommended to regularly spend some time just working on the clarity of your code.
The results might be exactly the same, but it's not wasted effort: when you come back to the code in the future, you'll find it easier to remember what you did and easier to adapt to new demands.
- Strive to limit your code to 80 characters per line.
This fits comfortably on a printed page with a reasonably sized font.
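To tie the naming, spacing, and line-length rules together, here is a minimal base R sketch. All of the object names are hypothetical and chosen only to illustrate the style.

```{r}
# Descriptive snake_case names; spaces around `<-`, `+`, and `==`, none around `^`.
box_width_cm <- 4
box_height_cm <- 3
box_diagonal_cm <- sqrt(box_width_cm^2 + box_height_cm^2)

# A call that would run past 80 characters on one line is split so that each
# named argument sits on its own line.
box_summary <- data.frame(
  width_cm = box_width_cm,
  height_cm = box_height_cm,
  diagonal_cm = box_diagonal_cm,
  is_square = box_width_cm == box_height_cm
)
```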