133 lines
6.7 KiB
Plaintext
133 lines
6.7 KiB
Plaintext
# Workflow: Getting help {#sec-workflow-getting-help}
|
|
|
|
```{r}
|
|
#| results: "asis"
|
|
#| echo: false
|
|
source("_common.R")
|
|
status("polishing")
|
|
```
|
|
|
|
This book is not an island; there is no single resource that will allow you to master R.
|
|
As you begin to apply the techniques described in this book to your own data, you will soon find questions that we do not answer.
|
|
This section describes a few tips on how to get help, and to help you keep learning.
|
|
|
|
## Google is your friend
|
|
|
|
If you get stuck, start with Google.
|
|
Typically adding "R" to a query is enough to restrict it to relevant results: if the search isn't useful, it often means that there aren't any R-specific results available.
|
|
Google is particularly useful for error messages.
|
|
If you get an error message and you have no idea what it means, try googling it!
|
|
Chances are that someone else has been confused by it in the past, and there will be help somewhere on the web.
|
|
(If the error message isn't in English, run `Sys.setenv(LANGUAGE = "en")` and re-run the code; you're more likely to find help for English error messages.)
|
|
|
|
If Google doesn't help, try [Stack Overflow](https://stackoverflow.com).
|
|
Start by spending a little time searching for an existing answer, including `[R]` to restrict your search to questions and answers that use R.
|
|
|
|
## Making a reprex
|
|
|
|
If your googling doesn't find anything useful, it's a really good idea prepare a **reprex,** short for minimal **repr**oducible **ex**ample.
|
|
A good reprex makes it easier for other people to help you, and often you'll figure out the problem yourself in the course of making it.
|
|
There are two parts to creating a reprex:
|
|
|
|
- First, you need to make your code reproducible.
|
|
This means that you need to capture everything, i.e., include any `library()` calls and create all necessary objects.
|
|
The easiest way to make sure you've done this is to use the reprex package.
|
|
|
|
- Second, you need to make it minimal.
|
|
Strip away everything that is not directly related to your problem.
|
|
This usually involves creating a much smaller and simpler R object than the one you're facing in real life or even using built-in data.
|
|
|
|
That sounds like a lot of work!
|
|
And it can be, but it has a great payoff:
|
|
|
|
- 80% of the time creating an excellent reprex reveals the source of your problem.
|
|
It's amazing how often the process of writing up a self-contained and minimal example allows you to answer your own question.
|
|
|
|
- The other 20% of time you will have captured the essence of your problem in a way that is easy for others to play with.
|
|
This substantially improves your chances of getting help!
|
|
|
|
When creating a reprex by hand, it's easy to accidentally miss something that means your code can't be run on someone else's computer.
|
|
Avoid this problem by using the reprex package which is installed as part of the tidyverse.
|
|
Let's say you copy this code onto your clipboard (or, on RStudio Server or Cloud, select it):
|
|
|
|
```{r}
|
|
#| eval: false
|
|
|
|
y <- 1:4
|
|
mean(y)
|
|
```
|
|
|
|
Then call `reprex()`, where the default target venue is GitHub:
|
|
|
|
``` r
|
|
reprex::reprex()
|
|
```
|
|
|
|
A nicely rendered HTML preview will display in RStudio's Viewer (if you're in RStudio) or your default browser otherwise.
|
|
The relevant bit of GitHub-flavored Markdown is ready to be pasted from your clipboard (on RStudio Server or Cloud, you will need to copy this yourself):
|
|
|
|
``` r
|
|
y <- 1:4
|
|
mean(y)
|
|
#> [1] 2.5
|
|
```
|
|
|
|
Here's what that Markdown would look like rendered in a GitHub issue:
|
|
|
|
```{r}
|
|
#| eval: false
|
|
|
|
y <- 1:4
|
|
mean(y)
|
|
#> [1] 2.5
|
|
```
|
|
|
|
Anyone else can copy, paste, and run this immediately.
|
|
|
|
There are three things you need to include to make your example reproducible: required packages, data, and code.
|
|
|
|
1. **Packages** should be loaded at the top of the script, so it's easy to see which ones the example needs.
|
|
This is a good time to check that you're using the latest version of each package; it's possible you've discovered a bug that's been fixed since you installed or last updated the package.
|
|
For packages in the tidyverse, the easiest way to check is to run `tidyverse_update()`.
|
|
|
|
2. The easiest way to include **data** is to use `dput()` to generate the R code needed to recreate it.
|
|
For example, to recreate the `mtcars` dataset in R, perform the following steps:
|
|
|
|
1. Run `dput(mtcars)` in R
|
|
2. Copy the output
|
|
3. In reprex, type `mtcars <-` then paste.
|
|
|
|
Try and find the smallest subset of your data that still reveals the problem.
|
|
|
|
3. Spend a little bit of time ensuring that your **code** is easy for others to read:
|
|
|
|
- Make sure you've used spaces and your variable names are concise, yet informative.
|
|
|
|
- Use comments to indicate where your problem lies.
|
|
|
|
- Do your best to remove everything that is not related to the problem.
|
|
|
|
The shorter your code is, the easier it is to understand, and the easier it is to fix.
|
|
|
|
Finish by checking that you have actually made a reproducible example by starting a fresh R session and copying and pasting your script in.
|
|
|
|
## Investing in yourself
|
|
|
|
You should also spend some time preparing yourself to solve problems before they occur.
|
|
Investing a little time in learning R each day will pay off handsomely in the long run.
|
|
One way is to follow what the tidyverse team is doing on the [tidyverse blog](https://www.tidyverse.org/blog/).
|
|
To keep up with the R community more broadly, we recommend reading [R Weekly](https://rweekly.org): it's a community effort to aggregate the most interesting news in the R community each week.
|
|
|
|
If you're an active Twitter user, you might also want to follow Hadley ([\@hadleywickham](https://twitter.com/hadleywickham)), Mine ([\@minebocek](https://twitter.com/minebocek)), Garrett ([\@statgarrett](https://twitter.com/statgarrett)), or follow [\@rstudiotips](https://twitter.com/rstudiotips) to keep up with new features in the IDE.
|
|
If you want the full fire hose of new developments, you can also read the ([`#rstats`](https://twitter.com/search?q=%23rstats)) hashtag.
|
|
This is one the key tools that Hadley and Mine use to keep up with new developments in the community.
|
|
|
|
## Summary
|
|
|
|
This chapter concludes the Whole Game part of the book.
|
|
You've now seen the most important parts of the data science process: visualization, transformation, tidying and importing.
|
|
Now you've got a holistic view of whole process and we start to get into the the details of small pieces.
|
|
|
|
The next part of the book, Transform, goes into depth into the different types of variables that you might encounter: logical vectors, numbers, strings, factors, and date-times, and covers important related topics like tibbles, regular expression, missing values, and joins.
|
|
There's no need to read these chapters in order; dip in and out as needed for the specific data that you're working with.
|