Fix typos (#989)

Jakob Krigovsky 2022-01-05 03:07:35 +01:00 committed by GitHub
parent 7bc19dc36a
commit 011f8cceee
GPG Key ID: 4AEE18F83AFDEB23
7 changed files with 17 additions and 17 deletions

@@ -569,7 +569,7 @@ ggplot(compact, aes(displ, hwy, colour = drv)) +
col_scale
```
In this particular case, you could have simply used faceting, but this technique is useful more generally, if for instance, you want spread plots over multiple pages of a report.
In this particular case, you could have simply used faceting, but this technique is useful more generally, if for instance, you want to spread plots over multiple pages of a report.
## Themes
@@ -635,7 +635,7 @@ I only ever use three of the five options:
- I control the output size with `out.width` and set it to a percentage of the line width.
I default to `out.width = "70%"` and `fig.align = "center"`.
That give plots room to breathe, without taking up too much space.
That gives plots room to breathe, without taking up too much space.
- To put multiple plots in a single row I set the `out.width` to `50%` for two plots, `33%` for 3 plots, or `25%` to 4 plots, and set `fig.align = "default"`.
Depending on what I'm trying to illustrate (e.g. show data or show plot variations), I'll also tweak `fig.width`, as discussed below.

@@ -6,7 +6,7 @@ The goal of a model is to provide a simple low-dimensional summary of a dataset.
In the context of this book we're going to use models to partition data into patterns and residuals.
Strong patterns will hide subtler trends, so we'll use models to help peel back layers of structure as we explore a dataset.
However, before we can start using models on interesting, real, datasets, you need to understand the basics of how models work.
However, before we can start using models on interesting, real datasets, you need to understand the basics of how models work.
For that reason, this chapter of the book is unique because it uses only simulated datasets.
These datasets are very simple, and not at all interesting, but they will help you understand the essence of modelling before you apply the same techniques to real data in the next chapter.
@@ -116,7 +116,7 @@ model1(c(7, 1.5), sim1)
Next, we need some way to compute an overall distance between the predicted and actual values.
In other words, the plot above shows 30 distances: how do we collapse that into a single number?
One common way to do this in statistics to use the "root-mean-squared deviation".
One common way to do this in statistics is to use the "root-mean-squared deviation".
We compute the difference between actual and predicted, square them, average them, and then take the square root.
This distance has lots of appealing mathematical properties, which we're not going to talk about here.
You'll just have to take my word for it!
@@ -316,7 +316,7 @@ ggplot(sim1, aes(x)) +
### Residuals
The flip-side of predictions are **residuals**.
The predictions tells you the pattern that the model has captured, and the residuals tell you what the model has missed.
The predictions tell you the pattern that the model has captured, and the residuals tell you what the model has missed.
The residuals are just the distances between the observed and predicted values that we computed above.
We add residuals to the data with `add_residuals()`, which works much like `add_predictions()`.

@@ -481,7 +481,7 @@ df <- enframe(x)
df
```
The advantage of this structure is that it generalises in a straightforward way - names are useful if you have character vector of metadata, but don't help if you have other types of data, or multiple vectors.
The advantage of this structure is that it generalises in a straightforward way - names are useful if you have a character vector of metadata but don't help if you have other types of data, or multiple vectors.
Now if you want to iterate over names and values in parallel, you can use `map2()`:
@@ -510,7 +510,7 @@ df %>%
```
4. What does this code do?
Why might might it be useful?
Why might it be useful?
```{r, eval = FALSE}
mtcars %>%

@@ -49,10 +49,10 @@ There is a pair of ideas that you must understand in order to do inference corre
As soon as you use an observation twice, you've switched from confirmation to exploration.
This is necessary because to confirm a hypothesis you must use data independent of the data that you used to generate the hypothesis.
Otherwise you will be over optimistic.
Otherwise you will be over-optimistic.
There is absolutely nothing wrong with exploration, but you should never sell an exploratory analysis as a confirmatory analysis because it is fundamentally misleading.
If you are serious about doing an confirmatory analysis, one approach is to split your data into three pieces before you begin the analysis:
If you are serious about doing a confirmatory analysis, one approach is to split your data into three pieces before you begin the analysis:
1. 60% of your data goes into a **training** (or exploration) set.
You're allowed to do anything you like with this data: visualise it and fit tons of models to it.

@@ -354,7 +354,7 @@ cat(".\n")
An online version of this book is available at <http://r4ds.had.co.nz>.
It will continue to evolve in between reprints of the physical book.
The source of the book is available at <https://github.com/hadley/r4ds>.
The book is powered by <https://bookdown.org> which makes it easy to turn R markdown files into HTML, PDF, and EPUB.
The book is powered by <https://bookdown.org> which makes it easy to turn R Markdown files into HTML, PDF, and EPUB.
This book was built with:

@@ -482,7 +482,7 @@ It also makes it easier to understand your solutions to old problems when you re
1. Read the documentation for `apply()`.
In the 2d case, what two for loops does it generalise?
2. Adapt `col_summary()` so that it only applies to numeric columns You might want to start with an `is_numeric()` function that returns a logical vector that has a TRUE corresponding to each numeric column.
2. Adapt `col_summary()` so that it only applies to numeric columns You might want to start with an `is_numeric()` function that returns a logical vector that has a `TRUE` corresponding to each numeric column.
## The map functions

@@ -2,7 +2,7 @@
## Introduction
R Markdown provides an unified authoring framework for data science, combining your code, its results, and your prose commentary.
R Markdown provides a unified authoring framework for data science, combining your code, its results, and your prose commentary.
R Markdown documents are fully reproducible and support dozens of output formats, like PDFs, Word files, slideshows, and more.
R Markdown files are designed to be used in three ways:
@@ -65,7 +65,7 @@ knitr::include_graphics("rmarkdown/diamond-sizes-report.png")
When you **knit** the document, R Markdown sends the .Rmd file to **knitr**, <http://yihui.name/knitr/>, which executes all of the code chunks and creates a new markdown (.md) document which includes the code and its output.
The markdown file generated by knitr is then processed by **pandoc**, <http://pandoc.org/>, which is responsible for creating the finished file.
The advantage of this two step workflow is that you can create a very wide range of output formats, as you'll learn about in [R markdown formats].
The advantage of this two step workflow is that you can create a very wide range of output formats, as you'll learn about in [R Markdown formats].
```{r, echo = FALSE, out.width = "75%"}
knitr::include_graphics("images/RMarkdownFlow.png")
@@ -87,7 +87,7 @@ The following sections dive into the three components of an R Markdown document
Knit it by using the appropriate keyboard short cut.
Verify that you can modify the input and see the output update.
3. Compare and contrast the R notebook and R markdown files you created above.
3. Compare and contrast the R notebook and R Markdown files you created above.
How are the outputs similar?
How are they different?
How are the inputs similar?
@@ -128,7 +128,7 @@ If you forget, you can get to a handy reference sheet with *Help \> Markdown Qui
b. Add a horizontal rule.
c. Add a block quote.
3. Copy and paste the contents of `diamond-sizes.Rmd` from <https://github.com/hadley/r4ds/tree/master/rmarkdown> in to a local R markdown document.
3. Copy and paste the contents of `diamond-sizes.Rmd` from <https://github.com/hadley/r4ds/tree/master/rmarkdown> in to a local R Markdown document.
Check that you can run it, then add text after the frequency polygon that describes its most striking features.
## Code chunks
@@ -359,13 +359,13 @@ The first thing you should always try is to recreate the problem in an interacti
Restart R, then "Run all chunks" (either from Code menu, under Run region), or with the keyboard shortcut Ctrl + Alt + R.
If you're lucky, that will recreate the problem, and you can figure out what's going on interactively.
If that doesn't help, there must be something different between your interactive environment and the R markdown environment.
If that doesn't help, there must be something different between your interactive environment and the R Markdown environment.
You're going to need to systematically explore the options.
The most common difference is the working directory: the working directory of an R Markdown is the directory in which it lives.
Check the working directory is what you expect by including `getwd()` in a chunk.
Next, brainstorm all the things that might cause the bug.
You'll need to systematically check that they're the same in your R session and your R markdown session.
You'll need to systematically check that they're the same in your R session and your R Markdown session.
The easiest way to do that is to set `error = TRUE` on the chunk causing the problem, then use `print()` and `str()` to check that settings are as you expect.
## YAML header