Merge branch 'master' of github.com:hadley/r4ds

# Conflicts:
#	EDA.Rmd
hadley 2016-08-16 08:44:24 -05:00
commit ddfe63964d
2 changed files with 6 additions and 6 deletions


@@ -83,7 +83,7 @@ Every variable has its own pattern of variation, which can reveal interesting in
### Visualising distributions
How you visualise the distribution of a variable will depend on whether the variable is categorical or continuous. A variable is **categorical** if it can only take one of small set of values. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart:
How you visualise the distribution of a variable will depend on whether the variable is categorical or continuous. A variable is **categorical** if it can only take one of a small set of values. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart:
```{r}
ggplot(data = diamonds) +
@@ -130,7 +130,7 @@ ggplot(data = smaller, mapping = aes(x = carat, colour = cut)) +
geom_freqpoly(binwidth = 0.1)
```
There are a few challenges with this type of plot, which we will come back to in [visualisation a categorical and a continuous variable](#cat-cont).
There are a few challenges with this type of plot, which we will come back to in [visualising a categorical and a continuous variable](#cat-cont).
Now that you can visualise variation, what should you look for in your plots? And what type of follow-up questions should you ask? I've put together a list below of the most useful types of information that you will find in your graphs, along with some follow up questions for each type of information. The key to asking good follow up questions will be to rely on your **curiosity** (What do you want to learn more about?) as well as your **skepticism** (How could this be misleading?).
@@ -582,7 +582,7 @@ ggplot(faithful, aes(eruptions)) +
geom_freqpoly(binwidth = 0.25)
```
Sometimes we'll turn the end of pipeline of data transformation into a plot. Watch for the transition from `%>%` to `+`. I wish this transition wasn't necessary but unfortunately ggplot2 was created before the pipe was discovered.
Sometimes we'll turn the end of a pipeline of data transformation into a plot. Watch for the transition from `%>%` to `+`. I wish this transition wasn't necessary but unfortunately ggplot2 was created before the pipe was discovered.
```{r, eval = FALSE}
diamonds %>%
@@ -591,6 +591,6 @@ diamonds %>%
geom_tile()
```
If you want learn more about ggplot2, I'd highly recommend grabbing a copy of the ggplot2 book: <https://amzn.com/331924275X>. It's been recently updated, so includes dplyr and tidyr code, and has much more space to explore all the facets of visualisation. Unfortunately the book isn't generally available for free, but if you have a connection to a university you can probably get an electronic version for free through SpringerLink.
If you want learn more about ggplot2, I'd highly recommend grabbing a copy of the ggplot2 book: <https://amzn.com/331924275X>. It's been recently updated, so it includes dplyr and tidyr code, and has much more space to explore all the facets of visualisation. Unfortunately the book isn't generally available for free, but if you have a connection to a university you can probably get an electronic version for free through SpringerLink.
Another useful resource is the [_R Graphics Cookbook_](https://amzn.com/1449316956) by Winston Chang. Much of the contents are available online at <http://www.cookbook-r.com/Graphs/>.


@@ -2,7 +2,7 @@
## Introduction
Visualisation is an important tool for insight generation, but it is rare that you get the data in exactly the right form you need. Often you'll need to create some new variables or summaries, or maybe you just want to rename the variables or reorder the observations in order to make the data a little easier to work with. You'll learn how to do all that (and more!) in this chapter which will teach you how to transform your data using the dplyr package and a new dataset on flights departing New York City in 2013.
Visualisation is an important tool for insight generation, but it is rare that you get the data in exactly the right form you need. Often you'll need to create some new variables or summaries, or maybe you just want to rename the variables or reorder the observations in order to make the data a little easier to work with. You'll learn how to do all that (and more!) in this chapter, which will teach you how to transform your data using the dplyr package and a new dataset on flights departing New York City in 2013.
### Prerequisites
@@ -35,7 +35,7 @@ You might also have noticed the row of three letter abbreviations under the colu
### Dplyr basics
In this chapter you are going to learn the five key dplyr functions that allow you to solve vast majority of your data manipulation challenges:
In this chapter you are going to learn the five key dplyr functions that allow you to solve the vast majority of your data manipulation challenges:
* Pick observations by their values (`filter()`).
* Reorder the rows (`arrange()`).