From fc8ea9d06033a301909b9756b7d33619aa024fe7 Mon Sep 17 00:00:00 2001
From: Jonathan Page
Date: Tue, 16 Aug 2016 02:22:41 -1000
Subject: [PATCH] Grammar (#271)

---
 EDA.Rmd | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/EDA.Rmd b/EDA.Rmd
index 67b659c..0961781 100644
--- a/EDA.Rmd
+++ b/EDA.Rmd
@@ -83,7 +83,7 @@ Every variable has its own pattern of variation, which can reveal interesting in
 
 ### Visualising distributions
 
-How you visualise the distribution of a variable will depend on whether the variable is categorical or continuous. A variable is **categorical** if it can only take one of small set of values. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart:
+How you visualise the distribution of a variable will depend on whether the variable is categorical or continuous. A variable is **categorical** if it can only take one of a small set of values. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart:
 
 ```{r}
 ggplot(data = diamonds) +
@@ -130,7 +130,7 @@ ggplot(data = smaller, mapping = aes(x = carat, colour = cut)) +
   geom_freqpoly(binwidth = 0.1)
 ```
 
-There are a few challenges with this type of plot, which we will come back to in [visualisation a categorical and a continuous variable](#cat-cont).
+There are a few challenges with this type of plot, which we will come back to in [visualising a categorical and a continuous variable](#cat-cont).
 
 Now that you can visualise variation, what should you look for in your plots? And what type of follow-up questions should you ask? I've put together a list below of the most useful types of information that you will find in your graphs, along with some follow up questions for each type of information.
 The key to asking good follow up questions will be to rely on your **curiosity** (What do you want to learn more about?) as well as your **skepticism** (How could this be misleading?).
@@ -582,7 +582,7 @@ ggplot(faithful, aes(eruptions)) +
   geom_freqpoly(binwidth = 0.25)
 ```
 
-Sometimes we'll turn the end of pipeline of data transformation into a plot. Watch for the transition from `%>%` to `+`. I wish this transition wasn't necessary but unfortunately ggplot2 was created before the pipe was discovered.
+Sometimes we'll turn the end of a pipeline of data transformation into a plot. Watch for the transition from `%>%` to `+`. I wish this transition wasn't necessary but unfortunately ggplot2 was created before the pipe was discovered.
 
 ```{r, eval = FALSE}
 diamonds %>%
@@ -591,4 +591,4 @@ diamonds %>%
   geom_tile()
 ```
 
-If you want learn more about ggplot2, I'd highly recommend grabbing a copy of the ggplot2 book: . It's been recently updated, so includes dplyr and tidyr code, and has much more space to explore all the facets of visualisation. Unfortunately the book isn't generally available for free, but if you have a connection to a university you can probably get an electronic version for free through SpringerLink.
+If you want learn more about ggplot2, I'd highly recommend grabbing a copy of the ggplot2 book: . It's been recently updated, so it includes dplyr and tidyr code, and has much more space to explore all the facets of visualisation. Unfortunately the book isn't generally available for free, but if you have a connection to a university you can probably get an electronic version for free through SpringerLink.