Full pass through plot communication

2016-08-16 14:21:58 -05:00 · 2016-08-16 14:21:58 -05:00 · e074263ef6
parent 62440d106c
commit e074263ef6
1 changed files with 133 additions and 85 deletions
--- a/communicate-plots.Rmd
+++ b/communicate-plots.Rmd
@ -2,28 +2,22 @@

 ## Introduction

-In [exploratory data analysis], you learned how to use plots as tools for _exploration_. When you made these plots, you knew---even before you looked at them---which variables the plot would display and which datasets the variables would come from. You might have even known what to look for in the completed plots, assuming that you made each plot with a goal in mind. As a result, it was not very important to put a title or a useful set of labels on your plots.
+In [exploratory data analysis], you learned how to use plots as tools for _exploration_. When you make exploratory plots, you know---even before you look at them---which variables the plot will display. You make each plot for a purpose, look at it quickly, and then move on to the next plot. In the course of most analyses, you'll produce tens or hundreds of plots, most of which are immediately thrown away.

-The importance of titles and labels changes once you use your plots for _communication_. Your audience will not share your background knowledge. In fact, they may not know anything about your plots except what the plots themselves display. If you want your plots to communicate your findings effectively, you will need to make them as self-explanatory as possible.
-
-Luckily, `ggplot2` provides some features that can help you.
+Now you need to _communicate_ the result of your analysis to others. Your audience will not share your background knowledge and will not be deeply invested in the data. To help these newcomers quickly build up a good mental model of the data you will need to invest considerable effort to make your plots as self-explanatory as possible. In this chapter, you'll learn some of the tools that ggplot2 provides to do so.

 ### Prerequisites

-In this chapter, we'll focus once again on ggplot2.
+In this chapter, we'll focus once again on ggplot2. We'll also use a little dplyr for data manipulation, and a few ggplot2 extension packages, including __ggrepel__ and __viridis__. Rather than loading those extensions here, we'll refer to their functions explicitly with the `::` notation. That will help make it obvious which functions are built into ggplot2, and which come from other packages.

 ```{r}
 library(ggplot2)
 library(dplyr)
 ```

-We'll use a few ggplot2 extension packages, including __ggrepel__ and __viridis__, but rather than loading then here we'll use the `::` form to emphasise where the functions come from.
+## Label

-## Titles
-
-One of the most helpful things you can do to an exploratory graphic into an expository graphic is to add good titles.
-
-You can add a title to any `ggplot2` plot by adding the command `labs()` to your plot call. Set the `title` argument of `labs()` to the character string that you would like to appear as the title of your plot. `ggplot2` will place the title at the top of your plot.
+The easiest place to start when turning an exploratory graphic into an expository graphic is with good labels. You can start with a plot title using `labs()`:

 ```{r}
 ggplot(mpg, aes(displ, hwy)) +
@ -32,7 +26,12 @@ ggplot(mpg, aes(displ, hwy)) +
  labs(title = "Fuel efficiency decreases with engine size")
 ```

-Generally, titles should be written in sentence case, and should describe the main finding in the plot, not just what the plot displays. In ggplot2 2.2.0, which should be available by the time you're reading this book, you can also set `subtitle` and `caption` to add either a subtitle beneath the main title, or a caption at the bottom right of the plot.
+Generally, titles should describe the main finding in the plot, not just what the plot displays. If you need to add more text, there are two other useful labels that you can use in ggplot2 2.2.0 and above (which should be available by the time you're reading this book):
+
+*   `subtitle` adds additional detail in a smaller font beneath the title.
+
+*   `caption` adds text at the bottom right of the plot, often used to describe 
+    the source of the data.

 ```{r}
 ggplot(mpg, aes(displ, hwy)) +
@ -40,55 +39,52 @@ ggplot(mpg, aes(displ, hwy)) +
  geom_smooth(se = FALSE) + 
  labs(
    title = "Fuel efficiency decreases with engine size",
-    subtitle = "Two seaters don't follow the rule because they are light weight",
+    subtitle = "Two seaters are an exception because of their light weight",
    caption = "Data from fueleconomy.gov"
  )
 ```

-### Axes and legend labels
-
-You can also use `labs()` to replace the axis and legend labels in your plot, which might be a good idea if your data uses ambiguous or abbreviated variable names. To replace either of the axis labels, set the `x` or `y` arguments to a character string. `ggplot2` will replace the associated axis label with your character string. To replace a legend label, set the name of the aesthetic displayed in the legend to the character string that should appear as the title of the legend. For example, the legend in our plot corresponds to the color aesthetic. We can change its title with the command, `labs(color = "New Title")`, or, more usefully:
+You can also use `labs()` to replace the axis and legend titles. It's usually a good idea to replace short variable names with more detailed descriptions, and to include the units. 

 ```{r}
 ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  geom_smooth(se = FALSE) + 
  labs(
-    title = "Fuel efficiency decreases with engine size",
    x = "Displacement (L)", 
-    y = "Highway mpg",
+    y = "Highway fuel economy (mpg)",
    colour = "Car type"
-  ) 
+  )
 ```

-### Legend layout
+It's possible to use mathematical equations instead of text strings. Just switch `""` out for `quote()` and read about the available options in `?plotmath`:

 ```{r}
-ggplot(mpg, aes(displ, hwy)) +
-  geom_point(aes(colour = class)) +
-  geom_smooth(se = FALSE) + 
-  theme(legend.position = "bottom")
-```
-
-For even finer control, use `guides()` and `guide_legend()` (or `guide_colourbar()`). The following example shows two important settings: controlling the number of rows with `nrow`, and override one of the aesthetics to make the points bigger. This is particularly useful if you hae
-
-```{r}
-ggplot(mpg, aes(displ, hwy)) +
-  geom_point(aes(colour = class)) +
-  geom_smooth(se = FALSE) + 
-  theme(legend.position = "bottom") + 
-  guides(colour = guide_legend(nrow = 1, override.aes = list(size = 4)))
+df <- tibble(
+  x = runif(10),
+  y = runif(10)
+)
+ggplot(df, aes(x, y)) +
+  geom_point() +
+  labs(
+    x = quote(sum(x[i] ^ 2, i == 1, n)), 
+    y = quote(alpha + beta + frac(delta, theta))
+  )
 ```

 ### Exercises

-1.  Low alpha - use `override.aes` to make legend more useful.
+1.  Create one plot that combines the `title`, `subtitle`, `caption`, `x`, `y`,
+    and `colour` labels.
+    
+1.  Take an exploratory graphic that you've created in the last month, and add
+    explanatory titles to make it easier for others to understand.

 ## Annotations

-`labs()` help you better label your plot, but often you will want to label components of the data too. The first tool you have at your disposal is `geom_text()`. `geom_text()` is similar to `geom_point()`, but it has an additional aesthetic: `label`. This makes it possible to add textual labels to your plots.
+As well as labelling major components of your plot, it's often useful to label individual observations or groups of observations. The first tool you have at your disposal is `geom_text()`. `geom_text()` is similar to `geom_point()`, but it has an additional aesthetic: `label`. This makes it possible to add textual labels to your plots.

-There are two possible sources of labels. First, you might have a data set that you want to label. The plot below isn't terribly useful, but I first pull out the most efficient car in each class using a little dplyr, and then add it to the plot.
+There are two possible sources of labels. First, you might have a tibble that provides labels. The plot below isn't terribly useful, but it illustrates a useful approach: pull out the most efficient car in each class with dplyr, and then label it on the plot:

 ```{r}
 best_in_class <- mpg %>% 
@ -100,7 +96,7 @@ ggplot(mpg, aes(displ, hwy)) +
  geom_text(aes(label = model), data = best_in_class)
 ```

-This plot illustrates some common problems when labelling text: it's hard to read the labels because they overlap on top of the points. We can make things a little easier by switching to `geom_label()` which draws a rectangle behind the text.  We also use the `nudge_y` parameter to move the labels slightly about the corresponding points:
+This is hard to read because the labels overlap with each other, and with the points. We can make things a little better by switching to `geom_label()` which draws a rectangle behind the text. We also use the `nudge_y` parameter to move the labels slightly above the corresponding points:

 ```{r}
 ggplot(mpg, aes(displ, hwy)) + 
@ -108,9 +104,7 @@ ggplot(mpg, aes(displ, hwy)) +
  geom_label(aes(label = model), data = best_in_class, nudge_y = 2, alpha = 0.5)
 ```

-That helps a bit, but if you look closely in the top-left hand corner, you'll notice that there are two labels practically on top of each other. There's no way that we can fix these by applying the same transformation for every label.
-
-Instead, we can use the __ggrepel__ package by Kamil Slowikowski. This useful package will automatically adjust labels so that they don't overlap:
+That helps a bit, but if you look closely in the top-left hand corner, you'll notice that there are two labels practically on top of each other. There's no way that we can fix these by applying the same transformation for every label. Instead, we can use the __ggrepel__ package by Kamil Slowikowski. This useful package will automatically adjust labels so that they don't overlap:

 ```{r}
 ggplot(mpg, aes(displ, hwy)) + 
@ -118,7 +112,7 @@ ggplot(mpg, aes(displ, hwy)) +
  ggrepel::geom_label_repel(aes(label = model), data = best_in_class)
 ```

-You can sometimes use the same idea to replace the legend with labels directly on the same graph. I'm not sure it's terribly effective here, but it isn't too bad. (We'll turn out `legend.position = "none"` very shortly).
+You can sometimes use the same idea to replace the legend with labels placed directly on the plot. It's not wonderful for this plot, but it isn't too bad. (`theme(legend.position = "none")` turns the legend off --- we'll talk about it more shortly).

 ```{r}
 class_avg <- mpg %>% 
@ -139,7 +133,7 @@ ggplot(mpg, aes(displ, hwy, colour = class)) +
  theme(legend.position = "none")
 ```

-If you want to add a single label, you'll still need to create a data frame. Often you want to the label in the corner of the plot, so it's convenient to create a new data frame using `summarise()`. (If you want to add it at an arbitrary location just use `tibble()` to create the data frame.)
+Alternatively, you might just want to add a single label to the plot, but you'll still need to create a data frame. Often you want the label in the corner of the plot, so it's convenient to create a new data frame using `summarise()`.

 ```{r}
 label <- mpg %>% 
@ -155,7 +149,7 @@ ggplot(mpg, aes(displ, hwy)) +
  geom_text(aes(label = label), data = label, vjust = "top", hjust = "right")
 ```

-If you want to place the text in the absolute top-right corner, you can use infinite positions. In ggplot2, the convention is for these values to be the outside-most positions. Here I use `tibble()`, but if I was going to add multiple labels, I'd use `tribble()` to make the data easier to line up across rows.
+If you want to place the text exactly on the borders of the plot, you can use `+Inf` and `-Inf`. Since I'm no longer computing the positions from `mpg`, I use `tibble()` to create the data frame:

 ```{r}
 label <- tibble(
@ -169,8 +163,7 @@ ggplot(mpg, aes(displ, hwy)) +
  geom_text(aes(label = label), data = label, vjust = "top", hjust = "right")
 ```

-
-Here I manually broke the label up in lines using `"\n"`. Alternatively, you could use `stringr::str_wrap()` to automatically wrap it, given the number of characters you want per line:
+I manually broke the label up into lines using `"\n"`. Another approach is to use `stringr::str_wrap()` to automatically add line breaks, given the number of characters you want per line:

 ```{r}
 "Increasing engine size is related to decreasing fuel economy." %>% 
@ -178,7 +171,7 @@ Here I manually broke the label up in lines using `"\n"`. Alternatively, you cou
  writeLines()
 ```

-Note the use of `hjust` and `vjust` to control the the alignment of the label. \@ref(fig:just) shows all nine possible combinations.
+Also note the use of `hjust` and `vjust` to control the alignment of the label. Figure \@ref(fig:just) shows all nine possible combinations.

 ```{r just, echo = FALSE, fig.cap = "All nine combinations of `hjust` and `vjust`."}
 vjust <- c(bottom = 0, center = 0.5, top = 1)
@ -191,8 +184,9 @@ df <- tidyr::crossing(hj = names(hjust), vj = names(vjust)) %>%
    label = paste0("hjust = '", hj, "'\n", "vjust = '", vj, "'")
  )

-ggplot(df, aes(x, y)) + 
-  geom_point(colour = "grey60", size = 5) + 
+ggplot(df, aes(x, y)) +
+  geom_point(colour = "grey70", size = 5) + 
+  geom_point(size = 0.5, colour = "red") + 
  geom_text(aes(label = label, hjust = hj, vjust = vj), size = 4)
 ```

@ -200,14 +194,14 @@ Remember, as well as `geom_text()` you have all the other geoms in ggplot2 avail

 *   Use `geom_hline()` and `geom_vline()` to add reference lines. I often make
     them thick (`size = 2`) and white (`colour = "white"`) and draw them 
-    underneath the primary data layer. That makes them easy to see without 
-    drawing too much attention.
+    underneath the primary data layer. That makes them easy to see, but they 
+    don't draw attention away from the data.
    
-*   Use `geom_rect()` to draw are rectangle around points of interesent. The
+*   Use `geom_rect()` to draw a rectangle around points of interest. The
    boundaries of the rectangle are defined by aesthetics `xmin`, `xmax`,
    `ymin`, `ymax`.
    
-*   Use `geom_segment()` with optional `arrow` argument to draw attention
+*   Use `geom_segment()` with the `arrow` argument to draw attention
     to a point with an arrow. Use aesthetics `x` and `y` to define the 
    starting location, and `xend` and `yend` to define the end location.
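
The three annotation geoms listed above can be combined in one plot. The following is only a sketch: the reference line at 30 mpg, the rectangle around the large-engine cars, and the arrow coordinates are hypothetical values picked by eye, not taken from the chapter. It keeps the `size = 2` spelling used in the text; recent ggplot2 releases prefer `linewidth` for line thickness and will warn about `size`.

```r
library(ggplot2)

# Hypothetical annotation data: positions chosen by eye for illustration only.
box <- data.frame(xmin = 5.5, xmax = 7.2, ymin = 22, ymax = 27)
pointer <- data.frame(x = 4.5, y = 38, xend = 5.6, yend = 27.5)

p <- ggplot(mpg, aes(displ, hwy)) +
  # Reference line added first so it is drawn underneath the points.
  geom_hline(yintercept = 30, size = 2, colour = "white") +
  geom_point() +
  # Rectangle defined by xmin/xmax/ymin/ymax, taken from its own data frame.
  geom_rect(
    aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
    data = box, inherit.aes = FALSE,
    fill = NA, colour = "red"
  ) +
  # Segment from (x, y) to (xend, yend) with an arrow head at the end.
  geom_segment(
    aes(x = x, y = y, xend = xend, yend = yend),
    data = pointer, inherit.aes = FALSE,
    arrow = arrow(length = unit(0.3, "cm"))
  )
p
```

Putting each annotation in its own one-row data frame with `inherit.aes = FALSE` means it is drawn once, rather than once per row of `mpg`.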

@ -215,23 +209,32 @@ The only limitation is your imagination! (and your patience at position annotati

 ### Exercises

+1.  Use `geom_text()` with infinite positions to place text at each corner
+    of the plot.
+    
 1.  Read the documentation for `annotate()`. How can you use it to add a text
    label to a plot without having to create a tibble?
+
+1.  How do labels with `geom_text()` interact with facetting? How can you
+    add a label to a single facet? How can you put a different label in
+    each facet? (Hint: think about the underlying data.)
    
 1.  What arguments to `geom_label()` control the appearance of the background
    box?

+1.  What are the four arguments to `arrow()`? How do they work? Create a series
+    of plots that demonstrate the most important options.
+
 ## Scales

-The third way you can make your plot better for communication is to adjust the scales. Scales control the mapping from data values to things that you can perceive. 
-Normally, ggplot2 automatically adds scales for you. That means behind the scenes when you type:
+The third way you can make your plot better for communication is to adjust the scales. Scales control the mapping from data values to things that you can perceive. Normally, ggplot2 automatically adds scales for you. When you type:

 ```{r default-scales, fig.show = "hide"}
 ggplot(mpg, aes(displ, hwy)) + 
  geom_point(aes(colour = class))
 ```

-ggplot2 automatically fills in the default scales for you:
+Behind the scenes, ggplot2 automatically adds default scales:

 ```{r, fig.show = "hide"}
 ggplot(mpg, aes(displ, hwy)) + 
@ -241,22 +244,21 @@ ggplot(mpg, aes(displ, hwy)) +
  scale_colour_discrete()
 ```

-You need to know this for two reasons:
+Note the naming scheme for scales: `scale_` followed by the name of the aesthetic, then `_`, then the name of the scale. The default scales are named according to the type of variable they align with: continuous, discrete, datetime, or date. There are lots of non-default scales which you'll learn about below.
+
+The default scales have been carefully chosen to do a good job for a wide range of inputs. But you might want to override the defaults for two reasons:

 *   You might want to tweak some of the parameters of the default scale. 
-    This allows you to do things like change the breaks on the legend.
+    This allows you to do things like change the breaks on the axes, or the 
+    key labels on the legend.
    
-*   You might want to replace the scale altogether. The defaults have been 
-    tuned to be widely useful, but often you can do even better with a little 
-    hand tuning.
-
-Note the naming scheme for scales: `scale_` followed by the name of the aesthetic, then `_`, then the name of the scale. The default scales are named according to the type of variable they with: continuous, discrete, datetime, or date. There are lots of non-default scales which you'll learn about below.
+*   You might want to replace the scale altogether, and use a completely 
+    different algorithm. Often you can beat the default because you know
+    more about the data.

 ### Axis ticks and legend keys

-There are two primary arguments that affect the appearance of the ticks on the axes and the keys on the legend: `breaks` and `labels`. Breaks controls the position of the ticks, or the values associated with the keys. Labels controls the text label associated with each tick/key.
-
-The most common use of `breaks` is to add extra breaks (or remove) if the defaults aren't great.
+There are two primary arguments that affect the appearance of the ticks on the axes and the keys on the legend: `breaks` and `labels`. `breaks` controls the position of the ticks, or the values associated with the keys. `labels` controls the text label associated with each tick or key. The most common use of `breaks` is to override the default choice:

 ```{r}
 ggplot(mpg, aes(displ, hwy)) + 
@ -264,7 +266,7 @@ ggplot(mpg, aes(displ, hwy)) +
  scale_y_continuous(breaks = seq(15, 40, by = 5))
 ```

-`labels` should be a character vector the same length as `breaks`. It can also be `NULL` if you'd like to suppress the numbers altogether. This is useful for maps, or when you want to publish semi-public data with out lables.
+You can use `labels` in the same way (a character vector the same length as `breaks`), but you can also set it to `NULL` to suppress the labels altogether. This is useful for maps, or for publishing plots where you can't share the absolute numbers.

 ```{r}
 ggplot(mpg, aes(displ, hwy)) + 
@ -273,7 +275,9 @@ ggplot(mpg, aes(displ, hwy)) +
  scale_y_continuous(labels = NULL)
 ```

-Another use of `breaks` is when you have relatively few data points and want to highlight exactly where the observations occur. For example, take this plot that shows when each US presidient started and ended their term.
+You can also use `breaks` and `labels` to control the appearance of legends. Collectively, axes and legends are called guides. Axes are used for the x and y aesthetics; legends are used for everything else.
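
As a minimal sketch of the same two arguments applied to a legend rather than an axis, the example below relabels the keys for the `drv` variable in `mpg`. The legend title and the label wording are illustrative choices, not text from the chapter.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = drv)) +
  scale_colour_discrete(
    name = "Drive train",                                  # legend title
    breaks = c("4", "f", "r"),                             # values shown as keys
    labels = c("Four-wheel", "Front-wheel", "Rear-wheel")  # one label per break
  )
p
```

Here `breaks` picks which data values get a key and in what order, and `labels` supplies the text for each key, in the same order as `breaks`.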
+
+Another use of `breaks` is when you have relatively few data points and want to highlight exactly where the observations occur. For example, take this plot that shows when each US president started and ended their term.

 ```{r}
 presidential %>% 
@ -284,16 +288,44 @@ presidential %>%
    scale_x_date(NULL, breaks = presidential$start, date_labels = "'%y")
 ```

-Note that the specification of breaks and labels for date and datetime scales are a little different:
+Note that the specification of breaks and labels for date and datetime scales is a little different:

 * `date_labels` takes a format specification, in the same form as 
  `parse_datetime()`.
  
 * `date_breaks` (not shown here), takes a string like "2 days" or "1 month".
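
Since `date_breaks` is not shown above, here is a short sketch of how it might be used. It assumes the `economics` dataset that ships with ggplot2; the ten-year interval and the `"%Y"` format are arbitrary illustrative choices.

```r
library(ggplot2)

p <- ggplot(economics, aes(date, unemploy)) +
  geom_line() +
  # One tick every ten years, labelled with the four-digit year.
  scale_x_date(date_breaks = "10 years", date_labels = "%Y")
p
```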

+### Legend layout
+
+You will most often use `breaks` and `labels` to tweak the axes. While they both also work for legends, there are a few other techniques you're more likely to use. 
+
+To control the overall position of the legend, you need to use a `theme()` setting. We'll come back to themes at the end of the chapter, but in brief, they control the non-data parts of the plot. The theme setting `legend.position` controls where the legend is drawn:
+
+```{r fig.asp = 1, fig.align = "default", out.width = "50%", fig.width = 3}
+base <- ggplot(mpg, aes(displ, hwy)) +
+  geom_point(aes(colour = class))
+  
+base + theme(legend.position = "left")
+base + theme(legend.position = "top")
+base + theme(legend.position = "bottom")
+base + theme(legend.position = "right") # the default
+```
+
+You can also use `legend.position = "none"` to suppress the display of the legend altogether.
+
+To control the display of individual legends, use `guides()` along with `guide_legend()` or `guide_colourbar()`. The following example shows two important settings: controlling the number of rows with `nrow`, and overriding one of the aesthetics to make the points bigger. This is particularly useful if you have used a low `alpha` to display many points on a plot.
+
+```{r}
+ggplot(mpg, aes(displ, hwy)) +
+  geom_point(aes(colour = class)) +
+  geom_smooth(se = FALSE) + 
+  theme(legend.position = "bottom") + 
+  guides(colour = guide_legend(nrow = 1, override.aes = list(size = 4)))
+```
+
 ### Replacing a scale

-We'll focus on colour scales because those are most likely. All of these scales have two variants `scale_colour_x()` and `scale_fill_x()` for the `colour` and `fill` aesthetics respectically. (And the colour scales are available in both UK and US spellings.)
+Instead of just tweaking the details a little, you can also replace the scale altogether. We'll focus on colour scales because there are many options, and they're the scales you're most likely to want to change. The same principles apply to the other aesthetics. All colour scales have two variants, `scale_colour_x()` and `scale_fill_x()`, for the `colour` and `fill` aesthetics respectively. (And the colour scales are available in both UK and US spellings.)

 The default categorical scale picks colours that are evenly spaced around the colour wheel. Useful alternatives are the ColorBrewer scales, which have been hand tuned to work better for people with common types of colour blindness. The two plots below don't look that different, but there's enough difference in the shades of red and green that they can be distinguished even by people with red-green colour blindness.

@ -345,23 +377,41 @@ ggplot(df, aes(x, y)) +

 ### Exercises

-1.  Example where you set colour scale instead of fill. Why doesn't it work?
+1.  Why doesn't the following code override the default scale?

-1.  What is first argument to every scale? How is it different to `labs()`?
+    ```{r fig.show = "hide"}
+    ggplot(df, aes(x, y)) +
+      geom_hex() + 
+      scale_colour_gradient(low = "white", high = "red") + 
+      coord_fixed() 
+    ```

-1.  Improve the display of the presidential terms by:
+1.  What is the first argument to every scale? How does it compare to `labs()`?

-    1. Enhancing the display of the y axis.
+1.  Change the display of the presidential terms by:
+
+    1. Combining the two variants shown above.
+    1. Improving the display of the y axis.
    1. Labelling each term with the name of the President.
+    1. Adding informative plot labels.
+    1. Placing breaks every 4 years (this is trickier than it seems!).
+
+1.  Use `override.aes` to make the legend on the following plot more useful.
+
+    ```{r, dev = "png"}
+    ggplot(diamonds, aes(carat, price)) + 
+      geom_point(aes(colour = cut), alpha = 1/20) 
+    ```

 ## Zooming

-There are three ways to control the limits of the axes:
+There are three ways to control the plot limits:

-1. By controlling the data
-1. By setting `xlim` and `ylim` in `coord_cartesian()`.
+1. By controlling the data.
+1. Setting the limits in each scale.
+1. Setting `xlim` and `ylim` in `coord_cartesian()`.

-Often, it can be helpful to zoom in on a specific region of your plot. In `ggplot2` you can do this by adding `coord_cartesian()` to your plot and setting it's `xlim` and `ylim` arguments. Pass each argument a vector of two numbers, the minimum value to display on that axis and the maximum value, e.g.
+To zoom in on a region of the plot, it's generally best to use `coord_cartesian()`. Compare the following two plots:

 ```{r out.width = "50%", fig.align = "default"}
 ggplot(mpg, mapping = aes(displ, hwy)) +
@ -377,9 +427,7 @@ mpg %>%
  coord_cartesian(xlim = c(5, 7), ylim = c(10, 30)) 
 ```
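
A third variant sets the limits on the scales themselves instead of on the coordinate system. The sketch below assumes the same ranges as the plots above and only illustrates the difference in behaviour: with scale limits, points outside the range are turned into missing values before `geom_smooth()` is fit (ggplot2 warns about the removed rows), whereas `coord_cartesian()` keeps every point and only changes the visible window.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  geom_smooth()

# Limits on the scales: out-of-range observations are dropped,
# so the smoother only sees the remaining points.
p_scale <- base +
  scale_x_continuous(limits = c(5, 7)) +
  scale_y_continuous(limits = c(10, 30))

# Limits on the coordinate system: the smoother is fit to all the data,
# and the plot is then zoomed to the requested window.
p_coord <- base + coord_cartesian(xlim = c(5, 7), ylim = c(10, 30))

p_scale
p_coord
```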

-`coord_cartesian()` adds a cartesian coordinate system to your plot (which is the default coordinate system). However, the new coordinate system will use the zoomed in limits. 
-
-There is one other way: you can also set the `limits` in the scale. If you are reducing the limits, this is basically equivalent to subsetting the data. It's more useful if you want _expand_ the limits. This is particularly useful if you want to make sure that scales match across multiple plots. Take the following toy example: if we extract out two classes of car and plot them separately, it's hard to compare the plots because all three scales have different ranges.
+You can also set the `limits` on individual scales. If you are reducing the limits, this is basically equivalent to subsetting the data. It's more useful if you want to _expand_ the limits, for example to match scales across different plots. Take the following toy example: if we extract out two classes of car and plot them separately, it's hard to compare the plots because all three scales have different ranges.

 ```{r out.width = "50%", fig.align = "default", fig.width = 4}
 suv <- mpg %>% filter(class == "suv")
@ -416,7 +464,7 @@ In this case you could have used facetting, but this technique is broadly useful

 ## Themes

-Finally, you can also quickly customize the non-data elements of your plot with a theme:
+Finally, you can customize the non-data elements of your plot with a theme:

 ```{r}
 ggplot(mpg, aes(displ, hwy)) +
@ -431,13 +479,13 @@ ggplot2 includes eight themes by default, as shown in Figure \@ref(fig:themes).
 knitr::include_graphics("images/visualization-themes.png")
 ```

-Many people wonder why the default theme as grey background. This was a deliberate choice to put the data forward while supporting comparisons, following the advice of Edward Tufte, Cynthia Brewer, and Dan Carr. We can still see the gridlines, which are important aid to the judgement of position,  but they have little visual impact and we can easily 'tune' them out. The grey background gives the plot a similar typographic colour to the text, ensuring that the graphics fit in with the flow of a  document without jumping out with a bright white background. Finally, the grey background creates a continuous field of colour which ensures that the plot is perceived as a single visual entity.
+Many people wonder why the default theme has a grey background. This was a deliberate choice because it puts the data forward while keeping the grid lines visible. The white grid lines are easy to see (which is important because they significantly aid position judgements), but they have little visual impact and we can easily tune them out. The grey background gives the plot a similar typographic colour to the text, ensuring that the graphics fit in with the flow of a document without jumping out with a bright white background. Finally, the grey background creates a continuous field of colour which ensures that the plot is perceived as a single visual entity.

-It's also possible to control individual components of each theme, like the size and colour of the font used for the y axis. This unfortunately is outside the scope of this book, so you'll need to ggplot2 book for the full details. You can also create your themes if you have a corporate style or you're trying to match a specific journal.
+It's also possible to control individual components of each theme, like the size and colour of the font used for the y axis. This unfortunately is outside the scope of this book, so you'll need to read the ggplot2 book for the full details. You can also create your own themes if you have a corporate style or you're trying to match a journal style.
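
As a small taste of what controlling an individual component looks like, the sketch below changes the y axis text on top of a complete theme. The specific size and colour are arbitrary illustrative values.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  theme_bw() +
  # Override a single theme element: the tick labels on the y axis.
  theme(axis.text.y = element_text(size = 14, colour = "grey30"))
p
```

The `theme()` call comes after `theme_bw()` because a complete theme replaces any element settings added before it.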

 ## Learning more

-The absolute best place to learn more is the ggplot2 book: [_ggplot2: Elegant graphics for data analysis_](https://amzn.com/331924275X). Unfortunately it is not available online for free, but you can find the source code for the book at <https://github.com/hadley/ggplot2-book>.
+The absolute best place to learn more is the ggplot2 book: [_ggplot2: Elegant graphics for data analysis_](https://amzn.com/331924275X). It goes into much more depth about the underlying theory, and has many more examples of how to combine the individual pieces to solve practical problems. Unfortunately, the book is not available online for free, although you can find the source code at <https://github.com/hadley/ggplot2-book>.

 Another great resource is the ggplot2 extensions guide at  <http://www.ggplot2-exts.org/>. This lists many of the packages that extend ggplot2 with new geoms and scales. It's a great place to start if you're trying to do something that seems really hard with ggplot2.