More work on visualization chapter.

Garrett 2015-11-12 17:44:03 -05:00
parent ce92742614
commit 58773275f9
1 changed files with 159 additions and 153 deletions

@ -12,10 +12,11 @@ knitr::opts_chunk$set(cache = TRUE)
> "The simple graph has brought more information to the data analyst's mind than any other device."---John Tukey
Visualization makes data decipherable. Have you ever tried to study a table of raw data? Raw data is difficult to comprehend. You can examine values one at a time, but you cannot attend to many values at once. The data overloads your attention span, which makes it hard to spot patterns in the data. See this for yourself; can you spot the striking relationship between $X$ and $Y$ in the table below?
Visualization makes data decipherable. Have you ever tried to study a table of raw data? You can examine values one at a time, but you cannot attend to many values at once. The data overloads your attention span, which makes it hard to spot patterns in the data. See this for yourself; can you spot the striking relationship between $X$ and $Y$ in the table below?
```{r echo=FALSE}
X <- rep(seq(0.1, 1.9, length = 6), 2) + runif(12, -0.1, 0.1)
```{r data, echo=FALSE}
x <- rep(seq(0.2, 1.8, length = 5), 2) + runif(10, -0.15, 0.15)
X <- c(0.02, x, 1.94)
Y <- sqrt(1 - (X - 1)^2)
Y[1:6] <- -1 * Y[1:6]
Y <- Y - 1
@ -23,9 +24,9 @@ order <- sample(1:10)
knitr::kable(round(data.frame(X = X[order], Y = Y[order]), 2))
```
In contrast, visualized data is easy to understand. Once you plot your data, you can see the relationships between data points---instantly. For example, the graph below shows the same data as above. Here, the relationship between the points is obvious.
Raw data is difficult to comprehend, but visualized data is easy to understand. Once you plot your data, you can see the relationships between data points---instantly. For example, the graph below shows the same data as above. Here, the relationship between the points is obvious.
```{r echo=FALSE}
```{r echo=FALSE, dependson=data}
ggplot2::qplot(X, Y) + ggplot2::coord_fixed(ylim = c(-2.5, 2.5), xlim = c(-2.5, 2.5))
```
@ -93,20 +94,20 @@ ggplot(data = mpg) +
You can immediately see that there is a negative relationship between engine size (`displ`) and fuel efficiency (`hwy`). In other words, cars with big engines have worse fuel efficiency. But the graph shows us something else as well.
One group of points seems to fall outside the linear trend. These cars have a higher mileage than you might expect. Can you tell why? Before we examine this phenomenon, let's review the code that made our graph.
One group of points seems to fall outside the linear trend. These cars have a higher mileage than you might expect. Can you tell why? Before we examine these cars, let's review the code that made our graph.
`r bookdown::embed_png("images/visualization-1.png", dpi = 150)`
### Template
Our code is almost a template for making plots with `ggplot2`.
Our code is almost a template for making plots with `ggplot2`.
```{r eval=FALSE}
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
```
With `ggplot2`, you begin every plot with the function `ggplot()`. `ggplot()` doesn't create a plot by itself; instead it initializes a new plot that you can add layers to.
With `ggplot2`, you begin a plot with the function `ggplot()`. `ggplot()` doesn't create a plot by itself; instead it initializes a new plot that you can add layers to.
The first argument of `ggplot()` is the data set to use in the graph. So `ggplot(data = mpg)` initializes a graph that will use the `mpg` data set.
@ -146,7 +147,7 @@ ggplot(data = mpg) +
`ggplot2` will automatically assign a unique level of the aesthetic (here a unique color) to each unique value of the variable. `ggplot2` will also add a legend that explains which levels correspond to which values.
The colors reveal that many of the unusual points are two-seater cars. These don't sound like hybrids. In fact, they sound like sports cars---and that's what the points are. Sports cars have large engines like SUVs and pickup trucks, but small bodies like midsize and compact cars, which improves their gas mileage. In hindsight, these cars were unlikely to be hybrids since they have such large engines.
The colors reveal that many of the unusual points are two-seater cars. These cars don't seem like hybrids. In fact, they seem like sports cars---and that's what they are. Sports cars have large engines like SUVs and pickup trucks, but small bodies like midsize and compact cars, which improves their gas mileage. In hindsight, these cars were unlikely to be hybrids since they have such large engines.
Color is one of the most popular aesthetics to use in a scatterplot, but we could have mapped `class` to the size aesthetic in the same way. In this case, the exact size of each point reveals its class affiliation.
@ -155,7 +156,7 @@ ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, size = class))
```
Or we could have mapped `class` to the _alpha_, or transparency, of the points. Now the transparency of each point corresponds with its class affiliation.
Or we could have mapped `class` to the _alpha_ (i.e., transparency) of the points. Now the transparency of each point corresponds with its class affiliation.
```{r}
ggplot(data = mpg) +
@ -169,7 +170,7 @@ ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, shape = class))
```
In each case, we set the name of the aesthetic to the variable to display and we do this within the `aes()` function. The syntax highlights a useful insight because we also set `x` and `y` to variables within `aes()`: the x location and the y location of a point are also aesthetics, visual properties that you can map to variables to display information about the data.
In each case, we set the name of the aesthetic to the variable to display and we do this within the `aes()` function. The syntax highlights a useful insight because we also set `x` and `y` to variables within `aes()`: the x and y locations of a point are also aesthetics, visual properties that you can map to variables to display information about the data.
Once you set an aesthetic, `ggplot2` takes care of the rest. It selects a pleasing set of values to use for the aesthetic and it constructs a legend that explains the mapping. For x and y aesthetics, `ggplot2` does not create a legend, but it creates an axis line with tick marks and a label. The axis line acts in the same way as a legend; it explains the mapping between locations and values.
@ -179,7 +180,7 @@ Now that you know how to use aesthetics, take a moment to experiment with the `m
+ Continuous variables in `mpg`: `displ`, `year`, `cyl`, `cty`, `hwy`
+ Discrete variables in `mpg`: `manufacturer`, `model`, `trans`, `drv`, `fl`, `class`
* Attempt to use more than one aesthetic at a time.
* Attempt to set an aesthetic to something other than a variable name, like `hwy / 2`.
* Attempt to set an aesthetic to something other than a variable name, like `displ < 5`.
See the help page for `geom_point()` (`?geom_point`) to learn which aesthetics are available to use in a scatterplot. See the help page for the `mpg` data set (`?mpg`) to learn which variables are in the data set.
@ -189,11 +190,11 @@ Have you experimented with aesthetics? Great! Here are some things that you may
A continuous variable can contain an infinite number of values that can be put in order, like numbers or date-times. If your variable is continuous, `ggplot2` will treat it in a special way. `ggplot2` will
* use a gradient of colors from blue to black for the color aesthetic.
* display a colorbar in the legend for the color aesthetic.
* not use the shape aesthetic.
* use a gradient of colors from blue to black for the color aesthetic
* display a colorbar in the legend for the color aesthetic
* not use the shape aesthetic
`ggplot2` will not use the shape aesthetic to display continuous information. Why? Because the human eye cannot easily interpolate between shapes. Can you tell whether a shape is three-quarters of the way between a triangle and a circle? How about five-eighths of the way?
`ggplot2` will not use the shape aesthetic to display continuous information because the human eye cannot easily interpolate between shapes. Can you tell whether a shape is three-quarters of the way between a triangle and a circle? How about five-eighths of the way?
`ggplot2` will treat your variable as continuous if it is a numeric, integer, or a recognizable date-time structure (but not a factor, see `?factor`).
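The rule above can be seen with a variable such as `cyl`, which is stored as an integer in `mpg`. The sketch below is only an illustration, not part of the chapter's code: it maps `cyl` to color twice, once as stored and once wrapped in `factor()`. The object names `p_continuous` and `p_discrete` are made up for this example.

```{r}
library(ggplot2)

# `cyl` is an integer column, so ggplot2 treats it as continuous
# and draws a color gradient with a colorbar legend.
p_continuous <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = cyl))

# Wrapping the same column in factor() makes it discrete,
# so each distinct value gets its own color and legend key.
p_discrete <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = factor(cyl)))

p_continuous
p_discrete
```

Comparing the two legends is a quick way to check how `ggplot2` has classified a variable.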
@ -239,6 +240,66 @@ ggplot(data = mpg) +
color = displ < 5))
```
#### Setting vs. Mapping
You can also manually set aesthetics to specific levels. For example, you can make all of the points in your plot blue.
```{r echo = FALSE}
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
```
To set an aesthetic manually, call the aesthetic name as an argument of your geom function. Then pass the aesthetic a value that R will recognize, such as
* the name of a color as a character string
* the size of a point as a cex expansion factor (see `?par`)
* the shape of a point as a number code
R uses the following numeric codes to refer to the following shapes.
```{r echo=FALSE}
pchShow <-
function(extras = c("*",".", "o","O","0","+","-","|","%","#"),
cex = 2,
col = "red3", bg = "gold", coltext = "brown", cextext = 1.1,
main = "")
{
nex <- length(extras)
np <- 26 + nex
ipch <- 0:(np-1)
k <- floor(sqrt(np))
dd <- c(-1,1)/2
rx <- dd + range(ix <- ipch %/% k)
ry <- dd + range(iy <- 3 + (k-1)- ipch %% k)
pch <- as.list(ipch) # list with integers & strings
if(nex > 0) pch[26+ 1:nex] <- as.list(extras)
plot(rx, ry, type = "n", axes = FALSE, xlab = "", ylab = "", main = main)
abline(v = ix, h = iy, col = "lightgray", lty = "dotted")
for(i in 1:np) {
pc <- pch[[i]]
points(ix[i], iy[i], pch = pc, col = col, bg = bg, cex = cex)
if(cextext > 0)
text(ix[i] - 0.4, iy[i], pc, col = coltext, cex = cextext)
}
}
pchShow()
```
If you try to set an aesthetic from within the mappings argument (i.e. the `aes()` call), you will get an unexpected result, as below.
```{r}
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
```
Here, `ggplot2` treats `color = "blue"` as a mapping because it appears in the mapping argument. `ggplot2` assumes that "blue" is a value in the data space. It uses R's recycling rules to pair the single value "blue" with each row of data in `mpg`. Then `ggplot2` creates a mapping from the value "blue" in the data space to the pinkish color that we see in the visual space. `ggplot2` even creates a legend to let you know that the color pink represents the value "blue." The choice of pink is a coincidence; `ggplot2` defaults to pink whenever a single discrete value is mapped to the color aesthetic.
If you experience this type of behavior, remember:
* define an aesthetic _within_ the `aes()` function to map levels of the aesthetic to values of data. You would expect a legend after this operation.
* define an aesthetic _outside of_ the `aes()` function to manually set the aesthetic to a specific level. You would not expect a legend after this operation.
### Facets
Facets provide a second way to add a variable to a two dimensional graph. When you facet a graph, you divide your data into subgroups and then plot a separate graph, or _facet_, for each subgroup.
@ -525,59 +586,92 @@ ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut))
```
On the x axis it displays `cut`, a variable in the `diamonds` data set. On the y axis, it displays count. But where does count come from? Count is not a variable in the diamonds data set:
On the x axis it displays `cut`, a variable in the `diamonds` data set. On the y axis, it displays count. But count is not a variable in the diamonds data set:
```{r}
head(diamonds)
```
And we didn't tell `ggplot2` in our plot call where to find count values.
Nor did we tell `ggplot2` in our code where to find count values.
```{r eval = FALSE}
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut))
```
What is going on here?
Where does count come from?
Some plots, like scatterplots, plot the raw values in your data set. Other types of graphs, like bar charts and smooth lines, do not plot raw values at all. These graphs apply an algorithm to the data and then plot the results of the algorithm. Consider how many graphs do this.
Some graphs, like scatterplots, plot the raw values of your data set. Other graphs, like bar charts and smooth lines, do not plot raw values at all. These graphs apply an algorithm to your data and then plot the results of the algorithm. Consider how many graphs do this.
* **bar charts** and **histograms** bin the data and then plot bin counts
* **smooth lines** apply a model to the data and then plot the model line
* **boxplots** calculate the first, second, and third quartiles of a data set and then plot those summary statistics (among others)
* **bar charts** and **histograms** bin the raw data and then plot bin counts
* **smooth lines** apply a model to the raw data and then plot the model line
* **boxplots** calculate the quartiles of the raw data and then plot the quartiles as a box.
* and so on.
`ggplot2` calls these algorithms _stats_, which is short for statistical transformation. Each geom in `ggplot2` is paired with a stat; although for some geoms, like points, the stat is the identity transformation, i.e. no transformation.
`ggplot2` calls the algorithm that a graph uses to transform raw data a _stat_, which is short for statistical transformation.
Each geom in `ggplot2` is designed to use a default stat when it creates a graph (if a geom plots the raw data it uses the "identity" stat, i.e. an identity transformation). In many cases, it does not make sense to change a geom's default stat. In other cases, you can change or fine tune the stat to make new graphs.
When you use a geom, `ggplot2` automatically applies the geom's stat algorithm in the background. You don't need to worry about the details or even think much about stats.
***
However, you can change or fine tune a geom's default stat to create interesting and useful plots.
*Tip*: To learn which stat a geom uses, visit the geom's help page, e.g. `?geom_bar`. To learn more about a stat, visit the stat's help page, e.g. `?stat_bin`.
To learn which stat a geom uses, visit the geom's help page. For example, the `?geom_bar` help page shows that `geom_bar()` uses the `stat_bin()` stat by default. To learn about the stat, visit the stat's help page.
***
To change a geom's stat, set the `stat` argument to the name of a stat in `ggplot2`. You can find a list of these stats by running `help(package = "ggplot2")`. Each stat is represented by a function that begins with `stat_`. The name of the stat is the portion of the function name that appears after `stat_`.
#### Change a stat
Many combinations of geoms and stats will create incompatible results. However, one useful non-default combination is to pair `geom_bar()` with `stat_identity()`. This combination lets you map the height of each bar to the value of a variable.
You can map the heights of bars in a bar chart to data values---not counts---by changing the stat of the bar chart. This works best if your data set contains one observation per bar, e.g.
```{r}
demo <- data.frame(
a = c(1,2,3),
a = c("bar_1","bar_2","bar_3"),
b = c(20, 30, 40)
)
ggplot(data = demo) +
geom_bar(aes(x = a, y = b), stat = "identity")
```
#### ..variables..
Many stats in `ggplot2` create more data than they display. For example, the `?stat_bin` help page explains that `stat_bin()` uses your raw data to create a new data frame with four columns: `count`, `density`, `ncount`, `ndensity`.
`geom_bar()` maps one of these columns, the `count` column to the y axis of your plot. You can map any of the three remaining columns to your y axis as well. To do this, map the y aesthetic to the stat column name, and surround the column name with a pair of dots, `..`.
By default, `geom_bar()` uses the bin stat, which creates a count for each bar.
```{r}
ggplot(data = demo) +
geom_bar(mapping = aes(x = a))
```
To change the stat of a geom, set its `stat` argument to the name of a stat. You may need to supply or remove mappings to accommodate the new stat.
```{r}
ggplot(data = demo) +
geom_bar(mapping = aes(x = a, y = b), stat = "identity")
```
To find a list of available stats, run `help(package = "ggplot2")`. Each stat is listed as a function that begins with `stat_`. Set a geom's stat argument to the part of the function name that follows the underscore, surrounded in quotes, as above.
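As a rough alternative to browsing the package index, you can list the `stat_` functions from the console. This is a minimal sketch that assumes `ggplot2` is attached; `stat_functions` and `stat_names` are names invented for the example, and not every listed stat will pair sensibly with every geom.

```{r}
library(ggplot2)

# List every exported ggplot2 object whose name begins with "stat_"
stat_functions <- ls("package:ggplot2", pattern = "^stat_")

# Drop the "stat_" prefix to get the strings you would pass to `stat =`
stat_names <- sub("^stat_", "", stat_functions)
stat_names
```

The exact list depends on the version of `ggplot2` that you have installed.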
Take care when you change a stat. Many combinations of geoms and stats create incompatible results.
#### Set parameters
Many stats use _parameters_, arguments that fine tune the statistical transformation. For example, the bin stat takes the parameter `width`, which controls the width of the bars in a bar chart.
To set a parameter of a stat, pass the parameter as an argument to the geom function.
```{r}
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut), width = 1)
```
To learn which parameters are used by a stat, visit the stat's help page, e.g. `?stat_bin`.
#### Use data from a stat
Many stats in `ggplot2` create more data than they display. For example, the `?stat_bin` help page explains that the `stat_bin()` transformation creates four new variables: `count`, `density`, `ncount`, and `ndensity`. `geom_bar()` uses only one of these variables. It maps the `count` variable to the y axis of your plot.
You can use any of the variables created by a stat in an aesthetic mapping. To use a variable created by a stat, surround its name with a pair of dots, `..`.
```{r message = FALSE, fig.show='hold', fig.width=4, fig.height=4}
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..density..))
geom_bar(mapping = aes(x = carat))
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = carat, y = ..density..))
```
Note that to do this, you will need to
@ -586,117 +680,16 @@ Note that to do this, you will need to
2. Determine which variables the stat creates from its help page
3. Surround the variable name with `..`
Also note that this procedure makes sense in fewer cases than you might suppose. Usually stat columns exist for a very narrow purpose. For example, it does not make sense to use `..density..` in a bar chart of discrete values, but `..density..` is a useful alternative to `..count..` when you use a histogram geom (histograms rely on the same stat as bar charts).
```{r message = FALSE, fig.show='hold', fig.width=4, fig.height=4}
ggplot(data = diamonds) +
geom_histogram(mapping = aes(x = carat))
ggplot(data = diamonds) +
geom_histogram(mapping = aes(x = carat, y = ..density..))
```
### Parameters
Two of the graphs in the last section used the `width` argument. `width` is a _parameter_ of the `geom_bar()` function, a piece of information that `ggplot2` uses to build the geom.
How do these two plots differ?
```{r echo = FALSE, message = FALSE, fig.show='hold', fig.width=4, fig.height=4}
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth()
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth(method = lm)
```
Each overlays a smooth geom on a points geom, but each displays a different "type" of smooth line. In the first graph, `ggplot2` draws the result of a loess algorithm. In the second plot, `ggplot2` draws the result of a linear regression.
You can customize the output of `geom_smooth()` with its `method` argument. Set `method` to the name of a model function in R. `geom_smooth()` will display the result of modelling y on x with the function. In the graph above, we set `method = lm` to create the regression line. `lm()` is the R function that builds linear models.
```{r eval=FALSE}
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth(method = lm)
```
`method` is a _parameter_ of the `geom_smooth()` function, a piece of information that `ggplot2` uses to build the geom. If you do not set the `method` parameter, `ggplot2` defaults to a loess model or a generalized additive model depending on how many points appear in the graph.
`se` is another parameter of `geom_smooth()`. You can set the `se` parameter of `geom_smooth()` to `FALSE` to prevent `ggplot2` from drawing the standard error band that appears around the smooth line, i.e.
```{r }
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth(method = lm, se = FALSE)
```
Parameters are different from mappings because you do not set a parameter to a variable in the data set. `ggplot2` uses the value of a parameter directly. In contrast, to use a mapping, `ggplot2` must create a system of equivalencies between values of a variable and levels of an aesthetic.
##### Aesthetics as parameters
The distinction between parameters and mappings makes it easy to customize your graphs. Suppose you want to make a graph like the one below. How would you do it?
```{r echo = FALSE}
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(color = "blue")
```
If you add `color = "blue"` to the mappings argument, you will get an unexpected result.
```{r}
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
```
`ggplot2` treats `color = "blue"` as a mapping. It assumes that "blue" is a value in the data space. It uses R's recycling rules to assign the single value "blue" to each row of data. Then it creates a mapping from the value "blue" in the data space to the pinkish color that we see in the visual space. It even creates a legend to let you know that the color pink represents the value "blue." The choice of pink is a coincidence; `ggplot2` defaults to pink whenever a single discrete value is mapped to the color aesthetic.
This is not what we want. We want to set the color to blue. In short, we want to treat the color of the points like a parameter and set it directly.
To set an aesthetic as if it were a parameter, set it _outside_ of the `mapping` argument. This will place it outside of the `aes()` function as well.
```{r}
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
```
`ggplot2` will treat assignments that appear in the `aes()` call of the mapping argument as mappings. It will treat assignments that appear outside of the mapping argument as parameters.
As with aesthetics, different geoms respond to different parameters. How do you know which parameters to use with a geom? You can always treat a geom's aesthetics as parameters. You can also spot additional parameters by identifying a geom's stat.
### Stats
How does `ggplot2` know where to place the line in our smooth plot?
```{r echo = FALSE, message = FALSE, fig.show='hold', fig.width=4, fig.height=4}
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth()
```
The y values of the line do not appear in our data set, nor did we give the y values to `ggplot2`. `ggplot2` calculated the y values by applying an algorithm to the data. In this case, `ggplot2` applied a smoothing algorithm to the data.
Many types of graphs plot information that does not appear in the raw data. To do this, the graph first applies an algorithm to the raw data and then plots the results. For example, a boxplot calculates the first, second, and third quartiles of a data set and then plots those summary statistics (among others). A histogram bins the raw data and then counts how many points fall into each bin. It plots those counts on the y axis.
`ggplot2` calls these algorithms _stats_, which is short for statistical transformation. Stats are handled automatically in `ggplot2`. Not every geom uses a stat; but when one does, `ggplot2` will apply the stat in the background.
You can fine tune how a geom implements a stat by passing the geom parameters for the stat to use. To discover which stat a geom uses, visit the geom's help page.
For example, the `?geom_smooth` help page shows that `geom_smooth()` uses the `stat_smooth()` stat by default. If you then open the `?stat_smooth` help page, you will see that `stat_smooth()` takes the arguments `method` and `se` among others. With `ggplot2`, you can supply arguments to the stat called by a geom, by passing the arguments as parameters to the geom.
***
In general practice, you do not need to worry much about stats. Usually one geom will be closely associated with one stat, and `ggplot2` will implement the stat by default. However, stats are an integral part of the `ggplot2` package that you are welcome to modify. To learn more about `ggplot2`'s stat system, see [ggplot2: Elegant Graphics for Data Analysis](http://www.amazon.com/dp/0387981403/ref=cm_sw_su_dp?tag=ggplot2-20).
### Position
What if you didn't want a stacked bar chart? What if you wanted the chart below? Could you make it?
At the beginning of this section, you learned how to use the fill aesthetic to make a stacked bar chart.
```{r}
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity))
```
But what if you don't want a stacked bar chart? What if you want the chart below? Could you make it?
```{r echo = FALSE}
ggplot(data = diamonds) +
@ -803,7 +796,7 @@ ggplot(data = mpg) +
### Coordinate systems
You can make your bar charts even more versatile by changing the coordinate system of your plot. For example, you could flip the x and y axes of your plot, or you could plot your bar chart on polar coordinates, which creates a coxcomb plot.
You can make your bar charts even more versatile by changing the coordinate system of your plot. For example, you could flip the x and y axes of your plot, or you could plot your bar chart on polar coordinates to make a coxcomb plot or a polar clock chart.
```{r echo = FALSE, message = FALSE, fig.show='hold', fig.width=3, fig.height=4}
ggplot(data = diamonds) +
@ -814,6 +807,9 @@ ggplot(data = diamonds) +
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut), width = 1) +
coord_polar()
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut), width = 1) +
coord_polar(theta = "y")
```
To change the coordinate system of your plot, add a `coord_` function to your plot call. `ggplot2` comes with seven coordinate functions that each implement a different coordinate system.
@ -879,9 +875,7 @@ Add `coord_map()` or `coord_quickmap()` to plot map data on a cartographic proje
#### Polar coordinates
Add `coord_polar()` to your plot to plot your data in polar coordinates. By default, `ggplot2` will map your y variable to $r$ and your x variable to $\theta$. Reverse this behavior with the argument `theta = "y"`.
You can also use the `start` argument to control where in the plot your data starts (an offset from 12 o'clock, given in radians), and the `direction` argument to control the orientation of the plot (1 for clockwise, -1 for anti-clockwise).
Add `coord_polar()` to your plot to plot your data in polar coordinates.
```{r}
ggplot(data = diamonds) +
@ -889,7 +883,17 @@ ggplot(data = diamonds) +
coord_polar()
```
Ignore `width = 1` for now. We will cover the argument in the section on parameters below.
By default, `ggplot2` will map your y variable to $r$ and your x variable to $\theta$. When applied to a bar chart, this creates a coxcomb plot.
Reverse this behavior with the argument `theta = "y"`. When applied to a bar chart, this creates a polar clock chart.
```{r}
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut), width = 1) +
coord_polar(theta = "y")
```
You can also use the `start` argument to control where in the plot your data starts (an offset from 12 o'clock, given in radians), and the `direction` argument to control the orientation of the plot (1 for clockwise, -1 for anti-clockwise).
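Here is a small sketch of both arguments, offered as an illustration and not as the chapter's own example. It assumes, following the `?coord_polar` help page, that `start` is an offset from 12 o'clock measured in radians, so `pi / 2` would begin the plot at the 3 o'clock position. The name `polar` is made up for the example.

```{r}
library(ggplot2)

# Begin a quarter turn from 12 o'clock and draw anti-clockwise.
# `start` is assumed to be in radians, as described in ?coord_polar.
polar <- coord_polar(start = pi / 2, direction = -1)

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = cut), width = 1) +
  polar
```

Storing the coordinate system in a variable is optional; you can also add `coord_polar(start = pi / 2, direction = -1)` to the plot call directly.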
***
@ -924,6 +928,8 @@ ggplot(data = mpg) +
## The Grammar of Graphics
> "Wax on. Wax off."---Mr. Miyagi. *The Karate Kid* (1984)
### Layers
## Visualizing Distributions