diff --git a/variation.Rmd b/variation.Rmd
index 5f6c075..449acff 100644
--- a/variation.Rmd
+++ b/variation.Rmd
@@ -122,25 +122,25 @@ The strategy of counting the number of observations at each value breaks down fo
 To get around this, data scientists divide the range of a continuous variable into equally spaced intervals, a process called _binning_.
 
 ```{r, echo = FALSE}
-knitr::include_graphics("images/visualization-17.png")
+# knitr::include_graphics("images/visualization-17.png")
 ```
 
 They then count how many observations fall into each bin.
 
 ```{r, echo = FALSE}
-knitr::include_graphics("images/visualization-18.png")
+# knitr::include_graphics("images/visualization-18.png")
 ```
 
 And display the count as a bar, or some other object.
 
 ```{r, echo = FALSE}
-knitr::include_graphics("images/visualization-19.png")
+# knitr::include_graphics("images/visualization-19.png")
 ```
 
 This method is temperamental because the appearance of the distribution can change dramatically if the bin size changes. As no bin size is "correct," you should explore several bin sizes when examining data.
 
 ```{r, echo = FALSE}
-knitr::include_graphics("images/visualization-20.png")
+# knitr::include_graphics("images/visualization-20.png")
 ```
 
 Several geoms exist to help you visualize continuous distributions. They almost all use the "bin" stat to implement the above strategy. For each of these geoms, you can set the following arguments for "bin" to use:
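
The binning strategy described in the context lines of this hunk can be sketched in base R. This is only an illustration under stated assumptions, not code from `variation.Rmd`: the sample vector `x`, the helper `bin_counts()`, and the choice to start bins at 0 are all hypothetical. It divides a continuous variable into equally spaced intervals, counts the observations in each bin, and repeats the count with a second bin width to show how much the result depends on that choice.

```r
# Hypothetical sample of a non-negative continuous variable (illustrative values).
x <- c(0.2, 0.7, 1.1, 1.4, 1.9, 2.3, 2.8, 3.5, 3.6, 4.9)

# Hypothetical helper: count observations in equally spaced bins of a given width.
# Assumes all values are >= 0, so the first bin can start at 0.
bin_counts <- function(values, binwidth) {
  # Place the last break strictly above the maximum so every value lands in a bin.
  upper <- (floor(max(values) / binwidth) + 1) * binwidth
  breaks <- seq(0, upper, by = binwidth)
  # right = FALSE makes each bin closed on the left: [a, b).
  table(cut(values, breaks = breaks, right = FALSE))
}

# The same data summarized with two different bin widths.
wide <- bin_counts(x, binwidth = 2.5)
narrow <- bin_counts(x, binwidth = 1)

print(wide)
print(narrow)
```

With this sample, the two wide bins hold 6 and 4 observations, while the unit-width bins hold 2, 3, 2, 2, and 1, so the same data look quite different depending on the bin size. That is the sensitivity the text warns about when it recommends exploring several bin sizes.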