diff --git a/EDA.qmd b/EDA.qmd
index 76975c8..26de8cd 100644
--- a/EDA.qmd
+++ b/EDA.qmd
@@ -490,8 +490,9 @@ ggplot(mpg, aes(x = hwy, y = fct_reorder(class, hwy, median))) +
     What do you learn?
     How do you interpret the plots?
 
-5. Compare and contrast `geom_violin()` with a faceted `geom_histogram()`, or a colored `geom_freqpoly()`.
-    What are the pros and cons of each method?
+5. Create a visualization of diamond prices vs. a categorical variable from the `diamonds` dataset using `geom_violin()`, then a faceted `geom_histogram()`, and then a colored `geom_freqpoly()`.
+    Compare and contrast the three plots.
+    What are the pros and cons of each method of visualizing the distribution of a numerical variable based on the levels of a categorical variable?
 
 6. If you have a small dataset, it's sometimes useful to use `geom_jitter()` to see the relationship between a continuous and categorical variable.
     The ggbeeswarm package provides a number of methods similar to `geom_jitter()`.
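
Reviewer note: to check that the reworded exercise 5 is answerable as written, here is a minimal sketch of the three plots it asks for. This is not part of the patch. It assumes `cut` as the categorical variable (the exercise leaves that choice to the reader), and the `binwidth` of 500 and the object names `p_violin`, `p_hist`, and `p_freqpoly` are illustrative choices, not taken from the book.

```r
library(ggplot2)

# Assumption: `cut` is the categorical variable; the exercise does not name one.

# 1. Violin plot: one density shape per level of cut.
p_violin <- ggplot(diamonds, aes(x = cut, y = price)) +
  geom_violin()

# 2. Faceted histogram: one panel per level of cut.
#    binwidth = 500 is an arbitrary illustrative choice.
p_hist <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 500) +
  facet_wrap(~cut, ncol = 1, scales = "free_y")

# 3. Colored frequency polygons: all levels overlaid on shared axes.
p_freqpoly <- ggplot(diamonds, aes(x = price, color = cut)) +
  geom_freqpoly(binwidth = 500)

# Print each plot to render it.
p_violin
p_hist
p_freqpoly
```

Each call renders with the `diamonds` data that ships with ggplot2, so the new wording appears workable with any of the dataset's categorical variables (`cut`, `color`, or `clarity`).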