From 90470aa721cca5ad0bb34b69a48b1ad23ccdce0e Mon Sep 17 00:00:00 2001
From: mine-cetinkaya-rundel
Date: Tue, 21 Feb 2023 08:54:08 -0500
Subject: [PATCH] Give a bit more direction

---
 EDA.qmd | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/EDA.qmd b/EDA.qmd
index 76975c8..26de8cd 100644
--- a/EDA.qmd
+++ b/EDA.qmd
@@ -490,8 +490,9 @@ ggplot(mpg, aes(x = hwy, y = fct_reorder(class, hwy, median))) +
    What do you learn?
    How do you interpret the plots?
 
-5. Compare and contrast `geom_violin()` with a faceted `geom_histogram()`, or a colored `geom_freqpoly()`.
-   What are the pros and cons of each method?
+5. Create a visualization of diamond prices vs. a categorical variable from the `diamonds` dataset using `geom_violin()`, then a faceted `geom_histogram()`, and then a colored `geom_freqpoly()`.
+   Compare and contrast the three plots.
+   What are the pros and cons of each method of visualizing the distribution of a numerical variable based on the levels of a categorical variable?
 
 6. If you have a small dataset, it's sometimes useful to use `geom_jitter()` to see the relationship between a continuous and categorical variable.
    The ggbeeswarm package provides a number of methods similar to `geom_jitter()`.
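
Review note: the reworded exercise 5 now names three concrete plots, so it may help to check that the instructions are workable as written. The sketch below is one possible reading, not the book's solution. It assumes `cut` as the categorical variable (the exercise leaves that choice open) and a bin width of 500, which is an arbitrary value picked for illustration.

```r
library(ggplot2)

# Assumption: `cut` is the categorical variable; `color` or `clarity`
# from `diamonds` would fit the exercise equally well.

# Violin plot: one mirrored density estimate per level of cut
p_violin <- ggplot(diamonds, aes(x = cut, y = price)) +
  geom_violin()

# Faceted histogram: one panel per level of cut
# (binwidth = 500 is an illustrative choice, not taken from the source)
p_hist <- ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 500) +
  facet_wrap(~cut, ncol = 1, scales = "free_y")

# Colored frequency polygon: all levels overlaid on shared axes
p_freqpoly <- ggplot(diamonds, aes(x = price, color = cut)) +
  geom_freqpoly(binwidth = 500)

# Draw the three plots
print(p_violin)
print(p_hist)
print(p_freqpoly)
```

All three render from the same two columns, which suggests the new wording gives readers enough direction to produce the plots before comparing them.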