diff --git a/EDA.Rmd b/EDA.Rmd index 550e2ff..7fd255f 100644 --- a/EDA.Rmd +++ b/EDA.Rmd @@ -171,7 +171,7 @@ ggplot(diamonds) + geom_histogram(mapping = aes(x = y), binwidth = 0.5) ``` -There are so many observations in the common bins that the rare bins are so short that you can't see them (although maybe if you stare intently at 0 you'll spot something). To make it easy to see the unusual values, we need to zoom into to small values of the y-axis with `coord_cartesian()`: +There are so many observations in the common bins that the rare bins are so short that you can't see them (although maybe if you stare intently at 0 you'll spot something). To make it easy to see the unusual values, we need to zoom to small values of the y-axis with `coord_cartesian()`: ```{r} ggplot(diamonds) + @@ -262,7 +262,7 @@ ggplot(data = diamonds2, mapping = aes(x = x, y = y)) + geom_point(na.rm = TRUE) ``` -Other times you want to understand what makes observations with missing values different to observations with recorded values. For example, in `nycflights13::flights`, missing values in the `dep_time` variable indicate that the flight was cancelled. So you might want to compare the scheduled departure times for cancelled and non-cancelled times. You can do by making a new variable with `is.na()`. +Other times you want to understand what makes observations with missing values different to observations with recorded values. For example, in `nycflights13::flights`, missing values in the `dep_time` variable indicate that the flight was cancelled. So you might want to compare the scheduled departure times for cancelled and non-cancelled times. You can do this by making a new variable with `is.na()`. ```{r} nycflights13::flights %>%