diff --git a/transform.Rmd b/transform.Rmd
index ce142ee..3193d91 100644
--- a/transform.Rmd
+++ b/transform.Rmd
@@ -813,6 +813,9 @@ daily %>%
    Which is more important: arrival delay or departure delay?
 
+1. Our definition of cancelled flights (`is.na(dep_delay) | is.na(arr_delay)`)
+   is slightly suboptimal. Why? Which is the most important column?
+
 1. Look at the number of cancelled flights per day. Is there a pattern?
    Is the proportion of cancelled flights related to the average delay?