diff --git a/model-building.Rmd b/model-building.Rmd
index 861d0ab..02ea9d2 100644
--- a/model-building.Rmd
+++ b/model-building.Rmd
@@ -238,7 +238,7 @@ Note the change in the y-axis: now we are seeing the deviation from the expected
 ```
 
 Our model fails to accurately predict the number of flights on Saturday:
- during summer there are more flights than we expect, and during Fall there
+ during summer there are more flights than we expect, and during fall there
 are fewer. We'll see how we can do better to capture this pattern in the next
 section.
 
@@ -287,7 +287,7 @@ daily %>%
 
 I suspect this pattern is caused by summer holidays: many people go on holiday in the summer, and people don't mind travelling on Saturdays for vacation. Looking at this plot, we might guess that summer holidays are from early June to late August. That seems to line up fairly well with the [state's school terms](http://schools.nyc.gov/Calendar/2013-2014+School+Year+Calendars.htm): summer break in 2013 was Jun 26--Sep 9.
 
-Why are there more Saturday flights in the Spring than the Fall? I asked some American friends and they suggested that it's less common to plan family vacations during the Fall because of the big Thanksgiving and Christmas holidays. We don't have the data to know for sure, but it seems like a plausible working hypothesis.
+Why are there more Saturday flights in spring than fall? I asked some American friends and they suggested that it's less common to plan family vacations during fall because of the big Thanksgiving and Christmas holidays. We don't have the data to know for sure, but it seems like a plausible working hypothesis.
 
 Lets create a "term" variable that roughly captures the three school terms, and check our work with a plot: