diff --git a/relational-data.Rmd b/relational-data.Rmd
index f11c142..4699e02 100644
--- a/relational-data.Rmd
+++ b/relational-data.Rmd
@@ -95,11 +95,6 @@ For nycflights13:
 3. `weather` only contains information for the origin (NYC) airports.
     If it contained weather records for all airports in the USA, what additional relation would it define with `flights`?

-4. We know that some days of the year are "special", and fewer people than usual fly on them.
-    How might you represent that data as a data frame?
-    What would be the primary keys of that table?
-    How would it connect to the existing tables?
-
 ## Keys

 The variables used to connect each pair of tables are called **keys**.
@@ -165,7 +160,12 @@ For example, in this data there's a many-to-many relationship between airlines a

 1. Add a surrogate key to `flights`.

-2. Identify the keys in the following datasets
+2. We know that some days of the year are "special", and fewer people than usual fly on them.
+    How might you represent that data as a data frame?
+    What would be the primary keys of that table?
+    How would it connect to the existing tables?
+
+3. Identify the keys in the following datasets

     a. `Lahman::Batting`,
     b. `babynames::babynames`
@@ -175,7 +175,7 @@ For example, in this data there's a many-to-many relationship between airlines a

     (You might need to install some packages and read some documentation.)

-3. Draw a diagram illustrating the connections between the `Batting`, `People`, and `Salaries` tables in the Lahman package.
+4. Draw a diagram illustrating the connections between the `Batting`, `People`, and `Salaries` tables in the Lahman package.
     Draw another diagram that shows the relationship between `People`, `Managers`, `AwardsManagers`.
     How would you characterise the relationship between the `Batting`, `Pitching`, and `Fielding` tables?