diff --git a/diagrams/relational-nycflights.png b/diagrams/relational-nycflights.png
deleted file mode 100644
index 10b04ce..0000000
Binary files a/diagrams/relational-nycflights.png and /dev/null differ
diff --git a/diagrams/relational.graffle b/diagrams/relational.graffle
index ec63ac3..452e14e 100644
Binary files a/diagrams/relational.graffle and b/diagrams/relational.graffle differ
diff --git a/diagrams/relational.png b/diagrams/relational.png
new file mode 100644
index 0000000..40cc9b1
Binary files /dev/null and b/diagrams/relational.png differ
diff --git a/joins.qmd b/joins.qmd
index 7976c13..4063009 100644
--- a/joins.qmd
+++ b/joins.qmd
@@ -9,25 +9,21 @@ status("restructuring")
 
 ## Introduction
 
-Waiting on <https://github.com/tidyverse/dplyr/pull/5910>
-
-<!-- TODO: redraw all diagrams to match O'Reilly style -->
+<!-- TODO: redraw all diagrams to match O'Reilly style. From one to many on -->
 
 It's rare that a data analysis involves only a single data frame.
 Typically you have many data frames, and you must **join** them together to answer the questions that you're interested in.
-
 All the verbs in this chapter use a pair of data frames.
-Fortunately this is enough, since you can combine three data frames by combining two pairs.
-Sometimes both elements of a pair will be the same data frame.
-This is needed if, for example, you have a data frame of people, and each person has a reference to their parents.
+Fortunately this is enough, since you can solve any more complex problem by joining data frames a pair at a time.
 
-There are two important types of joins.
-**Mutating joins** adds new variables to one data frame from matching observations in another.
-**Filtering joins**, which filters observations from one data frame based on whether or not they match an observation in another.
+You'll learn about two important types of joins in this chapter:
 
-If you're familiar with SQL, you should find these ideas very familiar as their realization in dplyr is very similar.
+-   **Mutating joins** add new variables to one data frame from matching observations in another.
+-   **Filtering joins** filter observations from one data frame based on whether or not they match an observation in another.
+
+If you're familiar with SQL, you should find the ideas in this chapter familiar, as their realization in dplyr is very similar.
 We'll point out any important differences as we go.
-Don't worry if you're not familiar with SQL, we'll back to it in @sec-import-databases.
+Don't worry if you're not familiar with SQL, as you'll learn more about it in @sec-import-databases.
 
 ### Prerequisites
 
@@ -43,7 +39,7 @@ library(nycflights13)
 
 ## nycflights13 {#sec-nycflights13-relational}
 
-nycflights13 contains five tibbles : `airlines`, `airports`, `weather` and `planes` which are all related to the `flights` data frame that you used in @sec-data-transform on data transformation:
+As well as the `flights` data frame that you used in @sec-data-transform, nycflights13 contains four additional related tibbles:
 
 -   `airlines` lets you look up the full carrier name from its abbreviated code:
 
@@ -71,13 +67,13 @@ nycflights13 contains five tibbles : `airlines`, `airports`, `weather` and `plan
 
 These datasets are connected as follows:
 
--   `flights` connects to `planes` via a single variable, `tailnum`.
+-   `flights` connects to `planes` through the `tailnum` variable.
 
 -   `flights` connects to `airlines` through the `carrier` variable.
 
--   `flights` connects to `airports` in two ways: via the `origin` and `dest` variables.
+-   `flights` connects to `airports` in two ways: through the origin (`origin`) and through the destination (`dest`).
 
--   `flights` connects to `weather` via `origin` (the location), and `year`, `month`, `day` and `hour` (the time).
+-   `flights` connects to `weather` through two variables at the same time: the location (`origin`) and the time (`time_hour`).
 
 One way to show the relationships between the different data frames is with a diagram, as in @fig-flights-relationships.
 This diagram is a little overwhelming, but it's simple compared to some you'll see in the wild!
@@ -87,20 +83,22 @@ You don't need to understand the whole thing; you just need to understand the ch
 ```{r}
 #| label: fig-flights-relationships
 #| echo: false
+#| out-width: ~
 #| fig-cap: >
-#|   Connections between all six data frames in the nycflights package.
+#|   Connections between all five data frames in the nycflights13 package.
 #| fig-alt: >
 #|   Diagram showing the relationships between airports, planes, flights, 
 #|   weather, and airlines datasets from the nycflights13 package. The faa
 #|   variable in the airports data frame is connected to the origin and dest
 #|   variables in the flights data frame. The tailnum variable in the planes
-#|   data frame is connected to the tailnum variable in flights. The year,
-#|   month, day, hour, and origin variables are connected to the variables
-#|   with the same name in the flights data frame. And finally the carrier
-#|   variables in the airlines data frame is connected to the carrier
-#|   variable in the flights data frame. There are no direct connections
-#|   between airports, planes, airlines, and weather data frames.
-knitr::include_graphics("diagrams/relational-nycflights.png")
+#|   data frame is connected to the tailnum variable in flights. The
+#|   time_hour and origin variables in the weather data frame are connected
+#|   to the variables with the same name in the flights data frame. And
+#|   finally the carrier variable in the airlines data frame is connected
+#|   to the carrier variable in the flights data frame. There are no direct
+#|   connections between airports, planes, airlines, and weather data 
+#|   frames.
+knitr::include_graphics("diagrams/relational.png", dpi = 270)
 ```
 
 ### Exercises
@@ -122,7 +120,7 @@ A key is a variable (or set of variables) that uniquely identifies an observatio
 In simple cases, a single variable is sufficient to identify an observation.
 For example, each plane is uniquely identified by its `tailnum`.
 In other cases, multiple variables may be needed.
-For example, to identify an observation in `weather` you need five variables: `year`, `month`, `day`, `hour`, and `origin`.
+For example, to identify an observation in `weather` you need two variables: `time_hour` and `origin`.
 
 There are two types of keys:
 
@@ -144,26 +142,22 @@ planes |>
   filter(n > 1)
 
 weather |> 
-  count(year, month, day, hour, origin) |> 
+  count(time_hour, origin) |> 
   filter(n > 1)
 ```
 
-Sometimes a data frame doesn't have an explicit primary key: each row is an observation, but no combination of variables reliably identifies it.
-For example, what's the primary key in the `flights` data frame?
-You might think it would be the date plus the flight or tail number, but neither of those are unique:
+Sometimes a data frame doesn't have an explicit primary key and only an unwieldy combination of variables reliably identifies an observation.
+For example, to uniquely identify a flight, we need the hour the flight departs, the carrier, and the flight number:
 
 ```{r}
 flights |> 
-  count(year, month, day, flight) |> 
-  filter(n > 1)
-
-flights |> 
-  count(year, month, day, tailnum) |> 
+  count(time_hour, carrier, flight) |> 
   filter(n > 1)
 ```
 
 When starting to work with this data, we had naively assumed that each flight number would be only used once per day: that would make it much easier to communicate problems with a specific flight.
-Unfortunately that is not the case!
+Unfortunately that is not the case, and we have to assume that a flight number will never be re-used within an hour.
+
 If a data frame lacks a primary key, it's sometimes useful to add one with `mutate()` and `row_number()`.
 That makes it easier to match observations if you've done some filtering and want to check back in with the original data.
 This is called a **surrogate key**.
@@ -180,12 +174,15 @@ For example, in this data there's a many-to-many relationship between airlines a
 
 1.  Add a surrogate key to `flights`.
 
-2.  We know that some days of the year are "special", and fewer people than usual fly on them.
+2.  The `year`, `month`, `day`, `hour`, and `origin` variables almost form a compound key for `weather`, but there's one hour that has duplicate observations.
+    Can you figure out what's special about this time?
+
+3.  We know that some days of the year are "special", and fewer people than usual fly on them.
     How might you represent that data as a data frame?
     What would be the primary keys of that data frame?
     How would it connect to the existing data frames?
 
-3.  Identify the keys in the following datasets
+4.  Identify the keys in the following datasets:
 
     a.  `Lahman::Batting`
     b.  `babynames::babynames`
@@ -195,7 +192,7 @@ For example, in this data there's a many-to-many relationship between airlines a
 
     (You might need to install some packages and read some documentation.)
 
-4.  Draw a diagram illustrating the connections between the `Batting`, `People`, and `Salaries` data frames in the Lahman package.
+5.  Draw a diagram illustrating the connections between the `Batting`, `People`, and `Salaries` data frames in the Lahman package.
     Draw another diagram that shows the relationship between `People`, `Managers`, `AwardsManagers`.
 
     How would you characterise the relationship between the `Batting`, `Pitching`, and `Fielding` data frames?