From 90bd4fb1d137f90c23ecbee9f5cd313355f259e2 Mon Sep 17 00:00:00 2001
From: Garrett
Date: Thu, 7 Apr 2016 14:26:17 -0400
Subject: [PATCH] Small copy edits edits to relational-data.Rmd

---
 relational-data.Rmd | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/relational-data.Rmd b/relational-data.Rmd
index c81881e..9b8430c 100644
--- a/relational-data.Rmd
+++ b/relational-data.Rmd
@@ -20,11 +20,11 @@ To work with relational data you need verbs that work with pairs of tables. Ther
 
 * __Set operations__, which treat observations like they were set elements.
 
-The most common place to find relational data is in a _relational_ database management system, a term that encompasses almost all modern databases. If you've used a database before, you've almost certainly used SQL. If so, you should find the concepts in this chapter familiar, although their expression in dplyr is a little different. Generally, dplyr is a little easier to use than SQL because it's specialised to data analysis: it makes common data analysis operations easier, at the expense of making it difficult to do other things.
+The most common place to find relational data is in a _relational_ database management system, a term that encompasses almost all modern databases. If you've used a database before, you've almost certainly used SQL. If so, you should find the concepts in this chapter familiar, although their expression in dplyr is a little different. Generally, dplyr is a little easier to use than SQL because dplyr is specialised to data analysis: it makes common data analysis operations easier, at the expense of making it difficult to do other things.
 
 ## nycflights13 {#nycflights13-relational}
 
-You'll learn about relational data with other datasets from the nycflights13 package. As well as the `flights` table that you've worked with so far, nycflights13 contains four other related data frames:
+You can use the nycflights13 package to learn about relational data. nycflights13 contains four data frames that are related to the `flights` table that you used in Data Transformation:
 
 * `airlines` lets you look up the full carrier name from its abbreviated code:
 
@@ -66,7 +66,7 @@ For nycflights13:
   connects to `airlines` with the `carrier` variable.
 
 * `flights` connects to `airports` in two ways: via the `origin` or the
-  `dest`.
+  `dest` variables.
 
 * `flights` connects to `weather` via `origin` (the location), and
   `year`, `month`, `day` and `hour` (the time).
@@ -101,11 +101,10 @@ There are two types of keys:
 
 * A __primary key__ uniquely identifies an observation in its own table.
   For example, `planes$tailnum` is a primary key because it uniquely identifies
-  each plane.
+  each plane in the `planes` table.
 
 * A __foreign key__ uniquely identifies an observation in another table.
-  For example, the `flights$tailnum` is a foreign key because it matches each
-  flight to a unique plane.
+  For example, the `flights$tailnum` is a foreign key because it appears in the `flights` table where it matches each flight to a unique plane.
 
 A variable can be both part of primary key _and_ a foreign key. For example,
 `origin` is part of the `weather` primary key, and is also a foreign key for the `airport` table.