Small copy edits edits to relational-data.Rmd

Garrett 2016-04-07 14:26:17 -04:00
parent 67516034f7
commit 90bd4fb1d1
1 changed files with 5 additions and 6 deletions

@@ -20,11 +20,11 @@ To work with relational data you need verbs that work with pairs of tables. Ther
* __Set operations__, which treat observations like they were set elements.
-The most common place to find relational data is in a _relational_ database management system, a term that encompasses almost all modern databases. If you've used a database before, you've almost certainly used SQL. If so, you should find the concepts in this chapter familiar, although their expression in dplyr is a little different. Generally, dplyr is a little easier to use than SQL because it's specialised to data analysis: it makes common data analysis operations easier, at the expense of making it difficult to do other things.
+The most common place to find relational data is in a _relational_ database management system, a term that encompasses almost all modern databases. If you've used a database before, you've almost certainly used SQL. If so, you should find the concepts in this chapter familiar, although their expression in dplyr is a little different. Generally, dplyr is a little easier to use than SQL because dplyr is specialised to data analysis: it makes common data analysis operations easier, at the expense of making it difficult to do other things.
## nycflights13 {#nycflights13-relational}
-You'll learn about relational data with other datasets from the nycflights13 package. As well as the `flights` table that you've worked with so far, nycflights13 contains four other related data frames:
+You can use the nycflights13 package to learn about relational data. nycflights13 contains four data frames that are related to the `flights` table that you used in Data Transformation:
* `airlines` lets you look up the full carrier name from its abbreviated
code:
@@ -66,7 +66,7 @@ For nycflights13:
connects to `airlines` with the `carrier` variable.
* `flights` connects to `airports` in two ways: via the `origin` or the
-`dest`.
+`dest` variables.
* `flights` connects to `weather` via `origin` (the location), and
`year`, `month`, `day` and `hour` (the time).
@@ -101,11 +101,10 @@ There are two types of keys:
* A __primary key__ uniquely identifies an observation in its own table.
For example, `planes$tailnum` is a primary key because it uniquely identifies
-each plane.
+each plane in the `planes` table.
* A __foreign key__ uniquely identifies an observation in another table.
-For example, the `flights$tailnum` is a foreign key because it matches each
-flight to a unique plane.
+For example, the `flights$tailnum` is a foreign key because it appears in the `flights` table where it matches each flight to a unique plane.
A variable can be both part of primary key _and_ a foreign key. For example, `origin` is part of the `weather` primary key, and is also a foreign key for the `airport` table.
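
Reviewer note: the key definitions touched by the last hunk can be checked against the data. The following is a minimal sketch, assuming the dplyr and nycflights13 packages are installed; the names `dup_planes` and `unmatched` are hypothetical and not part of the chapter text in this diff.

```r
library(dplyr)
library(nycflights13)

# Primary key check: if tailnum uniquely identifies each row of planes,
# counting by tailnum and keeping counts above 1 leaves no rows.
dup_planes <- planes %>%
  count(tailnum) %>%
  filter(n > 1)
nrow(dup_planes)

# Foreign key check: list the tailnum values in flights that have no
# matching row in planes, most frequent first.
unmatched <- flights %>%
  anti_join(planes, by = "tailnum") %>%
  count(tailnum, sort = TRUE)
unmatched
```

A zero-row `dup_planes` supports treating `planes$tailnum` as a primary key. Any rows in `unmatched` are flights whose `tailnum` does not resolve to a plane, which is worth knowing before joining the two tables.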