From fbb738e799784fe678f9fb56e6501e93959feb0a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mine=20=C3=87etinkaya-Rundel?=
Date: Sun, 21 Feb 2021 17:32:37 +0000
Subject: [PATCH] Enumerate exercise subparts with letters

---
 communicate-plots.Rmd | 10 +++----
 iteration.Rmd         | 26 ++++++++---------
 relational-data.Rmd   | 10 +++----
 rmarkdown.Rmd         |  6 ++--
 strings.Rmd           | 66 +++++++++++++++++++------------------------
 tibble.Rmd            | 11 +++-----
 transform.Rmd         | 14 ++++-----
 vectors.Rmd           | 16 ++++-------
 8 files changed, 70 insertions(+), 89 deletions(-)

diff --git a/communicate-plots.Rmd b/communicate-plots.Rmd
index a240703..63dae53 100644
--- a/communicate-plots.Rmd
+++ b/communicate-plots.Rmd
@@ -495,11 +495,11 @@ Note that all colour scales come in two variety: `scale_colour_x()` and `scale_f
 
 3. Change the display of the presidential terms by:
 
-    1. Combining the two variants shown above.
-    2. Improving the display of the y axis.
-    3. Labelling each term with the name of the president.
-    4. Adding informative plot labels.
-    5. Placing breaks every 4 years (this is trickier than it seems!).
+    a. Combining the two variants shown above.
+    b. Improving the display of the y axis.
+    c. Labelling each term with the name of the president.
+    d. Adding informative plot labels.
+    e. Placing breaks every 4 years (this is trickier than it seems!).
 
 4. Use `override.aes` to make the legend on the following plot easier to see.
 
diff --git a/iteration.Rmd b/iteration.Rmd
index c689a60..5a5da55 100644
--- a/iteration.Rmd
+++ b/iteration.Rmd
@@ -100,10 +100,10 @@ Then we'll move on some variations of the for loop that help you solve other pro
 
 1. Write for loops to:
 
-    1. Compute the mean of every column in `mtcars`.
-    2. Determine the type of each column in `nycflights13::flights`.
-    3. Compute the number of unique values in each column of `palmerpenguins::penguins`.
-    4. Generate 10 random normals from distributions with means of -10, 0, 10, and 100.
+    a. Compute the mean of every column in `mtcars`.
+    b. Determine the type of each column in `nycflights13::flights`.
+    c. Compute the number of unique values in each column of `palmerpenguins::penguins`.
+    d. Generate 10 random normals from distributions with means of -10, 0, 10, and 100.
 
     Think about the output, sequence, and body **before** you start writing the loop.
 
@@ -132,13 +132,9 @@ Then we'll move on some variations of the for loop that help you solve other pro
 
 3. Combine your function writing and for loop skills:
 
-    1. Write a for loop that `prints()` the lyrics to the children's song "Alice the camel".
-
-    2. Convert the nursery rhyme "ten in the bed" to a function.
-        Generalise it to any number of people in any sleeping structure.
-
-    3. Convert the song "99 bottles of beer on the wall" to a function.
-        Generalise to any number of any vessel containing any liquid on any surface.
+    a. Write a for loop that `prints()` the lyrics to the children's song "Alice the camel".
+    b. Convert the nursery rhyme "ten in the bed" to a function. Generalise it to any number of people in any sleeping structure.
+    c. Convert the song "99 bottles of beer on the wall" to a function. Generalise to any number of any vessel containing any liquid on any surface.
 
 4. It's common to see for loops that don't preallocate the output and instead increase the length of a vector at each step:
 
@@ -634,10 +630,10 @@ I focus on purrr functions here because they have more consistent names and argu
 
 1. Write code that uses one of the map functions to:
 
-    1. Compute the mean of every column in `mtcars`.
-    2. Determine the type of each column in `nycflights13::flights`.
-    3. Compute the number of unique values in each column of `palmerpenguins::penguins`.
-    4. Generate 10 random normals from distributions with means of -10, 0, 10, and 100.
+    a. Compute the mean of every column in `mtcars`.
+    b. Determine the type of each column in `nycflights13::flights`.
+    c. Compute the number of unique values in each column of `palmerpenguins::penguins`.
+    d. Generate 10 random normals from distributions with means of -10, 0, 10, and 100.
 
 2. How can you create a single vector that for each column in a data frame indicates whether or not it's a factor?
 
diff --git a/relational-data.Rmd b/relational-data.Rmd
index e11dadb..f11c142 100644
--- a/relational-data.Rmd
+++ b/relational-data.Rmd
@@ -167,11 +167,11 @@ For example, in this data there's a many-to-many relationship between airlines a
 
 2. Identify the keys in the following datasets
 
-    1. `Lahman::Batting`,
-    2. `babynames::babynames`
-    3. `nasaweather::atmos`
-    4. `fueleconomy::vehicles`
-    5. `ggplot2::diamonds`
+    a. `Lahman::Batting`,
+    b. `babynames::babynames`
+    c. `nasaweather::atmos`
+    d. `fueleconomy::vehicles`
+    e. `ggplot2::diamonds`
 
     (You might need to install some packages and read some documentation.)
 
diff --git a/rmarkdown.Rmd b/rmarkdown.Rmd
index c65e58b..ce108ac 100644
--- a/rmarkdown.Rmd
+++ b/rmarkdown.Rmd
@@ -124,9 +124,9 @@ If you forget, you can get to a handy reference sheet with *Help \> Markdown Qui
 
 2. Using the R Markdown quick reference, figure out how to:
 
-    1. Add a footnote.
-    2. Add a horizontal rule.
-    3. Add a block quote.
+    a. Add a footnote.
+    b. Add a horizontal rule.
+    c. Add a block quote.
 
 3. Copy and paste the contents of `diamond-sizes.Rmd` from in to a local R markdown document.
     Check that you can run it, then add text after the frequency polygon that describes its most striking features.
diff --git a/strings.Rmd b/strings.Rmd
index 8da69d9..c89873c 100644
--- a/strings.Rmd
+++ b/strings.Rmd
@@ -314,10 +314,10 @@ For example, I'll search for `\bsum\b` to avoid matching `summarise`, `summary`,
 
 2. Given the corpus of common words in `stringr::words`, create regular expressions that find all words that:
 
-    1. Start with "y".
-    2. End with "x"
-    3. Are exactly three letters long. (Don't cheat by using `str_length()`!)
-    4. Have seven letters or more.
+    a. Start with "y".
+    b. End with "x"
+    c. Are exactly three letters long. (Don't cheat by using `str_length()`!)
+    d. Have seven letters or more.
 
     Since this list is long, you might want to use the `match` argument to `str_view()` to show only the matching or non-matching words.
 
@@ -360,14 +360,10 @@ str_view(c("grey", "gray"), "gr(e|a)y")
 
 1. Create regular expressions to find all words that:
 
-    1. Start with a vowel.
-
-    2. That only contain consonants.
-        (Hint: thinking about matching "not"-vowels.)
-
-    3. End with `ed`, but not with `eed`.
-
-    4. End with `ing` or `ise`.
+    a. Start with a vowel.
+    b. That only contain consonants. (Hint: thinking about matching "not"-vowels.)
+    c. End with `ed`, but not with `eed`.
+    d. End with `ing` or `ise`.
 
 2. Empirically verify the rule "i before e except after c".
 
@@ -423,16 +419,16 @@ str_view(x, 'C[LX]+?')
 
 2. Describe in words what these regular expressions match: (read carefully to see if I'm using a regular expression or a string that defines a regular expression.)
 
-    1. `^.*$`
-    2. `"\\{.+\\}"`
-    3. `\d{4}-\d{2}-\d{2}`
-    4. `"\\\\{4}"`
+    a. `^.*$`
+    b. `"\\{.+\\}"`
+    c. `\d{4}-\d{2}-\d{2}`
+    d. `"\\\\{4}"`
 
 3. Create regular expressions to find all words that:
 
-    1. Start with three consonants.
-    2. Have three or more vowels in a row.
-    3. Have two or more vowel-consonant pairs in a row.
+    a. Start with three consonants.
+    b. Have three or more vowels in a row.
+    c. Have two or more vowel-consonant pairs in a row.
 
 4. Solve the beginner regexp crosswords at .
 
@@ -454,19 +450,17 @@ str_view(fruit, "(..)\\1", match = TRUE)
 
 1. Describe, in words, what these expressions will match:
 
-    1. `(.)\1\1`
-    2. `"(.)(.)\\2\\1"`
-    3. `(..)\1`
-    4. `"(.).\\1.\\1"`
-    5. `"(.)(.)(.).*\\3\\2\\1"`
+    a. `(.)\1\1`
+    b. `"(.)(.)\\2\\1"`
+    c. `(..)\1`
+    d. `"(.).\\1.\\1"`
+    e. `"(.)(.)(.).*\\3\\2\\1"`
 
 2. Construct regular expressions to match words that:
 
-    1. Start and end with the same character.
-
-    2. Contain a repeated pair of letters (e.g. "church" contains "ch" repeated twice.)
-
-    3. Contain one letter repeated in at least three places (e.g. "eleven" contains three "e"s.)
+    a. Start and end with the same character.
+    b. Contain a repeated pair of letters (e.g. "church" contains "ch" repeated twice.)
+    c. Contain one letter repeated in at least three places (e.g. "eleven" contains three "e"s.)
 
 ## Tools
 
@@ -666,11 +660,9 @@ The second function will have the suffix `_all`.
 
 1. For each of the following challenges, try solving it by using both a single regular expression, and a combination of multiple `str_detect()` calls.
 
-    1. Find all words that start or end with `x`.
-
-    2. Find all words that start with a vowel and end with a consonant.
-
-    3. Are there any words that contain at least one of each different vowel?
+    a. Find all words that start or end with `x`.
+    b. Find all words that start with a vowel and end with a consonant.
+    c. Are there any words that contain at least one of each different vowel?
 
 2. What word has the highest number of vowels?
     What word has the highest proportion of vowels?
@@ -1048,8 +1040,8 @@ The main difference is the prefix: `str_` vs. `stri_`.
 
 1. Find the stringi functions that:
 
-    1. Count the number of words.
-    2. Find duplicated strings.
-    3. Generate random text.
+    a. Count the number of words.
+    b. Find duplicated strings.
+    c. Generate random text.
 
 2. How do you control the language that `stri_sort()` uses for sorting?
diff --git a/tibble.Rmd b/tibble.Rmd
index ce41ede..92bf716 100644
--- a/tibble.Rmd
+++ b/tibble.Rmd
@@ -184,13 +184,10 @@ With tibbles, `[` always returns another tibble.
 
 4. Practice referring to non-syntactic names in the following data frame by:
 
-    1. Extracting the variable called `1`.
-
-    2. Plotting a scatterplot of `1` vs `2`.
-
-    3. Creating a new column called `3` which is `2` divided by `1`.
-
-    4. Renaming the columns to `one`, `two` and `three`.
+    a. Extracting the variable called `1`.
+    b. Plotting a scatterplot of `1` vs `2`.
+    c. Creating a new column called `3` which is `2` divided by `1`.
+    d. Renaming the columns to `one`, `two` and `three`.
 
     ```{r}
     annoying <- tibble(
diff --git a/transform.Rmd b/transform.Rmd
index bd7763c..73614ea 100644
--- a/transform.Rmd
+++ b/transform.Rmd
@@ -229,13 +229,13 @@ filter(df, is.na(x) | x > 1)
 
 1. Find all flights that
 
-    1. Had an arrival delay of two or more hours
-    2. Flew to Houston (`IAH` or `HOU`)
-    3. Were operated by United, American, or Delta
-    4. Departed in summer (July, August, and September)
-    5. Arrived more than two hours late, but didn't leave late
-    6. Were delayed by at least an hour, but made up over 30 minutes in flight
-    7. Departed between midnight and 6am (inclusive)
+    a. Had an arrival delay of two or more hours
+    b. Flew to Houston (`IAH` or `HOU`)
+    c. Were operated by United, American, or Delta
+    d. Departed in summer (July, August, and September)
+    e. Arrived more than two hours late, but didn't leave late
+    f. Were delayed by at least an hour, but made up over 30 minutes in flight
+    g. Departed between midnight and 6am (inclusive)
 
 2. Another useful dplyr filtering helper is `between()`.
     What does it do?
diff --git a/vectors.Rmd b/vectors.Rmd
index aea855d..b54b9b6 100644
--- a/vectors.Rmd
+++ b/vectors.Rmd
@@ -412,14 +412,10 @@ The distinction between `[` and `[[` is most important for lists, as we'll see s
 
 4. Create functions that take a vector as input and returns:
 
-    1. The last value.
-        Should you use `[` or `[[`?
-
-    2. The elements at even numbered positions.
-
-    3. Every element except the last value.
-
-    4. Only even numbers (and no missing values).
+    a. The last value. Should you use `[` or `[[`?
+    b. The elements at even numbered positions.
+    c. Every element except the last value.
+    d. Only even numbers (and no missing values).
 
 5. Why is `x[-which(x > 0)]` not the same as `x[x <= 0]`?
 
@@ -561,8 +557,8 @@ knitr::include_graphics("images/pepper-3.jpg")
 
 1. Draw the following lists as nested sets:
 
-    1. `list(a, b, list(c, d), list(e, f))`
-    2. `list(list(list(list(list(list(a))))))`
+    a. `list(a, b, list(c, d), list(e, f))`
+    b. `list(list(list(list(list(list(a))))))`
 
 2. What happens if you subset a tibble as if you're subsetting a list?
     What are the key differences between a list and a tibble?