More about lists

2022-11-08 16:19:14 -06:00 · 2022-11-08 16:19:14 -06:00 · 3b9d54db7a
parent 0973a0dea8
commit 3b9d54db7a
1 changed files with 59 additions and 18 deletions
--- a/iteration.qmd
+++ b/iteration.qmd
@@ -406,10 +406,10 @@ You could do it with copy and paste:
 ```{r}
 #| eval: false
-data2019 <- readr::read_excel("data/y2019.xls")
+data2019 <- readxl::read_excel("data/y2019.xlsx")
-data2020 <- readr::read_excel("data/y2020.xls")
+data2020 <- readxl::read_excel("data/y2020.xlsx")
-data2021 <- readr::read_excel("data/y2021.xls")
+data2021 <- readxl::read_excel("data/y2021.xlsx")
-data2022 <- readr::read_excel("data/y2022.xls")
+data2022 <- readxl::read_excel("data/y2022.xlsx")
 ```
 And then use `dplyr::bind_rows()` to combine them all together:
@@ -448,21 +448,45 @@ paths
 ### Lists
-Now that we have these 12 paths, we could call `read_excel()` 12 times to get 12 data frames.
+Now that we have these 12 paths, we could call `read_excel()` 12 times to get 12 data frames:
 In general, we won't know how many files there are to read, so instead of saving each data frame to its own variable, we'll put them all into a list, something like this:
 ```{r}
 #| eval: false
-list(
+gapminder_1952 <- readxl::read_excel("data/gapminder/1952.xlsx")
-  readxl::read_excel("data/gapminder/1952.xls"),
+gapminder_1957 <- readxl::read_excel("data/gapminder/1957.xlsx")
-  readxl::read_excel("data/gapminder/1957.xls"),
+gapminder_1962 <- readxl::read_excel("data/gapminder/1962.xlsx")
-  readxl::read_excel("data/gapminder/1962.xls"),
+ ...
 gapminder_2007 <- readxl::read_excel("data/gapminder/2007.xlsx")
 ```
 But putting each sheet into its own variable is going to make it hard to work with them a few steps down the road.
 Instead, they'll be easier to work with if we put them into a single object.
 A list is the perfect tool for this job:
 ```{r}
 #| eval: false
 files <- list(
  readxl::read_excel("data/gapminder/1952.xlsx"),
  readxl::read_excel("data/gapminder/1957.xlsx"),
  readxl::read_excel("data/gapminder/1962.xlsx"),
  ...,
-  readxl::read_excel("data/gapminder/2007.xls")
+  readxl::read_excel("data/gapminder/2007.xlsx")
 )
 ```
-Something about `[[`
+```{r}
 #| include: false
 files <- map(paths, readxl::read_excel)
 ```
 Now that you have these data frames in a list, how do you get one out?
 You can use `files[[i]]` to extract the ith element:
 ```{r}
 files[[3]]
 ```
 We'll come back to `[[` in more detail in @sec-subset-one.
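As a minimal sketch of the difference between `[[` and `[` on a list, the example below uses a small hand-built list of data frames as a hypothetical stand-in for the Excel files (the column names and values are made up for illustration):

```r
# Hypothetical stand-in for the list of data frames read from the Excel files.
files <- list(
  data.frame(year = 1952, pop = 1),
  data.frame(year = 1957, pop = 2),
  data.frame(year = 1962, pop = 3)
)

third <- files[[3]]  # extracts the element itself: a data frame
sub <- files[3]      # keeps the container: a list of length 1

class(third)  # "data.frame"
class(sub)    # "list"
length(sub)   # 1
```

`[[` is the one to reach for when you want to work with a single data frame; `[` is useful when you want a shorter list.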
 ### `purrr::map()` and `list_rbind()`
@@ -530,17 +554,34 @@ The easiest way to do this is with the `set_names()` function, which can take a
 Here we use `basename()` to extract just the file name from the full path:
 ```{r}
-paths <- paths |> set_names(basename) 
+paths |> set_names(basename) 
 paths
 ```
 Those names are automatically carried along by all the map functions, so the list of data frames will have those same names:
 ```{r}
 files <- paths |> 
  set_names(basename) |> 
  map(readxl::read_excel)
 ```
 That makes this call to `map()` shorthand for:
 ```{r}
 #| eval: false
-paths |> 
+files <- list(
-  map(readxl::read_excel) |> 
+  "1952.xlsx" = readxl::read_excel("data/gapminder/1952.xlsx"),
-  names()
+  "1957.xlsx" = readxl::read_excel("data/gapminder/1957.xlsx"),
  "1962.xlsx" = readxl::read_excel("data/gapminder/1962.xlsx"),
  ...,
  "2007.xlsx" = readxl::read_excel("data/gapminder/2007.xlsx")
 )
 ```
 You can also use `[[` to extract elements by name:
 ```{r}
 files[["1962.xlsx"]]
 ```
 Then we use the `names_to` argument to `list_rbind()` to tell it to save the names into a new column called `year`, and then use `readr::parse_number()` to extract the number from the string.
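The naming steps can be sketched end to end without touching any files. This example assumes purrr 1.0.0 or later (for `list_rbind()`) and readr are installed; `fake_read()` is a hypothetical stand-in for `readxl::read_excel()`, and the two paths are illustrative only:

```r
library(purrr)

# Illustrative paths; nothing is read from disk in this sketch.
paths <- c("data/gapminder/1952.xlsx", "data/gapminder/1957.xlsx")

# Hypothetical stand-in for readxl::read_excel(): returns a one-row
# data frame built from the path instead of reading a file.
fake_read <- function(path) {
  data.frame(source = path, n = nchar(path))
}

# set_names(basename) names each path by its file name, and map()
# carries those names over to the resulting list.
files <- paths |>
  set_names(basename) |>
  map(fake_read)

names(files)

# names_to stores the list names in a new column, which parse_number()
# then reduces to the numeric year.
combined <- list_rbind(files, names_to = "year")
combined$year <- readr::parse_number(combined$year)
combined
```

Swapping `fake_read` for `readxl::read_excel` gives the shape of the pipeline described above, assuming the files exist at those paths.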
@@ -921,7 +962,7 @@ unlink(by_clarity$paths)
 In this chapter, you learned iteration tools to solve three problems that come up frequently when doing data science: manipulating multiple columns, reading multiple files, and saving multiple outputs.
 But in general, iteration is a superpower: if you know the right iteration technique, you can easily go from fixing one problem to fixing any number of problems.
-Once you've mastered the techniques in this chapter, we highly recommend learning more by reading [Functionals chapter](https://adv-r.hadley.nz/functionals.html) of *Advanced R* and consulting the [purrr website](https://purrr.tidyverse.org and the).
+Once you've mastered the techniques in this chapter, we highly recommend learning more by reading the [Functionals chapter](https://adv-r.hadley.nz/functionals.html) of *Advanced R* and consulting the [purrr website](https://purrr.tidyverse.org).
 If you know much about iteration in other languages you might be surprised that we didn't discuss the `for` loop.
 That comes up in the next chapter where we'll discuss some important base R functions.