<section data-type="chapter" id="chp-intro">
<h1><span id="sec-intro" class="quarto-section-identifier d-none d-lg-block"><span class="chapter-title">Introduction</span></span></h1><p>Data science is an exciting discipline that allows you to transform raw data into understanding, insight, and knowledge. The goal of “R for Data Science” is to help you learn the most important tools in R that will allow you to do data science efficiently and reproducibly. After reading this book, you’ll have the tools to tackle a wide variety of data science challenges, using the best parts of R.</p>
<section id="what-you-will-learn" data-type="sect1">
<h1>
What you will learn</h1>
<p>Data science is a huge field, and there’s no way you can master it all by reading a single book. The goal of this book is to give you a solid foundation in the most important tools, and enough knowledge to find the resources to learn more when necessary. Our model of the tools needed in a typical data science project looks something like <a href="#fig-ds-diagram" data-type="xref">#fig-ds-diagram</a>.</p>
<div class="cell">
<div class="cell-output-display">
<figure id="fig-ds-diagram"><p><img src="diagrams/data-science/base.png" alt="A diagram displaying the data science cycle: Import -&gt; Tidy -&gt; Understand (which has the phases Transform -&gt; Visualize -&gt; Model in a cycle) -&gt; Communicate. Surrounding all of these is Program." width="535"/></p>
<figcaption>In our model of the data science process you start with data import and tidying. Next you understand your data with an iterative cycle of transforming, visualizing, and modeling. You finish the process by communicating your results to other humans.</figcaption>
</figure>
</div>
</div>
<p>First you must <strong>import</strong> your data into R. This typically means that you take data stored in a file, database, or web application programming interface (API), and load it into a data frame in R. If you can’t get your data into R, you can’t do data science on it!</p>
<p>Once you’ve imported your data, it is a good idea to <strong>tidy</strong> it. Tidying your data means storing it in a consistent form that matches the semantics of the dataset with the way it is stored. In brief, when your data is tidy, each column is a variable, and each row is an observation. Tidy data is important because the consistent structure lets you focus your efforts on answering questions about the data, not fighting to get the data into the right form for different functions.</p>
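<p>As a rough illustration of the column-is-a-variable, row-is-an-observation rule, the sketch below rebuilds a small untidy table in tidy form. The data and the names <code>untidy</code> and <code>tidy</code> are invented for this example, and it uses only base R so that it runs before any packages are installed; it is not the tidying workflow taught later in the book.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r"># Hypothetical measurements, invented for illustration.
# Untidy: the year variable is hidden in the column names.
untidy &lt;- data.frame(
  country = c("A", "B"),
  cases_1999 = c(10, 20),
  cases_2000 = c(15, 25)
)

# Tidy: each column is a variable (country, year, cases),
# and each row is one observation (one country in one year).
tidy &lt;- data.frame(
  country = rep(untidy$country, times = 2),
  year = rep(c(1999, 2000), each = 2),
  cases = c(untidy$cases_1999, untidy$cases_2000)
)

tidy</pre>
</div>
<p>The tidy version has four rows, one per country-year pair, so a question like “how many cases in 2000?” becomes a simple condition on the <code>year</code> column.</p>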
<p>Once you have tidy data, a common next step is to <strong>transform</strong> it. Transformation includes narrowing in on observations of interest (like all people in one city, or all data from the last year), creating new variables that are functions of existing variables (like computing speed from distance and time), and calculating a set of summary statistics (like counts or means). Together, tidying and transforming are called <strong>wrangling</strong>, because getting your data in a form that’s natural to work with often feels like a fight!</p>
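<p>The three kinds of transformation described above can be sketched with dplyr, one of the tidyverse packages. This is a minimal sketch that assumes dplyr is already installed; the <code>trips</code> data and its column names are hypothetical.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r">library(dplyr)

# Hypothetical trip data, invented for illustration
trips &lt;- tibble(
  city = c("Austin", "Austin", "Boston"),
  distance_km = c(10, 30, 12),
  time_h = c(0.5, 1, 0.4)
)

austin_summary &lt;- trips |&gt;
  # Narrow in on the observations of interest
  filter(city == "Austin") |&gt;
  # Create a new variable from existing variables
  mutate(speed = distance_km / time_h) |&gt;
  # Collapse to summary statistics: a count and a mean
  summarize(n = n(), mean_speed = mean(speed))

austin_summary</pre>
</div>
<p>The result is a one-row table: two Austin trips with speeds of 20 and 30 km/h, so the mean speed is 25.</p>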
<p>Once you have tidy data with the variables you need, there are two main engines of knowledge generation: visualisation and modelling. These have complementary strengths and weaknesses so any real analysis will iterate between them many times.</p>
<p><strong>Visualisation</strong> is a fundamentally human activity. A good visualisation will show you things that you did not expect, or raise new questions about the data. A good visualisation might also hint that you’re asking the wrong question, or that you need to collect different data. Visualisations can surprise you, but they don’t scale particularly well because they require a human to interpret them.</p>
<p>The last step of data science is <strong>communication</strong>, an absolutely critical part of any data analysis project. It doesn’t matter how well your models and visualisation have led you to understand the data unless you can also communicate your results to others.</p>
<p>Surrounding all these tools is <strong>programming</strong>. Programming is a cross-cutting tool that you use in nearly every part of a data science project. You don’t need to be an expert programmer to be a successful data scientist, but learning more about programming pays off, because becoming a better programmer allows you to automate common tasks, and solve new problems with greater ease.</p>
<p>You’ll use these tools in every data science project, but for most projects they’re not enough. There’s a rough 80-20 rule at play; you can tackle about 80% of every project using the tools that you’ll learn in this book, but you’ll need other tools to tackle the remaining 20%. Throughout this book, we’ll point you to resources where you can learn more.</p>
</section>
<section id="how-this-book-is-organised" data-type="sect1">
<h1>
How this book is organised</h1>
<p>The previous description of the tools of data science is organised roughly according to the order in which you use them in an analysis (although of course you’ll iterate through them multiple times). In our experience, however, learning data ingest and tidying first is sub-optimal, because 80% of the time it’s routine and boring, and the other 20% of the time it’s weird and frustrating. That’s a bad place to start learning a new subject! Instead, we’ll start with visualisation and transformation of data that’s already been imported and tidied. That way, when you ingest and tidy your own data, your motivation will stay high because you know the pain is worth the effort.</p>
<p>Within each chapter, we try to adhere to a similar pattern: start with some motivating examples so you can see the bigger picture, and then dive into the details. Each section of the book is paired with exercises to help you practice what you’ve learned. Although it can be tempting to skip the exercises, there’s no better way to learn than practicing on real problems.</p>
</section>
<section id="what-you-wont-learn" data-type="sect1">
<h1>
What you won’t learn</h1>
<p>There are a number of important topics that this book doesn’t cover. We believe it’s important to stay ruthlessly focused on the essentials so you can get up and running as quickly as possible. That means this book can’t cover every important topic.</p>
<section id="modeling" data-type="sect2">
<h2>
Modeling</h2>
<!--# TO DO: Say a few sentences about modelling. -->
<p>To learn more about modeling, we highly recommend <a href="https://www.tmwr.org">Tidy Modeling with R</a>, by our colleagues Max Kuhn and Julia Silge. This book will teach you the tidymodels family of packages, which, as you might guess from the name, share many conventions with the tidyverse packages we use in this book.</p>
</section>
<section id="big-data" data-type="sect2">
<h2>
Big data</h2>
<p>This book proudly focuses on small, in-memory datasets. This is the right place to start because you can’t tackle big data unless you have experience with small data. The tools you learn in this book will easily handle hundreds of megabytes of data, and with a little care, you can typically use them to work with 1-2 GB of data. If you’re routinely working with larger data (10-100 GB, say), you should learn more about <a href="https://github.com/Rdatatable/data.table">data.table</a>. This book doesn’t teach data.table because it has a very concise interface that offers fewer linguistic cues, which makes it harder to learn. However, if you’re working with large data, the performance payoff is well worth the effort required to learn it.</p>
<p>If your data is bigger than this, carefully consider whether your big data problem is actually a small data problem in disguise. While the complete dataset might be big, often the data needed to answer a specific question is small. You might be able to find a subset, subsample, or summary that fits in memory and still allows you to answer the question that you’re interested in. The challenge here is finding the right small data, which often requires a lot of iteration.</p>
<p>Another possibility is that your big data problem is actually a large number of small data problems in disguise. Each individual problem might fit in memory, but you have millions of them. For example, you might want to fit a model to each person in your dataset. This would be trivial if you had just 10 or 100 people, but instead you have a million. Fortunately, each problem is independent of the others (a setup that is sometimes called embarrassingly parallel), so you just need a system (like <a href="https://hadoop.apache.org/">Hadoop</a> or <a href="https://spark.apache.org/">Spark</a>) that allows you to send different datasets to different computers for processing. Once you’ve figured out how to answer your question for a single subset using the tools described in this book, you can learn new tools like <strong>sparklyr</strong> to solve it for the full dataset.</p>
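<p>To make the “many small problems” idea concrete, here is a minimal sketch in base R that fits one linear model per person. The <code>measurements</code> data is invented for illustration, and the pieces are processed one after another on a single machine; a system like Spark would instead hand each piece to a different worker, which this sketch does not attempt to show.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r"># Hypothetical data: three people, four measurements each
measurements &lt;- data.frame(
  person = rep(c("a", "b", "c"), each = 4),
  x = rep(1:4, times = 3),
  y = c(2, 4, 6, 8, 1, 2, 3, 4, 3, 6, 9, 12)
)

# Split the big problem into one small data frame per person
by_person &lt;- split(measurements, measurements$person)

# Solve each small problem independently: fit y ~ x and keep the slope.
# No piece depends on any other, so the pieces could run in parallel.
slopes &lt;- vapply(
  by_person,
  function(d) coef(lm(y ~ x, data = d))[["x"]],
  numeric(1)
)

slopes</pre>
</div>
<p>Because the function applied to each piece never looks at the other pieces, the same function could later be reused unchanged when the pieces are distributed across machines.</p>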
</section>
<section id="python-julia-and-friends" data-type="sect2">
<h2>
Python, Julia, and friends</h2>
<p>In this book, you won’t learn anything about Python, Julia, or any other programming language useful for data science. This isn’t because we think these tools are bad. They’re not! And in practice, most data science teams use a mix of languages, often at least R and Python.</p>
<p>However, we strongly believe that it’s best to master one tool at a time. You will get better faster if you dive deep, rather than spreading yourself thinly over many topics. This doesn’t mean you should only know one thing, just that you’ll generally learn faster if you stick to one thing at a time. You should strive to learn new things throughout your career, but make sure your understanding is solid before you move on to the next interesting thing.</p>
<p>We think R is a great place to start your data science journey because it is an environment designed from the ground up to support data science. R is not just a programming language; it is also an interactive environment for doing data science. To support interaction, R is a much more flexible language than many of its peers. This flexibility comes with its downsides, but the big upside is how easy it is to evolve tailored grammars for specific parts of the data science process. These mini languages help you think about problems as a data scientist, while supporting fluent interaction between your brain and the computer.</p>
</section>
</section>
<section id="prerequisites" data-type="sect1">
<h1>
Prerequisites</h1>
<p>We’ve made a few assumptions about what you already know in order to get the most out of this book. You should be generally numerically literate, and it’s helpful if you have some programming experience already. If you’ve never programmed before, you might find <a href="https://rstudio-education.github.io/hopr/">Hands-On Programming with R</a> by Garrett to be a useful adjunct to this book.</p>
<p>There are four things you need to run the code in this book: R, RStudio, a collection of R packages called the <strong>tidyverse</strong>, and a handful of other packages. Packages are the fundamental units of reproducible R code. They include reusable functions, the documentation that describes how to use them, and sample data.</p>
<section id="r" data-type="sect2">
<h2>
R</h2>
<p>To download R, go to CRAN, the <strong>c</strong>omprehensive <strong>R</strong> <strong>a</strong>rchive <strong>n</strong>etwork. CRAN is composed of a set of mirror servers distributed around the world and is used to distribute R and R packages. Don’t try to pick a mirror that’s close to you: instead use the cloud mirror, <a href="https://cloud.r-project.org" class="uri">https://cloud.r-project.org</a>, which automatically figures it out for you.</p>
<p>A new major version of R comes out once a year, and there are 2-3 minor releases each year. It’s a good idea to update regularly. Upgrading can be a bit of a hassle, especially for major versions, which require you to re-install all your packages, but putting it off only makes it worse. You’ll need at least R 4.1.0 for this book.</p>
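<p>If you already have R installed and are not sure which version it is, one way to check is from the R console itself. The snippet below uses only functions that ship with base R:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r"># Show the version of R that is currently running
R.version.string

# TRUE if this R is new enough for the book (4.1.0 or later)
getRversion() &gt;= "4.1.0"</pre>
</div>
<p>If the second line returns <code>FALSE</code>, install a newer release from CRAN before continuing.</p>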
</section>
<section id="rstudio" data-type="sect2">
<h2>
RStudio</h2>
<p>RStudio is an integrated development environment, or IDE, for R programming. Download and install it from <a href="https://www.rstudio.com/download" class="uri">https://www.rstudio.com/download</a>. RStudio is updated a couple of times a year. When a new version is available, RStudio will let you know. It’s a good idea to upgrade regularly so you can take advantage of the latest and greatest features. For this book, make sure you have at least RStudio 2022.02.0.</p>
<p>When you start RStudio, <a href="#fig-rstudio-console" data-type="xref">#fig-rstudio-console</a>, you’ll see two key regions in the interface: the console pane, and the output pane. For now, all you need to know is that you type R code in the console pane, and press enter to run it. You’ll learn more as we go along!</p>
<div class="cell">
<div class="cell-output-display">
<figure id="fig-rstudio-console"><p><img src="diagrams/rstudio/console.png" alt="The RStudio IDE with the panes Console and Output highlighted." width="520"/></p>
<figcaption>The RStudio IDE has two key regions: type R code in the console pane on the left, and look for plots in the output pane on the right.</figcaption>
</figure>
</div>
</div>
</section>
<section id="the-tidyverse" data-type="sect2">
<h2>
The tidyverse</h2>
<p>You’ll also need to install some R packages. An R <strong>package</strong> is a collection of functions, data, and documentation that extends the capabilities of base R. Using packages is key to the successful use of R. The majority of the packages that you will learn in this book are part of the so-called tidyverse. All packages in the tidyverse share a common philosophy of data and R programming, and are designed to work together naturally.</p>
<p>You can install the complete tidyverse with a single line of code:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r">install.packages("tidyverse")</pre>
</div>
<p>On your own computer, type that line of code in the console, and then press enter to run it. R will download the packages from CRAN and install them onto your computer. If you have problems installing, make sure that you are connected to the internet, and that <a href="https://cloud.r-project.org/" class="uri">https://cloud.r-project.org/</a> isn’t blocked by your firewall or proxy.</p>
<p>You will not be able to use the functions, objects, or help files in a package until you load it with <code><a href="https://rdrr.io/r/base/library.html">library()</a></code>. Once you have installed a package, you can load it using the <code><a href="https://rdrr.io/r/base/library.html">library()</a></code> function:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r">library(tidyverse)
#&gt; ── Attaching packages ──────────────────────────────────── tidyverse 1.3.2 ──
#&gt; ✔ ggplot2 3.4.0.9000 ✔ purrr 0.9000.0.9000
#&gt; ✔ tibble 3.1.8 ✔ dplyr 1.0.99.9000
#&gt; ✔ tidyr 1.2.1.9001 ✔ stringr 1.4.1.9000
#&gt; ✔ readr 2.1.3 ✔ forcats 0.5.2
#&gt; ── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
#&gt; ✖ dplyr::filter() masks stats::filter()
#&gt; ✖ dplyr::lag() masks stats::lag()</pre>
</div>
<p>This tells you that tidyverse is loading eight packages: ggplot2, tibble, tidyr, readr, purrr, dplyr, stringr, and forcats. These are considered to be the <strong>core</strong> of the tidyverse because you’ll use them in almost every analysis.</p>
<p>Packages in the tidyverse change fairly frequently. You can check whether updates are available, and optionally install them, by running <code><a href="https://tidyverse.tidyverse.org/reference/tidyverse_update.html">tidyverse_update()</a></code>.</p>
</section>
<section id="other-packages" data-type="sect2">
<h2>
Other packages</h2>
<p>There are many other excellent packages that are not part of the tidyverse, because they solve problems in a different domain, or are designed with a different set of underlying principles. This doesn’t make them better or worse, just different. In other words, the complement to the tidyverse is not the messyverse, but many other universes of interrelated packages. As you tackle more data science projects with R, you’ll learn new packages and new ways of thinking about data.</p>
<p>In this book we’ll use three data packages from outside the tidyverse:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r">install.packages(c("nycflights13", "gapminder", "Lahman"))</pre>
</div>
<p>These packages provide data on airline flights, world development, and baseball that we’ll use to illustrate key data science ideas.</p>
</section>
</section>
<section id="running-r-code" data-type="sect1">
<h1>
Running R code</h1>
<p>The previous section showed you several examples of running R code. Code in the book looks like this:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="r">1 + 2
#&gt; [1] 3</pre>
</div>
<p>If you run the same code in your local console, it will look like this:</p>
<pre><code>&gt; 1 + 2
[1] 3</code></pre>
<p>There are two main differences. In your console, you type after the <code>&gt;</code>, called the <strong>prompt</strong>; we don’t show the prompt in the book. In the book, output is commented out with <code>#&gt;</code>; in your console it appears directly after your code. These two differences mean that if you’re working with an electronic version of the book, you can easily copy code out of the book and into the console.</p>
<p>Throughout the book, we use a consistent set of conventions to refer to code:</p>
<ul><li><p>Functions are displayed in a code font and followed by parentheses, like <code><a href="https://rdrr.io/r/base/sum.html">sum()</a></code>, or <code><a href="https://rdrr.io/r/base/mean.html">mean()</a></code>.</p></li>
<li><p>Other R objects (such as data or function arguments) are in a code font, without parentheses, like <code>flights</code> or <code>x</code>.</p></li>
<li><p>Sometimes, to make it clear which package an object comes from, we’ll use the package name followed by two colons, like <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">dplyr::mutate()</a></code>, or<br/><code><a href="https://rdrr.io/pkg/nycflights13/man/flights.html">nycflights13::flights</a></code>. This is also valid R code.</p></li>
</ul></section>
<section id="acknowledgements" data-type="sect1">
<h1>
Acknowledgements</h1>
<p>This book isn’t just the product of Hadley, Mine, and Garrett, but is the result of many conversations (in person and online) that we’ve had with many people in the R community. There are a few people we’d like to thank in particular, because they have spent many hours answering our questions and helping us to better think about data science:</p>
<ul><li><p>Jenny Bryan and Lionel Henry for many helpful discussions around working with lists and list-columns.</p></li>
<li><p>The three chapters on workflow were adapted (with permission) from <a href="https://stat545.com/block002_hello-r-workspace-wd-project.html" class="uri">https://stat545.com/block002_hello-r-workspace-wd-project.html</a> by Jenny Bryan.</p></li>
<li><p>Yihui Xie for his work on the <a href="https://github.com/rstudio/bookdown">bookdown</a> package, and for tirelessly responding to my feature requests.</p></li>
<li><p>Bill Behrman for his thoughtful reading of the entire book, and for trying it out with his data science class at Stanford.</p></li>
<li><p>The #rstats Twitter community who reviewed all of the draft chapters and provided tons of useful feedback.</p></li>
</ul><p>This book was written in the open, and many people contributed pull requests to fix minor problems. Special thanks go to everyone who contributed via GitHub:</p>
<div class="cell">
</div>
<p>A big thank you to all 212 people who contributed specific improvements via GitHub pull requests (in alphabetical order by username): Alex (@ALShum), A. s. (@Adrianzo), @AlanFeder, Antti Rask (@AnttiRask), Oluwafemi OYEDELE (@BB1464), Brian G. Barkley (@BarkleyBG), Bianca Peterson (@BinxiePeterson), Birger Niklas (@BirgerNi), David Clark (@DDClark), @DSGeoff, Edwin Thoen (@EdwinTh), Eric Kitaif (@EricKit), Gerome Meyer (@GeroVanMi), Josh Goldberg (@GoldbergData), Iain (@Iain-S), Jeffrey Stevens (@JeffreyRStevens), 蒋雨蒙 (@JeldorPKU), @MJMarshall, Kara de la Marck (@MarckK), Matt Wittbrodt (@MattWittbrodt), Jakub Nowosad (@Nowosad), Y. Yu (@PursuitOfDataScience), Jajo (@RIngyao), Richard Knight (@RJHKnight), Ranae Dietzel (@Ranae), @ReeceGoding, Robin (@Robinlovelace), Rod Mazloomi (@RodAli), Rohan Alexander (@RohanAlexander), Romero Morais (@RomeroBarata), Shannon Ellis (@ShanEllis), Christian Heinrich (@Shurakai), Steven M. Mortimer (@StevenMMortimer), @a-rosenberg, Tim Becker (@a2800276), Adam Gruer (@adam-gruer), adi pradhan (@adidoit), Andrea Gilardi (@agila5), Ajay Deonarine (@ajay-d), @aleloi, pete (@alonzi), Andrew M. (@amacfarland), Andrew Landgraf (@andland), Angela Li (@angela-li), @ariespirgel, @august-18, Michael Henry (@aviast), Azza Ahmed (@azzaea), Steven Moran (@bambooforest), Mara Averick (@batpigandme), Brent Brewington (@bbrewington), Bill Behrman (@behrman), Ben Herbertson (@benherbertson), Ben Marwick (@benmarwick), Ben Steinberg (@bensteinberg), Benjamin Yeh (@bentyeh), Betul Turkoglu (@betulturkoglu), Brandon Greenwell (@bgreenwell), Brett Klamer (@bklamer), @boardtc, Christian (@c-hoh), Camille V Leonard (@camillevleonard), Christian Mongeau (@chrMongeau), Cooper Morris (@coopermor), Colin Gillespie (@csgillespie), Rademeyer Vermaak (@csrvermaak), Chris Saunders (@ctsa), Abhinav Singh (@curious-abhinav), Curtis Alexander (@curtisalexander), Christian G. 
Warden (@cwarden), Charlotte Wickham (@cwickham), Kenny Darrell (@darrkj), David Rubinger (@davidrubinger), Derwin McGeary (@derwinmcgeary), Daniel Gromer (@dgromer), @djbirke, Zhuoer Dong (@dongzhuoer), Devin Pastoor (@dpastoor), Julian During (@duju211), Dylan Cashman (@dylancashman), Dirk Eddelbuettel (@eddelbuettel), Ahmed El-Gabbas (@elgabbas), Henry Webel (@enryH), Eric Watt (@ericwatt), Erik Erhardt (@erikerhardt), Etienne B. Racine (@etiennebr), Everett Robinson (@evjrob), @fellennert, Flemming Miguel (@flemmingmiguel), Floris Vanderhaeghe (@florisvdh), @funkybluehen, @gabrivera, Garrick Aden-Buie (@gadenbuie), bahadir cankardes (@gridgrad), Gustav W Delius (@gustavdelius), Hao Chen (@hao-trivago), Harris McGehee (@harrismcgehee), @hendrikweisser, Hengni Cai (@hengnicai), Ian Sealy (@iansealy), Ian Lyttle (@ijlyttle), Ivan Krukov (@ivan-krukov), Jacob Kaplan (@jacobkap), Jazz Weisman (@jazzlw), John Blischak (@jdblischak), John D. Storey (@jdstorey), Jeff Boichuk (@jeffboichuk), Gregory Jefferis (@jefferis), Jennifer (Jenny) Bryan (@jennybc), Jen Ren (@jenren), Jeroen Janssens (@jeroenjanssens), Janet Wesner (@jilmun), Jim Hester (@jimhester), JJ Chen (@jjchern), Jacek Kolacz (@jkolacz), Joanne Jang (@joannejang), John Sears (@johnsears), @jonathanflint, Jon Calder (@jonmcalder), Jonathan Page (@jonpage), JooYoung Seo (@jooyoungseo), Justinas Petuchovas (@jpetuchovas), Jordan (@jrdnbradford), Jeffrey Arnold (@jrnold), Jose Roberto Ayala Solares (@jroberayalas), @juandering, Julia Stewart Lowndes (@jules32), Sonja (@kaetschap), Kara Woo (@karawoo), Katrin Leinweber (@katrinleinweber), Karandeep Singh (@kdpsingh), Kevin Perese (@kevinxperese), Kevin Ferris (@kferris10), Kirill Sevastyanenko (@kirillseva), @koalabearski, Kirill Müller (@krlmlr), Rafał Kucharski (@kucharsky), Noah Landesberg (@landesbergn), Lawrence Wu (@lawwu), @lindbrook, Luke W Johnston (@lwjohnst86), Kunal Marwaha (@marwahaha), Matan Hakim (@matanhakim), Mauro Lepore (@maurolepore), Mark 
Beveridge (@mbeveridge), @mcewenkhundi, Matt Herman (@mfherman), Michael Boerman (@michaelboerman), Mitsuo Shiota (@mitsuoxv), Matthew Hendrickson (@mjhendrickson), Mohammed Hamdy (@mmhamdy), Maxim Nazarov (@mnazarov), Maria Paula Caldas (@mpaulacaldas), Mustafa Ascha (@mustafaascha), Nelson Areal (@nareal), Nate Olson (@nate-d-olson), Nathanael (@nateaff), @nattalides, Nick Clark (@nickclark1000), @nickelas, Nirmal Patel (@nirmalpatel), Nischal Shrestha (@nischalshrestha), Nicholas Tierney (@njtierney), @olivier6088, Pablo E. Garcia (@pabloedug), Paul Adamson (@padamson), Peter Hurford (@peterhurford), Patrick Kennedy (@pkq), Pooya Taherkhani (@pooyataher), Radu Grosu (@radugrosu), Rayna M Harris (@raynamharris), Robin Gertenbach (@rgertenbach), Riva Quiroga (@rivaquiroga), Richard Zijdeman (@rlzijdeman), @robertchu03, Emily Robinson (@robinsones), Rob Tenorio (@robtenorio), Albert Y. Kim (@rudeboybert), Saghir (@saghirb), Hojjat Salmasian (@salmasian), Jonas (@sauercrowd), Vebash Naidoo (@sciencificity), Seamus McKinsey (@seamus-mckinsey), @seanpwilliams, Luke Smith (@seasmith), Matthew Sedaghatfar (@sedaghatfar), Sebastian Kraus (@sekR4), Sam Firke (@sfirke), @shoili, Sbusiso Mkhondwane (@sibusiso16), Jakob Krigovsky (@sonicdoe), Stéphane Guillou (@stragu), Sergiusz Bleja (@svenski), Tal Galili (@talgalili), Tim Broderick (@timbroderick), Tim Waterhouse (@timwaterhouse), TJ Mahr (@tjmahr), Thomas Klebel (@tklebel), Tom Prior (@tomjamesprior), Terence Teo (@tteo), @twgardner2, Ulrik Lyngs (@ulyngs), Martin Van der Linden (@vanderlindenma), Walter Somerville (@waltersom), Will Beasley (@wibeasley), Yihui Xie (@yihui), Yiming (Paul) Li (@yimingli), Hiroaki Yutani (@yutannihilation), Yu Yu Aung (@yuyu-aung), Zach Bogart (@zachbogart), @zeal626, Zeki Akyol (@zekiakyol).</p>
</section>
<section id="colophon" data-type="sect1">
<h1>
Colophon</h1>
<p>An online version of this book is available at <a href="https://r4ds.hadley.nz" class="uri">https://r4ds.hadley.nz</a>. It will continue to evolve in between reprints of the physical book. The source of the book is available at <a href="https://github.com/hadley/r4ds" class="uri">https://github.com/hadley/r4ds</a>. The book is powered by <a href="https://quarto.org">Quarto</a>, which makes it easy to write books that combine text and executable code.</p>
<p>This book was built with:</p>
<div class="cell-output-display">
<table class="table"><colgroup><col style="width: 14%"/><col style="width: 14%"/><col style="width: 71%"/></colgroup><thead><tr class="header"><th style="text-align: left;">package</th>
<th style="text-align: left;">version</th>
<th style="text-align: left;">source</th>
</tr></thead><tbody><tr class="odd"><td style="text-align: left;">broom</td>
<td style="text-align: left;">1.0.1</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">cli</td>
<td style="text-align: left;">3.4.1</td>
<td style="text-align: left;">CRAN (R 4.2.1)</td>
</tr><tr class="odd"><td style="text-align: left;">crayon</td>
<td style="text-align: left;">1.5.2</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">dbplyr</td>
<td style="text-align: left;">2.2.1.9000</td>
<td style="text-align: left;">Github (tidyverse/dbplyr@f7b5596f6125011ab0dcd4eccbfe56c5294214da)</td>
</tr><tr class="odd"><td style="text-align: left;">dplyr</td>
<td style="text-align: left;">1.0.99.9000</td>
<td style="text-align: left;">local</td>
</tr><tr class="even"><td style="text-align: left;">dtplyr</td>
<td style="text-align: left;">1.2.2</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">forcats</td>
<td style="text-align: left;">0.5.2</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">ggplot2</td>
<td style="text-align: left;">3.4.0.9000</td>
<td style="text-align: left;">Github (tidyverse/ggplot2@4fea51b1eb2cdacebeacf425627dcbc1d61a5d3e)</td>
</tr><tr class="odd"><td style="text-align: left;">googledrive</td>
<td style="text-align: left;">2.0.0</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">googlesheets4</td>
<td style="text-align: left;">1.0.1</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">haven</td>
<td style="text-align: left;">2.5.1</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">hms</td>
<td style="text-align: left;">1.1.2</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">httr</td>
<td style="text-align: left;">1.4.4</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">jsonlite</td>
<td style="text-align: left;">1.8.3</td>
<td style="text-align: left;">CRAN (R 4.2.1)</td>
</tr><tr class="odd"><td style="text-align: left;">lubridate</td>
<td style="text-align: left;">1.9.0</td>
<td style="text-align: left;">CRAN (R 4.2.1)</td>
</tr><tr class="even"><td style="text-align: left;">magrittr</td>
<td style="text-align: left;">2.0.3</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">modelr</td>
<td style="text-align: left;">0.1.9</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">pillar</td>
<td style="text-align: left;">1.8.1</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">purrr</td>
<td style="text-align: left;">0.9000.0.9000</td>
<td style="text-align: left;">Github (tidyverse/purrr@aaaa58a571cc449dbcc4348e77e589b373e1e059)</td>
</tr><tr class="even"><td style="text-align: left;">readr</td>
<td style="text-align: left;">2.1.3</td>
<td style="text-align: left;">CRAN (R 4.2.1)</td>
</tr><tr class="odd"><td style="text-align: left;">readxl</td>
<td style="text-align: left;">1.4.1</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">reprex</td>
<td style="text-align: left;">2.0.2</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">rlang</td>
<td style="text-align: left;">1.0.6</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">rstudioapi</td>
<td style="text-align: left;">0.14</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="odd"><td style="text-align: left;">rvest</td>
<td style="text-align: left;">1.0.3</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">stringr</td>
<td style="text-align: left;">1.4.1.9000</td>
<td style="text-align: left;">Github (tidyverse/stringr@ebf38238cbb80bf0e852d5d8d056c04e36d7c20c)</td>
</tr><tr class="odd"><td style="text-align: left;">tibble</td>
<td style="text-align: left;">3.1.8</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">tidyr</td>
<td style="text-align: left;">1.2.1.9001</td>
<td style="text-align: left;">Github (tidyverse/tidyr@91747952f10c961be747c0de1026d109c920e4fc)</td>
</tr><tr class="odd"><td style="text-align: left;">tidyverse</td>
<td style="text-align: left;">1.3.2</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr><tr class="even"><td style="text-align: left;">xml2</td>
<td style="text-align: left;">1.3.3</td>
<td style="text-align: left;">CRAN (R 4.2.0)</td>
</tr></tbody></table></div>
<div class="cell">
<pre data-type="programlisting" data-code-language="r">cli:::ruler()
#&gt; ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+--
#&gt; 12345678901234567890123456789012345678901234567890123456789012345678901234567</pre>
</div>
</section>
</section>