<div data-type="part">
<h1><span id="sec-whole-game-intro" class="quarto-section-identifier d-none d-lg-block">Whole game</span></h1><p>Our goal in this part of the book is to give you a rapid overview of the main tools of data science: <strong>importing</strong>, <strong>tidying</strong>, <strong>transforming</strong>, and <strong>visualizing data</strong>, as shown in <a href="#fig-ds-whole-game" data-type="xref">#fig-ds-whole-game</a>. We want to show you the “whole game” of data science, giving you just enough of all the major pieces so that you can tackle real, if simple, data sets. The later parts of the book will cover each of these topics in more depth, increasing the range of data science challenges that you can tackle.</p><div class="cell">
<div class="cell-output-display">
<figure id="fig-ds-whole-game"><p><img src="diagrams/data-science/whole-game.png" alt="A diagram displaying the data science cycle: Import -&gt; Tidy -&gt; Understand (which has the phases Transform -&gt; Visualize -&gt; Model in a cycle) -&gt; Communicate. Surrounding all of these is Program. Import, Tidy, Transform, and Visualize are highlighted." width="535"/></p>
<figcaption>Figure 1: In this section of the book, you’ll learn how to import, tidy, transform, and visualize data.</figcaption>
</figure>
</div>
</div><p>Five chapters focus on the tools of data science:</p><ul><li><p>Visualization is a great place to start with R programming, because the payoff is so clear: you get to make elegant and informative plots that help you understand data. In <a href="#chp-data-visualize" data-type="xref">#chp-data-visualize</a> you’ll dive into visualization, learning the basic structure of a ggplot2 plot and powerful techniques for turning data into plots.</p></li>
<li><p>Visualization alone is typically not enough, so in <a href="#chp-data-transform" data-type="xref">#chp-data-transform</a>, you’ll learn the key verbs that allow you to select important variables, filter out key observations, create new variables, and compute summaries.</p></li>
<li><p>In <a href="#chp-data-tidy" data-type="xref">#chp-data-tidy</a>, you’ll learn about tidy data, a consistent way of storing your data that makes transformation, visualization, and modeling easier. You’ll learn the underlying principles, and how to get your data into a tidy form.</p></li>
<li><p>Before you can transform and visualize your data, you first need to get your data into R. In <a href="#chp-data-import" data-type="xref">#chp-data-import</a> you’ll learn the basics of getting <code>.csv</code> files into R.</p></li>
<li><p>Finally, in <a href="#chp-EDA" data-type="xref">#chp-EDA</a>, you’ll combine visualization and transformation with your curiosity and skepticism to ask and answer interesting questions about data.</p></li>
</ul><p>Nestled among these chapters are others that focus on your R workflow. In <a href="#chp-workflow-basics" data-type="xref">#chp-workflow-basics</a>, <a href="#chp-workflow-pipes" data-type="xref">#chp-workflow-pipes</a>, <a href="#chp-workflow-style" data-type="xref">#chp-workflow-style</a>, and <a href="#chp-workflow-scripts" data-type="xref">#chp-workflow-scripts</a>, you’ll learn good workflow practices for writing and organizing your R code. These will set you up for success in the long run, as they’ll give you the tools to stay organized when you tackle real projects.</p></div>