<div data-type="part">
<h1><span id="sec-program-intro" class="quarto-section-identifier d-none d-lg-block">Program</span></h1><p>In this part of the book, you'll improve your programming skills. Programming is a cross-cutting skill needed for all data science work: you must use a computer to do data science; you cannot do it in your head, or with pencil and paper.</p><div class="cell">
<div class="cell-output-display">
<figure id="fig-ds-program"><p><img src="diagrams/data-science/program.png" alt="Our model of the data science process with program (import, tidy, transform, visualize, model, and communicate, i.e. everything) highlighted in blue." width="535"/></p>
<figcaption>Figure 1: Programming is the water in which all the other components swim.</figcaption>
</figure>
</div>
</div><p>Programming produces code, and code is a tool of communication. Obviously, code tells the computer what you want it to do. But it also communicates meaning to other humans. Thinking about code as a vehicle for communication is important because every project you do is fundamentally collaborative. Even if you're not working with other people, you'll definitely be working with future-you! Writing clear code is important so that others (like future-you) can understand why you tackled an analysis in the way you did. That means getting better at programming also involves getting better at communicating. Over time, you want your code to become not just easier to write, but easier for others to read.</p><p>In the following three chapters, you'll learn skills that will make you a better programmer:</p><ol type="1"><li><p>Copy-and-paste is a powerful tool, but you should avoid doing it more than twice. Repeating yourself in code is dangerous because it can easily lead to errors and inconsistencies. Instead, in <a href="#chp-functions" data-type="xref">#chp-functions</a>, you'll learn how to write <strong>functions</strong>, which let you extract repeated code so that it can be easily reused.</p></li>
<li><p>Functions extract repeated code, but you often need to repeat the same actions on different inputs. You need tools for <strong>iteration</strong> that let you do similar things again and again. These tools include for loops and functional programming, which you'll learn about in <a href="#chp-iteration" data-type="xref">#chp-iteration</a>.</p></li>
<li><p>As you read more code written by others, you'll see more code that doesn't use the tidyverse. In <a href="#chp-base-R" data-type="xref">#chp-base-R</a>, you'll learn some of the most important base R functions that you'll see in the wild.</p></li>
</ol><p>The goal of these chapters is to teach you the minimum about programming that you need for data science. Once you have mastered the material here, we strongly recommend that you continue to invest in your programming skills. We've written two books that you might find helpful. <a href="https://rstudio-education.github.io/hopr/"><em>Hands-On Programming with R</em></a>, by Garrett Grolemund, is an introduction to R as a programming language and is a great place to start if R is your first programming language. <a href="https://adv-r.hadley.nz/"><em>Advanced R</em></a>, by Hadley Wickham, dives into the details of R the programming language; it's a great place to start if you have existing programming experience and a great next step once you've internalized the ideas in these chapters.</p></div>