40 lines
3.7 KiB
Plaintext
40 lines
3.7 KiB
Plaintext
# Programming
|
|
|
|
Code is a tool of communication, not just to the computer, but to other people. This is important because every project you undertake is fundamentally collaborative. Even if you're not working with other people, you'll definitely be working with future-you. You want to write clear code so that future-you doesn't curse present-you when you look at a project again after several months have passed.
|
|
|
|
To me, improving your communication skills is a key part of mastering R as a programming language. Over time, you want your code to become more and more clear, and easier to write. In this part of the book, you'll learn three important skills that help you move in this direction:
|
|
|
|
1. We'll dive deep into the __pipe__, `%>%`, talking more about how it works
|
|
and how it gives you a new tool for rewriting your code. You'll also learn
|
|
about when not to use the pipe!
|
|
|
|
1. Repeating yourself in code is dangerous because it can easily lead to
|
|
errors and inconsistencies. We'll talk about how to write __functions__
|
|
in order to remove duplication in your logic.
|
|
|
|
1. Another important tool for removing duplication is the __for loop__ which
|
|
allows you to repeat the same action again and again and again. You tend to
|
|
use for-loops less often in R than in other programming languages because R
|
|
is a functional programming language which means that you can extract out
|
|
common patterns of for loops and put them in a function. We'll come back to
|
|
that idea in XYZ.
|
|
|
|
Removing duplication is an important part of expressing yourself clearly because it lets the reader (i.e. future-you!) focus on what's different between operations rather than what's the same. The goal is not just to write better functions or to do things that you couldn't do before, but to code with more "ease". As you internalise the ideas in this chapter, you should find it easier to re-tackle problems that you've solved in the past with much effort.
|
|
|
|
Writing code is similar in many ways to writing prose. One parallel which I find particularly useful is that in both cases rewriting is key to clarity. The first expression of your ideas is unlikely to be particularly clear, and you may need to rewrite multiple times. After solving a data analysis challenge, it's often worth looking at your code and thinking about whether or not it's obvious what you've done. If you spend a little time rewriting your code while the ideas are fresh, you can save a lot of time later trying to recreate what your code did. But this doesn't mean you should rewrite every function: you need to balance what you need to achieve now with saving time in the long run. (But the more you rewrite your functions the more likely you'll first attempt will be clear.)
|
|
|
|
## Learning more
|
|
|
|
As you become a better R programmer, you'll learn more techniques for reducing various types of duplication. This allows you to do more with less, and allows you to express yourself more clearly by taking advantage of powerful programming constructs.
|
|
|
|
To learn more you need to study R as a programming language, not just an interactive environment for data science. We have written two books that will help you do so:
|
|
|
|
* [Hands on programming with R](http://shop.oreilly.com/product/0636920028574.do),
|
|
by Garrett Grolemund. This is an introduction to R as a programming language
|
|
and is a great place to start if R is your first programming language.
|
|
|
|
* [Advanced R](http://adv-r.had.co.nz) by Hadley Wickham. This dives into the
|
|
details of R the programming language. This is a great place to start if
|
|
you've programmed in other languages and you want to learn what makes R
|
|
special, different, and particularly well suited to data analysis.
|