Hadley Wickham 2022-12-06 16:15:54 -06:00
parent ae9680ecd7
commit ae15acf2fc
1 changed files with 2 additions and 1 deletions

@@ -9,7 +9,8 @@ status("polishing")
This vignette introduces you to the basics of web scraping with [rvest](https://rvest.tidyverse.org).
Web scraping is a very useful tool for extracting data from web pages.
Some websites will offer an API, a set of structured HTTP requests that return data as JSON, which you handle using the techniques from @sec-rectangling. Where possible, you should use the API, because typically it will give you more reliable data.
Some websites will offer an API, a set of structured HTTP requests that return data as JSON, which you handle using the techniques from @sec-rectangling.
Where possible, you should use the API, because typically it will give you more reliable data.
Unfortunately, programming with web APIs is out of scope for this book, so we instead teach scraping, a technique that works whether or not a site provides an API.
In this chapter, we'll first discuss the ethics and legalities of scraping before we dive into the basics of HTML.