From d67c997d218bb6af844c8e3ffe353edad16db15f Mon Sep 17 00:00:00 2001
From: S'busiso Mkhondwane
Date: Sat, 13 Aug 2016 16:02:48 +0200
Subject: [PATCH] Update import.Rmd (#251)

Typo
---
 import.Rmd | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/import.Rmd b/import.Rmd
index 62594b1..353a16a 100644
--- a/import.Rmd
+++ b/import.Rmd
@@ -250,7 +250,7 @@ charToRaw("Hadley")

 Each hexadecimal number represents a byte of information: `48` is H, `61` is a, and so on. The mapping from hexadecimal number to character is called the encoding, and in this case the encoding is called ASCII. ASCII does a great job of representing English characters, because it's the __American__ Standard Code for Information Interchange.

-Things get more complicated for languages other than English. In the early days of computing there were many competing standards for encoding non-English characters, and to correctly interpret a string you need to know both the values and the encoding. For example, two common encodings are Latin1 (aka ISO-8859-1, used for Western European languages) and Latin2 (aka ISO-8859-2, used for Eastern European languages). In Latin1, the byte `b1` is "±", but in Latin2, it's "ą"! Fortunately, today there is one standard that is supported almost everywhere: UTF-8. UTF-8 can encode just about every character used by humans today, as well as many extra symbols (like emoji!).
+Things get more complicated for languages other than English. In the early days of computing there were many competing standards for encoding non-English characters, and to correctly interpret a string you needed to know both the values and the encoding. For example, two common encodings are Latin1 (aka ISO-8859-1, used for Western European languages) and Latin2 (aka ISO-8859-2, used for Eastern European languages). In Latin1, the byte `b1` is "±", but in Latin2, it's "ą"! Fortunately, today there is one standard that is supported almost everywhere: UTF-8. UTF-8 can encode just about every character used by humans today, as well as many extra symbols (like emoji!).

 readr uses UTF-8 everywhere: it assumes your data is UTF-8 encoded when you read it, and always uses it when writing. This is a good default, but will fail for data produced by older systems that don't understand UTF-8. If this happens to you, your strings will look weird when you print them. Sometimes just one or two characters might be messed up; other times you'll get complete gibberish. For example:

@@ -340,7 +340,7 @@ Time
 : `%M` minutes.
 : `%S` integer seconds.
 : `%OS` real seconds.
-: `%Z` Time zone (as name, e.g. `America/Chicago`). Beware abbreviations:
+: `%Z` Time zone (as name, e.g. `America/Chicago`). Beware of abbreviations:
 if you're American, note that "EST" is a Canadian time zone that does not have daylight savings time. It is \emph{not} Eastern Standard Time! We'll come back to this [time zones].
@@ -628,6 +628,6 @@ To get other types of data into R, we recommend starting with the tidyverse pack
 __RSQLite__, __RPostgreSQL__ etc) allows you to run SQL queries against a database and return a data frame.

-For hierarchical data: use __jsonlite__ (by Jeroen Ooms) for json, and __xml2__ for XML. whichYou will need to convert them to data frames using the tools on [handling hierarchy].
+For hierarchical data: use __jsonlite__ (by Jeroen Ooms) for json, and __xml2__ for XML. You will need to convert them to data frames using the tools on [handling hierarchy].

 For other file types, try the [R data import/export manual](https://cran.r-project.org/doc/manuals/r-release/R-data.html) and the [__rio__](https://github.com/leeper/rio) package.