Update import.Rmd

typos
Radu Grosu 2016-01-27 13:33:35 +00:00
parent 8101753650
commit de44f2e3d6
1 changed files with 73 additions and 73 deletions

@@ -45,19 +45,19 @@ There are many ways to read flat files into R. If you've be using R for a while,
sometimes need to supply a few more arguments when using them the first
time, but they'll definitely work on other peoples computers. The base R
functions take a number of settings from system defaults, which means that
-code that works on your computer might not work on someone elses.
+code that works on your computer might not work on someone else's.
Make sure you have the readr package (`install.packages("readr")`).
Most of readr's functions are concerned with turning flat files into data frames:
-* `read_csv()` read comma delimited files, `read_csv2()` reads semi-colon
+* `read_csv()` reads comma delimited files, `read_csv2()` reads semi-colon
separated files (common in countries where `,` is used as the decimal place),
`read_tsv()` reads tab delimited files, and `read_delim()` reads in files
with a user supplied delimiter.
* `read_fwf()` reads fixed width files. You can specify fields either by their
-widths with `fwf_widths()` or theirs position with `fwf_positions()`.
+widths with `fwf_widths()` or their position with `fwf_positions()`.
`read_table()` reads a common variation of fixed width files where columns
are separated by white space.
@@ -73,7 +73,7 @@ readr also provides a number of functions for reading files off disk into simple
These might be useful for other programming tasks.
-As well as reading data frame disk, readr also provides tools for working with data frames and character vectors in R:
+As well as reading data from disk, readr also provides tools for working with data frames and character vectors in R:
* `type_convert()` applies the same parsing heuristics to the character columns
in a data frame. You can override its choices using `col_types`.
@@ -94,7 +94,7 @@ The first two arguments of `read_csv()` are:
* `TRUE` (the default), which reads column names from the first row
of the file
-* `FALSE` number columns sequentially from `X1` to `Xn`.
+* `FALSE` numbers columns sequentially from `X1` to `Xn`.
* A character vector, used as column names. If these don't match up
with the columns in the data, you'll get a warning message.
@@ -109,7 +109,7 @@ EXAMPLE
Typically, you'll see a lot of warnings if readr has guessed the column type incorrectly. This most often occurs when the first 1000 rows are different to the rest of the data. Perhaps there are a lot of missing data there, or maybe your data is mostly numeric but a few rows have characters. Fortunately, it's easy to fix these problems using the `col_type` argument.
-(Note that if you have a very large file, you might want to set `n_max` to 10,000 or 100,000. That will speed up iteration while you're finding common problems)
+(Note that if you have a very large file, you might want to set `n_max` to 10,000 or 100,000. That will speed up iterations while you're finding common problems)
Specifying the `col_type` looks like this:
@@ -128,7 +128,7 @@ You can use the following types of columns
* `col_number()` (n) is a more flexible parsed for numbers embedded in other
strings. It will look for the first number in a string, ignoring non-numeric
-prefixes and suffixes. It will also ignoring the grouping mark specified by
+prefixes and suffixes. It will also ignore the grouping mark specified by
the locale (see below for more details).
* `col_factor()` (f) allows you to load data directly into a factor if you know
@@ -139,7 +139,7 @@ You can use the following types of columns
* `col_date()` (D), `col_datetime()` (T) and `col_time()` (t) parse into dates,
date times, and times as described below.
-You might have noticed that each column parser has a one letter abbreviation, which you can instead of the full function call (assuming you're happy with the default arguments):
+You might have noticed that each column parser has a one letter abbreviation, which you can use instead of the full function call (assuming you're happy with the default arguments):
```{r, eval = FALSE}
read_csv("mypath.csv", col_types = cols(
@@ -203,7 +203,7 @@ If these defaults don't work for your data you can supply your own date time for
* AM/PM indicator: `%p`.
-* Non-digits: `%.` skips one non-digit charcter, `%*` skips any number of
+* Non-digits: `%.` skips one non-digit character, `%*` skips any number of
non-digits.
The best way to figure out the correct string is to create a few examples in a character vector, and test with one of the parsing functions. For example:
@@ -360,11 +360,11 @@ There are three key differences between tbl_dfs and data.frames:
You can control the default appearance with options:
-* `options(dplyr.print_max = n, dplyr.print_min = m)`: if more than `n`
+* `options(dplyr.print_max = n, dplyr.print_min = m)`: if more than `m`
rows print `m` rows. Use `options(dplyr.print_max = Inf)` to always
show all rows.
-* `options(dply.width = Inf)` will always print all columns, regardless
+* `options(dplyr.width = Inf)` will always print all columns, regardless
of the width of the screen.