Eliminating repeated word (#1380)

alberto-agudo 2023-03-20 22:11:16 +01:00 committed by GitHub
parent 2ff5d75b4b
commit e119132cb4
1 changed file with 1 addition and 1 deletion

@@ -19,7 +19,7 @@ But CSV files aren't very efficient: you have to do quite a lot of work to read
 In this chapter, you'll learn about a powerful alternative: the [parquet format](https://parquet.apache.org/), an open standards-based format widely used by big data systems.
 We'll pair parquet files with [Apache Arrow](https://arrow.apache.org), a multi-language toolbox designed for efficient analysis and transport of large datasets.
-We'll use Apache Arrow via the the [arrow package](https://arrow.apache.org/docs/r/), which provides a dplyr backend allowing you to analyze larger-than-memory datasets using familiar dplyr syntax.
+We'll use Apache Arrow via the [arrow package](https://arrow.apache.org/docs/r/), which provides a dplyr backend allowing you to analyze larger-than-memory datasets using familiar dplyr syntax.
 As an additional benefit, arrow is extremely fast: you'll see some examples later in the chapter.
 Both arrow and dbplyr provide dplyr backends, so you might wonder when to use each.