From e119132cb478a104163a6c0de59327e849fec9b7 Mon Sep 17 00:00:00 2001
From: alberto-agudo <91462184+alberto-agudo@users.noreply.github.com>
Date: Mon, 20 Mar 2023 22:11:16 +0100
Subject: [PATCH] Eliminating repeated word (#1380)

---
 arrow.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arrow.qmd b/arrow.qmd
index 0226ca9..1cff1c1 100644
--- a/arrow.qmd
+++ b/arrow.qmd
@@ -19,7 +19,7 @@ But CSV files aren't very efficient: you have to do quite a lot of work to read
 
 In this chapter, you'll learn about a powerful alternative: the [parquet format](https://parquet.apache.org/), an open standards-based format widely used by big data systems.
 We'll pair parquet files with [Apache Arrow](https://arrow.apache.org), a multi-language toolbox designed for efficient analysis and transport of large datasets.
-We'll use Apache Arrow via the the [arrow package](https://arrow.apache.org/docs/r/), which provides a dplyr backend allowing you to analyze larger-than-memory datasets using familiar dplyr syntax.
+We'll use Apache Arrow via the [arrow package](https://arrow.apache.org/docs/r/), which provides a dplyr backend allowing you to analyze larger-than-memory datasets using familiar dplyr syntax.
 As an additional benefit, arrow is extremely fast: you'll see some examples later in the chapter.
 
 Both arrow and dbplyr provide dplyr backends, so you might wonder when to use each.
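
As a point of reference for the sentence being corrected: a minimal sketch of the arrow + dplyr workflow it describes might look like the following, assuming a hypothetical directory of parquet files at `path/to/parquet/` containing a `year` column (both invented for illustration, not taken from the chapter).

```r
# Illustrative sketch of arrow's dplyr backend on an on-disk dataset.
# "path/to/parquet/" and the `year` column are hypothetical placeholders.
library(arrow)
library(dplyr)

# Register the parquet files without reading them into memory
ds <- open_dataset("path/to/parquet/")

# dplyr verbs build a lazy query that arrow executes out of memory;
# collect() materializes only the (small) result as a tibble
ds |>
  filter(year >= 2020) |>
  group_by(year) |>
  summarise(n = n()) |>
  collect()
```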