# Logicals and numbers {#logicals}

```{r, results = "asis", echo = FALSE}
status("drafting")
```

## Introduction

In this chapter, you'll learn useful tools for working with logical vectors.
Logical vectors are the simplest type of vector because each element can only be one of three possible values: `TRUE`, `FALSE`, and `NA`.
Despite that simplicity, they're an extremely powerful tool.

### Prerequisites

```{r, message = FALSE}
library(tidyverse)
library(nycflights13)
```

## Comparisons

Sometimes you'll get data that already includes logical vectors, but in most cases you'll create them by using a comparison, like `<`, `<=`, `>`, `>=`, `!=`, and `==`.

### In `mutate()`

So far, you've mostly created logical vectors implicitly within `filter()`:

```{r}
flights |> 
  filter(dep_time > 600 & dep_time < 2000 & abs(arr_delay) < 20)
```

But it's useful to know that this is a shortcut, and you can explicitly perform these operations inside a `mutate()`:

```{r}
flights |> 
  mutate(
    daytime = dep_time > 600 & dep_time < 2000,
    approx_ontime = abs(arr_delay) < 20,
    .keep = "used"
  )
```

So the filter above could also be written as:

```{r}
flights |> 
  mutate(
    daytime = dep_time > 600 & dep_time < 2000,
    approx_ontime = abs(arr_delay) < 20,
  ) |> 
  filter(daytime & approx_ontime)
```

This is an important technique when you're doing complicated subsetting because it allows you to double-check the intermediate steps.

### Floating point comparison

Beware when using `==` with numbers as results might surprise you!
You might think that the following two computations yield 1 and 2:

```{r}
(1 / 49 * 49)
sqrt(2) ^ 2
```

But if you test them for equality, you'll discover that they're not what you expect!

```{r}
(1 / 49 * 49) == 1
(sqrt(2) ^ 2) == 2
```

That's because computers use finite precision arithmetic (they obviously can't store an infinite number of digits!) so in most cases, the number you see is actually an approximation.
R usually rounds these numbers to avoid displaying a bunch of unimportant digits.
You can use the `digits` argument to `format()` to force R to display more:

```{r}
format(1 / 49 * 49, digits = 20)
format(sqrt(2) ^ 2, digits = 20)
```

Instead of relying on `==`, you can use `dplyr::near()`, which does the comparison with a small amount of tolerance:

```{r}
near(sqrt(2) ^ 2,  2)
near(1 / 49 * 49, 1)
```
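
As a rough sketch of what "a small amount of tolerance" means, the comparison below checks whether the absolute difference is smaller than a small threshold. The names `x` and `tol` are illustrative, and the threshold (the square root of machine epsilon) is assumed here to mirror the default used by `near()`.

```{r}
x <- sqrt(2) ^ 2
tol <- .Machine$double.eps^0.5  # assumed tolerance, roughly 1.5e-8

x == 2            # exact comparison: FALSE
abs(x - 2) < tol  # comparison with tolerance: TRUE
```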

### `is.na()`

Another common way to create a logical vector is with `is.na()`.
This is particularly important in conjunction with `filter()` because `filter()` only selects rows where the value is `TRUE`; rows where the value is `FALSE` or `NA` are automatically dropped.

```{r}
flights |> filter(is.na(dep_delay) | is.na(arr_delay))
flights |> filter(is.na(dep_delay) != is.na(arr_delay))
```
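
To see why this matters, here is a minimal sketch on a tiny, made-up data frame (the name `toy` and its columns are hypothetical, not part of `flights`). A comparison against a missing value yields `NA`, so that row disappears unless you ask for it with `is.na()`.

```{r}
library(dplyr)

# A tiny, made-up data frame with one missing delay
toy <- data.frame(flight = c("A", "B", "C"), delay = c(5, NA, 30))

toy |> filter(delay > 10)                 # keeps only C; B's NA is dropped
toy |> filter(delay > 10 | is.na(delay))  # keeps B and C
```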

## Boolean algebra

Once you have multiple logical vectors, you can combine them together using Boolean algebra: `&` is "and", `|` is "or", and `!` is "not".
`xor()` provides one final useful operation: exclusive or.
Figure \@ref(fig:bool-ops) shows the complete set of Boolean operations and how they work.

```{r bool-ops}
#| echo: false
#| out.width: NULL
#| fig.cap: > 
#|    Complete set of Boolean operations. `x` is the left-hand
#|    circle, `y` is the right-hand circle, and the shaded regions show 
#|    which parts each operator selects.
#| fig.alt: >
#|    Six Venn diagrams, each explaining a given logical operator. The
#|    circles (sets) in each of the Venn diagrams represent x and y. y &
#|    !x is y but none of x, x & y is the intersection of x and y, x & !y is
#|    x but none of y, x is all of x none of y, xor(x, y) is everything
#|    except the intersection of x and y, y is all of y none of x, and 
#|    x | y is everything.
knitr::include_graphics("diagrams/transform-logical.png")
```

As well as `&` and `|`, R also has `&&` and `||`.
Don't use them in dplyr functions!
These are called short-circuiting operators and only ever return a single `TRUE` or `FALSE`.
They're important for programming so you'll learn more about them in Section \@ref(conditional-execution).
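
The difference can be sketched on two short, made-up vectors (`x` and `y` are illustrative names). The example only passes single values to `&&`, because, depending on your version of R, giving it longer vectors either signals an error or uses only the first element.

```{r}
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

x & y         # element-wise: TRUE FALSE FALSE
x[1] && y[1]  # a single value: TRUE

# Short-circuiting: the right-hand side is never evaluated here,
# so the stop() call does not run
FALSE && stop("never reached")  # FALSE
```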

The following code finds all flights that departed in November or December:

```{r, eval = FALSE}
flights |> 
   filter(month == 11 | month == 12)
```

Note that the order of operations doesn't work like English.
You can't think "find all flights that departed in November or December" and write `flights |> filter(month == 11 | 12)`.
This code will not error, but it will do something rather confusing.
Because `==` is evaluated before `|`, R first evaluates `month == 11`, which gives a logical vector.
Then it combines that vector with `12` using `|`; when you use a number with a logical operator, everything apart from 0 is converted to `TRUE`, so this is equivalent to `month == 11 | TRUE`.
That is always `TRUE`, so `flights |> filter(month == 11 | 12)` returns every row, not just the flights in November and December!
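
A small check on a made-up vector (not the flights data) shows what R actually computes for this expression. The name `month` here is just an illustrative stand-in for the column.

```{r}
month <- c(1, 11, 12)  # made-up values, not the flights data

month == 11 | 12           # TRUE TRUE TRUE
(month == 11) | 12         # same thing: == binds tighter than |
month == 11 | month == 12  # FALSE TRUE TRUE
```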

### `%in%`

An easy way to avoid this issue is to use `%in%`.
`x %in% y` returns a logical vector the same length as `x` that is `TRUE` whenever a value in `x` is anywhere in `y`.
So we could instead write:

```{r, eval = FALSE}
flights |> 
  filter(month %in% c(11, 12))
```
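
On a short, made-up vector (again using the illustrative name `month`), you can see that the result has one element per element of the left-hand side. Note that a missing value on the left simply gives `FALSE` here, since `NA` is not among the values being matched.

```{r}
month <- c(1, 5, 11, 12, NA)  # made-up values

month %in% c(11, 12)  # FALSE FALSE TRUE TRUE FALSE
```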

Sometimes you can simplify complicated subsetting by remembering De Morgan's law: `!(x & y)` is the same as `!x | !y`, and `!(x | y)` is the same as `!x & !y`.
For example, if you wanted to find flights that weren't delayed (on arrival or departure) by more than two hours, you could use either of the following two filters:

```{r, eval = FALSE}
flights |> 
  filter(!(arr_delay > 120 | dep_delay > 120))
flights |> 
  filter(arr_delay <= 120 & dep_delay <= 120)
```
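
If you want to convince yourself that the two forms agree, one option is to compare them on a few made-up delays. The vectors below are hypothetical and contain no missing values.

```{r}
arr_delay <- c(10, 130, 50, 200)
dep_delay <- c(5, 20, 150, 180)

a <- !(arr_delay > 120 | dep_delay > 120)
b <- arr_delay <= 120 & dep_delay <= 120

a                # TRUE FALSE FALSE FALSE
identical(a, b)  # TRUE
```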

### Missing values {#logical-missing}

The rules for missing values in Boolean algebra are a little tricky to explain because they seem inconsistent at first glance:

```{r}
NA & c(TRUE, FALSE, NA)
NA | c(TRUE, FALSE, NA)
```

<!-- Draw truth tables? -->

To understand what's going on you need to think about `x | TRUE`, because regardless of whether `x` is `TRUE` or `FALSE` the result is still `TRUE`.
That means even if you don't know what `x` is (i.e. it's missing), the result must still be `TRUE`.
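
The same reasoning applies to `x & FALSE`, which is `FALSE` whatever `x` is. One way to see all the cases at once is to build small truth tables with base R's `outer()`; this is only a sketch, and the name `vals` is illustrative.

```{r}
vals <- c(TRUE, FALSE, NA)
names(vals) <- c("TRUE", "FALSE", "NA")

outer(vals, vals, "&")  # rows are x, columns are y, cells are x & y
outer(vals, vals, "|")  # same layout for x | y
```

The result is `NA` only in the cells where the missing value could change the answer.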

## Summaries

There are four particularly useful summary functions for logical vectors: they all take a vector of logical values and return a single value, making them a good fit for use in `summarise()`.

The first two are `any()` and `all()`: `any()` will return `TRUE` if there's at least one `TRUE`, and `all()` will return `TRUE` if all values are `TRUE`.
Like other summary functions, they can return `NA` when missing values are present, and as usual you can make the missing values go away with `na.rm = TRUE`.
We could use this to see if there were any days where every flight was delayed:

```{r}
not_cancelled <- flights |> filter(!is.na(dep_delay), !is.na(arr_delay))

not_cancelled |> 
  group_by(year, month, day) |> 
  filter(all(arr_delay >= 0))
```
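
A quick sketch of how missing values interact with these two functions, using made-up vectors `x` and `y`:

```{r}
x <- c(TRUE, NA)

all(x)                # NA: the missing value might be FALSE
all(x, na.rm = TRUE)  # TRUE

y <- c(FALSE, NA)

any(y)                # NA: the missing value might be TRUE
any(y, na.rm = TRUE)  # FALSE
```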

`sum()` and `mean()` are particularly useful with logical vectors because when you use a logical vector in a numeric context, `TRUE` becomes 1 and `FALSE` becomes 0.
That means that `sum(x)` gives the number of `TRUE`s in `x` and `mean(x)` gives the proportion of `TRUE`s.
That lets us find the day with the highest proportion of delayed flights:

```{r}
not_cancelled |> 
  group_by(year, month, day) |> 
  summarise(prop_delayed = mean(arr_delay > 0)) |> 
  arrange(desc(prop_delayed))
```
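
The conversion itself is easy to check on a short, made-up logical vector (the name `delayed` is illustrative):

```{r}
delayed <- c(TRUE, FALSE, TRUE, TRUE)

as.numeric(delayed)  # 1 0 1 1
sum(delayed)         # 3 flights delayed
mean(delayed)        # 0.75, the proportion delayed
```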

Or we could ask how many flights left before 5am, which are usually flights that were delayed from the previous day:

```{r}
not_cancelled |> 
  group_by(year, month, day) |> 
  summarise(n_early = sum(dep_time < 500)) |> 
  arrange(desc(n_early))
```

### Exercises

1.  For each plane, count the number of flights before the first delay of greater than 1 hour.
2.  What does `prod()` return when applied to a logical vector? What logical summary function is it equivalent to? What does `min()` return applied to a logical vector? What logical summary function is it equivalent to?

## Transformations

### Cumulative functions

Another useful pair of functions are cumulative any, `cumany()`, and cumulative all, `cumall()`.
`cumany()` will be `TRUE` from the first `TRUE` onward, and `cumall()` will be `FALSE` from the first `FALSE` onward.

```{r}
cumany(c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE))
cumall(c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE))
```

These are particularly useful in conjunction with `filter()` because they allow you to select rows:

-   Before the first `FALSE` with `cumall(x)`.
-   Before the first `TRUE` with `cumall(!x)`.
-   After the first `TRUE` with `cumany(x)`.
-   After the first `FALSE` with `cumany(!x)`.
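
As a minimal sketch of two of these selections, the example below subsets a made-up numeric vector (`values` and `x` are illustrative names). Note that "after" here includes the position of the first `FALSE` itself.

```{r}
library(dplyr)

values <- c(5, 3, -1, 4, -2)  # made-up values
x <- values > 0               # TRUE TRUE FALSE TRUE FALSE

values[cumall(x)]   # before the first FALSE: 5 3
values[cumany(!x)]  # from the first FALSE onward: -1 4 -2
```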

If you imagine some data about a bank balance, then these functions allow you to pick out the rows before or after the first overdraft:

```{r}
df <- data.frame(
  date = as.Date("2020-01-01") + 0:6,
  balance = c(100, 50, 25, -25, -50, 30, 120)
)
# all rows from the first overdraft onward
df |> filter(cumany(balance < 0))
# all rows before the first overdraft
df |> filter(cumall(!(balance < 0)))
```

### Conditional outputs

If you want to use one value when a condition is `TRUE` and another value when it's `FALSE`, you can use `if_else()`[^logicals-1].

[^logicals-1]: This is equivalent to the base R function `ifelse()`.
    There are two main advantages of `if_else()` over `ifelse()`: you can choose what should happen to missing values, and `if_else()` is much more likely to give you a meaningful error message if you use the wrong type of variable.

```{r}
df <- data.frame(
  date = as.Date("2020-01-01") + 0:6,
  balance = c(100, 50, 25, -25, -50, 30, 120)
)
df |> mutate(status = if_else(balance < 0, "overdraft", "ok"))
```
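
When the condition is missing, the output is missing too, unless you say otherwise. The sketch below uses the `missing` argument of `if_else()` on a made-up vector (a standalone `balance`, separate from the `df` above) to supply a replacement value.

```{r}
library(dplyr)

balance <- c(100, -25, NA)  # made-up values, including one missing balance

if_else(balance < 0, "overdraft", "ok")
# "ok" "overdraft" NA
if_else(balance < 0, "overdraft", "ok", missing = "unknown")
# "ok" "overdraft" "unknown"
```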

If you start to nest multiple `if_else()` calls, I'd suggest switching to `case_when()` instead.
`case_when()` has a special syntax: it takes pairs that look like `condition ~ output`.
`condition` must evaluate to a logical vector; when it's `TRUE`, `output` will be used.

```{r}
df |> 
  mutate(
    status = case_when(
      balance == 0 ~ "no money", 
      balance  < 0 ~ "overdraft",
      balance  > 0 ~ "ok"
    )
  )
```

(Note that I usually add spaces to make the outputs line up so it's easier to scan.)

If none of the cases match, the output will be missing:

```{r}
x <- 1:10
case_when(
  x %% 2 == 0 ~ "even",
)
```

You can create a catch-all value by using `TRUE` as the condition:

```{r}
case_when(
  x %% 2 == 0 ~ "even",
  TRUE        ~ "odd"
)
```

If multiple conditions are `TRUE`, the first is used:

```{r}
case_when(
  x < 5 ~ "< 5",
  x < 3 ~ "< 3",
)
```
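
Because the first matching condition wins, the order of the cases matters. As a sketch, putting the narrower condition first and adding a catch-all gives every element a label (the name `size` is illustrative):

```{r}
library(dplyr)

x <- 1:10
size <- case_when(
  x < 3 ~ "< 3",   # checked first, so 1 and 2 land here
  x < 5 ~ "< 5",   # only reached by 3 and 4
  TRUE  ~ ">= 5"   # catch-all for everything else
)
size
```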
