common is now words

hadley 2016-07-20 11:51:53 -05:00
parent fa8940ecab
commit e56032efe3
1 changed file with 7 additions and 7 deletions

@@ -418,18 +418,18 @@ Remember that when you use a logical vector in a numeric context, `FALSE` become
 ```{r}
 # How many common words start with t?
-sum(str_detect(common, "^t"))
+sum(str_detect(words, "^t"))
 # What proportion of common words end with a vowel?
-mean(str_detect(common, "[aeiou]$"))
+mean(str_detect(words, "[aeiou]$"))
 ```
 When you have complex logical conditions (e.g. match a or b but not c unless d) it's often easier to combine multiple `str_detect()` calls with logical operators, rather than trying to create a single regular expression. For example, here are two ways to find all words that don't contain any vowels:
 ```{r}
 # Find all words containing at least one vowel, and negate
-no_vowels_1 <- !str_detect(common, "[aeiou]")
+no_vowels_1 <- !str_detect(words, "[aeiou]")
 # Find all words consisting only of consonants (non-vowels)
-no_vowels_2 <- str_detect(common, "^[^aeiou]+$")
+no_vowels_2 <- str_detect(words, "^[^aeiou]+$")
 all.equal(no_vowels_1, no_vowels_2)
 ```
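As a reviewer's aside on the hunk above, the advice about combining several `str_detect()` calls with logical operators can be sketched on a tiny input. This is only an illustration: `fruit_sample`, `has_a`, `ends_e`, and `selected` are hypothetical names that do not appear in the commit, and the condition ("contains an a, but does not end in e") is made up for the example.

```r
library(stringr)

# Hypothetical sample vector, chosen only for illustration
fruit_sample <- c("apple", "banana", "cherry", "date")

# Each simple condition gets its own str_detect() call
has_a  <- str_detect(fruit_sample, "a")   # contains an "a"
ends_e <- str_detect(fruit_sample, "e$")  # ends with an "e"

# Combine the logical vectors instead of writing one complex regex
selected <- fruit_sample[has_a & !ends_e]
selected
```

Each intermediate logical vector can be inspected on its own, which is usually easier to debug than a single regular expression encoding the whole condition.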
@@ -438,8 +438,8 @@ The results are identical, but I think the first approach is significantly easie
 A common use of `str_detect()` is to select the elements that match a pattern. You can do this with logical subsetting, or the convenient `str_subset()` wrapper:
 ```{r}
-common[str_detect(common, "x$")]
-str_subset(common, "x$")
+words[str_detect(words, "x$")]
+str_subset(words, "x$")
 ```
 A variation on `str_detect()` is `str_count()`: rather than a simple yes or no, it tells you how many matches there are in a string:
@@ -449,7 +449,7 @@ x <- c("apple", "banana", "pear")
 str_count(x, "a")
 # On average, how many vowels per word?
-mean(str_count(common, "[aeiou]"))
+mean(str_count(words, "[aeiou]"))
 ```
 Note that matches never overlap. For example, in `"abababa"`, how many times will the pattern `"aba"` match? Regular expressions say two, not three:
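The non-overlapping behaviour described in the last context line can be checked directly. The sketch below is not part of the commit; `n_nonoverlap` and `n_overlap` are hypothetical names, and the lookahead variant is an assumption-based workaround (it relies on the regex engine counting zero-width matches at each position), shown only for contrast.

```r
library(stringr)

# Matches are consumed left to right, so the second "aba" starts
# only after the first one ends: positions 1-3 and 5-7
n_nonoverlap <- str_count("abababa", "aba")
n_nonoverlap  # 2

# A lookahead matches a zero-width position without consuming text,
# so every start position of "aba" is counted: positions 1, 3, and 5
n_overlap <- str_count("abababa", "(?=aba)")
n_overlap  # 3
```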