common is now words

2016-07-20 11:51:53 -05:00 · e56032efe3
parent fa8940ecab
commit e56032efe3
1 changed file with 7 additions and 7 deletions
--- a/strings.Rmd
+++ b/strings.Rmd
@@ -418,18 +418,18 @@ Remember that when you use a logical vector in a numeric context, `FALSE` become

 ```{r}
 # How many common words start with t?
-sum(str_detect(common, "^t"))
+sum(str_detect(words, "^t"))
 # What proportion of common words end with a vowel?
-mean(str_detect(common, "[aeiou]$"))
+mean(str_detect(words, "[aeiou]$"))
 ```

 When you have complex logical conditions (e.g. match a or b but not c unless d) it's often easier to combine multiple `str_detect()` calls with logical operators, rather than trying to create a single regular expression. For example, here are two ways to find all words that don't contain any vowels:

 ```{r}
 # Find all words containing at least one vowel, and negate
-no_vowels_1 <- !str_detect(common, "[aeiou]")
+no_vowels_1 <- !str_detect(words, "[aeiou]")
 # Find all words consisting only of consonants (non-vowels)
-no_vowels_2 <- str_detect(common, "^[^aeiou]+$")
+no_vowels_2 <- str_detect(words, "^[^aeiou]+$")
 all.equal(no_vowels_1, no_vowels_2)
 ```

@@ -438,8 +438,8 @@ The results are identical, but I think the first approach is significantly easie
 A common use of `str_detect()` is to select the elements that match a pattern. You can do this with logical subsetting, or the convenient `str_subset()` wrapper:

 ```{r}
-common[str_detect(common, "x$")]
-str_subset(common, "x$")
+words[str_detect(words, "x$")]
+str_subset(words, "x$")
 ```

 A variation on `str_detect()` is `str_count()`: rather than a simple yes or no, it tells you how many matches there are in a string:
@@ -449,7 +449,7 @@ x <- c("apple", "banana", "pear")
 str_count(x, "a")

 # On average, how many vowels per word?
-mean(str_count(common, "[aeiou]"))
+mean(str_count(words, "[aeiou]"))
 ```

 Note that matches never overlap. For example, in `"abababa"`, how many times will the pattern `"aba"` match? Regular expressions say two, not three:
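
The hunk stops before the chunk that demonstrates this claim. The following is a minimal sketch of the behavior, assuming stringr is installed. It is not taken from the commit, and the names `n_matches` and `n_overlapping` as well as the lookahead variant are additions for illustration only:

```r
library(stringr)

x <- "abababa"

# Matches are found left to right and never overlap: once "aba" matches at
# positions 1-3, the search resumes at position 4, so the candidate that
# starts at position 3 is skipped.
n_matches <- str_count(x, "aba")
n_matches
#> [1] 2

# Illustrative addition (not in the source): a zero-width lookahead consumes
# no characters, so it counts every position at which "aba" begins,
# including the overlapping ones.
n_overlapping <- str_count(x, "(?=aba)")
n_overlapping
```

If overlapping occurrences are what you want to count, the lookahead form is one way to get them; the plain pattern is the right choice when each character should belong to at most one match.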