Merge branch 'master' of github.com:hadley/r4ds

hadley 2016-07-20 16:47:58 -05:00
commit 7a410555e8
1 changed file with 7 additions and 7 deletions


@@ -108,7 +108,7 @@ measure_distance <- function(mod, data) {
measure_distance(c(2, 5), sim1)
```
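The body of `measure_distance()` lies outside this hunk. As a rough sketch of the idea (not necessarily the exact code in the chapter), a root-mean-squared distance for a model supplied as `c(a, b)`, with `a` the slope and `b` the intercept, might look like:

```{r}
# Sketch only: predictions are a * x + b, matching the parameterisation used
# later in the text; the distance is the root-mean-squared deviation.
measure_distance <- function(mod, data) {
  diff <- data$y - (mod[1] * data$x + mod[2])
  sqrt(mean(diff ^ 2))
}
```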
-Now we can use purrrr to compure the distance for all the models defined above. We need a helper function because our distance function expects the model as a numeric vector of length 2.
+Now we can use purrr to compute the distance for all the models defined above. We need a helper function because our distance function expects the model as a numeric vector of length 2.
```{r}
sim1_dist <- function(a, b) {
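  # (The rest of this chunk falls outside the hunk. A sketch of how it might
  # continue: reuse measure_distance(), then map over the candidate models --
  # assuming `models` has one row per model with columns `a` and `b`, and that
  # dplyr and purrr are loaded as elsewhere in the chapter.)
  measure_distance(c(a, b), sim1)
}

models <- models %>%
  mutate(dist = purrr::map2_dbl(a, b, sim1_dist))
models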
@@ -130,7 +130,7 @@ ggplot(sim1, aes(x, y)) +
)
```
-Another way of looking to think about these models is to draw a scatterplot of `a` a `b`, again coloured colour by `-dist`. We can no longer see how the model compares to the data, but we can see many models at once. Again, I've highlight the 10 best models, this time by drawing red circles underneath them.
+Another way of thinking about these models is to draw a scatterplot of `a` and `b`, again coloured by `-dist`. We can no longer see how the model compares to the data, but we can see many models at once. Again, I've highlighted the 10 best models, this time by drawing red circles underneath them.
```{r}
ggplot(models, aes(a, b)) +
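  # (Sketch: the next layer falls outside the hunk. One way to draw the red
  # circles underneath the 10 best models, assuming dplyr's filter() and a
  # `dist` column in `models`.)
  geom_point(data = filter(models, rank(dist) <= 10), size = 4, colour = "red") +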
@@ -138,7 +138,7 @@ ggplot(models, aes(a, b)) +
geom_point(aes(colour = -dist))
```
-Instead of trying lots of random models, we could be more systematic and generate an evenly spaced grid of points, a so called grid search. I picked the parameters of the grid rougly by looking at where the best models where in the plot above.
+Instead of trying lots of random models, we could be more systematic and generate an evenly spaced grid of points, a so-called grid search. I picked the parameters of the grid roughly by looking at where the best models were in the plot above.
```{r}
grid <- expand.grid(
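  # (Sketch: the grid bounds fall outside the hunk. Illustrative ranges only,
  # chosen by eye from the plot above -- `a` is the slope, `b` the intercept;
  # assumes dplyr and purrr are loaded.)
  a = seq(1, 3, length = 25),
  b = seq(-5, 20, length = 25)
) %>%
  mutate(dist = purrr::map2_dbl(a, b, sim1_dist))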
@@ -164,7 +164,7 @@ ggplot(sim1, aes(x, y)) +
)
```
-Now you could imagine iteratively making the grid finer and finer until you narrowed in on the best model. But there's a better way to tackle that problem: a numerical minimisation tool called Newton-Rhapson search. The intuition of Newton-Rhapson is pretty simple: you pick a starting point and look around for the steepest slope. You then ski down that slope a little way, and then repeat again and again, until you can't go any lower. In R, we can do that with `optim()`:
+Now you could imagine iteratively making the grid finer and finer until you narrowed in on the best model. But there's a better way to tackle that problem: a numerical minimisation tool called Newton-Raphson search. The intuition of Newton-Raphson is pretty simple: you pick a starting point and look around for the steepest slope. You then ski down that slope a little way, and then repeat again and again, until you can't go any lower. In R, we can do that with `optim()`:
```{r}
best <- optim(c(0, 0), measure_distance, data = sim1)
@@ -175,9 +175,9 @@ ggplot(sim1, aes(x, y)) +
geom_abline(slope = best$par[1], intercept = best$par[2])
```
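`optim()` returns a list, and the extra `data = sim1` argument is passed through to `measure_distance()` via `...`. As a quick check you can inspect two of its standard components:

```{r}
# best$par is the parameter vector optim() settled on (here c(a, b));
# best$value is the distance that vector achieves.
best$par
best$value
```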
-Don't worry too much about the details - it's the intutition that's important here. If you have a function that defines the distance between a model and a dataset you can use existing mathematical tools to find the best model. The neat thing about this approach is that it will work for family of models that you can write an equation for.
+Don't worry too much about the details - it's the intuition that's important here. If you have a function that defines the distance between a model and a dataset you can use existing mathematical tools to find the best model. The neat thing about this approach is that it will work for any family of models that you can write an equation for.
-However, this particular model is a special case of a broader family: linear models. A linear model has the general form `y = a_1 * x_1 + a_2 * x_2 + ... + a_n * x_n`. So this simple model is equivalent to a general linear model where n is n, `a_1` is `a`, `x_1` is `x`, `a_2` is `b` and `x_2` is a constant, 1. R has a tool specifically designed for linear models called `lm()`. `lm()` has a special way to specify the model family: a formula like `y ~ x` which `lm()` translates to `y = a * x + b`. We can fit the model and look at the output:
+However, this particular model is a special case of a broader family: linear models. A linear model has the general form `y = a_1 * x_1 + a_2 * x_2 + ... + a_n * x_n`. So this simple model is equivalent to a general linear model where n is 2, `a_1` is `a`, `x_1` is `x`, `a_2` is `b` and `x_2` is a constant, 1. R has a tool specifically designed for linear models called `lm()`. `lm()` has a special way to specify the model family: a formula like `y ~ x` which `lm()` translates to `y = a * x + b`. We can fit the model and look at the output:
```{r}
sim1_mod <- lm(y ~ x, data = sim1)
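# (Sketch: the rest of this chunk falls outside the hunk. coef() extracts the
# fitted intercept and slope, which can then be compared with best$par.)
coef(sim1_mod)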
@@ -218,7 +218,7 @@ These are exactly the same values we got with `optim()`! However, behind the sce
For simple models, like the one above, you can figure out what the model says about the data by carefully studying the coefficients. And if you ever take a statistics course on modelling, you're likely to spend a lot of time doing just that. Here, however, we're going to take a different tack. In this book, we're going to focus on understanding a model by looking at its predictions. This has a big advantage: every type of model makes predictions (otherwise what use would it be?) so we can use the same set of techniques to understand simple linear models or complex random forests. We'll see that advantage later on when we explore some other families of models.
-We are also going to take advantage of a powerful feature of linear models: they are additive. That means you can partition the data into pattern and residuals. This allows us to see what subtler patterns remain after we have removed the biggest trned.
+We are also going to take advantage of a powerful feature of linear models: they are additive. That means you can partition the data into patterns and residuals. This allows us to see what subtler patterns remain after we have removed the biggest trend.
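As a taste of what that partition looks like in code (a minimal sketch, assuming the modelr package is available; the details come in the sections that follow):

```{r}
# add_residuals() attaches a `resid` column: what's left of y once the linear
# trend captured by sim1_mod has been removed.
library(modelr)
add_residuals(sim1, sim1_mod)
```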
### Predictions