Use modelr model quality functions

hadley 2016-06-15 15:12:31 -05:00
parent 54e4a916ea
commit 963ed9b915
1 changed file with 3 additions and 3 deletions

@@ -241,17 +241,17 @@ When you start dealing with many models, it's helpful to have some rough way of
One way to capture the quality of the model is to summarise the distribution of the residuals. For example, you could look at the quantiles of the absolute residuals. For this dataset, 25% of predictions are less than \$7,400 away, and 75% are less than \$25,800 away. That seems like quite a bit of error when predicting someone's income!
```{r}
-quantile(abs(heights$resid), c(0.25, 0.75))
+qae(h, heights)
range(heights$income)
```
You might be familiar with the $R^2$. That's a single number summary that rescales the variance of the residuals to between 0 (very bad) and 1 (very good):
```{r}
-(var(heights$income) - var(heights$resid)) / var(heights$income)
+rsquare(h, heights)
```
-This is why the $R^2$ is sometimes interpreted as the amount of variation in the data explained by the model. Here we're explaining 3% of the total variation - not a lot! But I don't think worrying about the relative amount of variation explained is that useful; instead I think you need to consider whether the absolute amount of variation explained is useful for your project.
+$R^2$ can be interpreted as the amount of variation in the data explained by the model. Here we're explaining 3% of the total variation - not a lot! But I don't think worrying about the relative amount of variation explained is that useful; instead I think you need to consider whether the absolute amount of variation explained is useful for your project.
It's called the $R^2$ because for simple models like this, it's just the square of the correlation between the variables:
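As a rough check of that identity, here is a minimal sketch in base R. It assumes the built-in `mtcars` data and a hypothetical model `mod`, since the chapter's `heights` data and model `h` are not reproduced here. It computes $R^2$ from the variance of the residuals and compares it with the squared correlation between the response and the single predictor.

```r
# Illustration only: mtcars and `mod` stand in for the chapter's `heights` and `h`.
mod <- lm(mpg ~ wt, data = mtcars)

# R^2 computed from the variance of the residuals
r2_var <- (var(mtcars$mpg) - var(resid(mod))) / var(mtcars$mpg)

# R^2 computed as the squared correlation between response and predictor
r2_cor <- cor(mtcars$mpg, mtcars$wt)^2

# The two agree (up to floating point), and both match lm's own summary
all.equal(r2_var, r2_cor)
all.equal(r2_var, summary(mod)$r.squared)
```

The equality with the squared correlation relies on the model having a single predictor and an intercept. With several predictors, the variance-based formula still applies but the pairwise correlation no longer does.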