Clarify why histogram starts below 0, closes #724

Mine Çetinkaya-Rundel 2022-05-09 12:31:46 -04:00
parent b6809a8a9c
commit fa8218035f
1 changed file with 3 additions and 1 deletion

@@ -141,7 +141,9 @@ diamonds |>
 ```
 A histogram divides the x-axis into equally spaced bins and then uses the height of a bar to display the number of observations that fall in each bin.
-In the graph above, the tallest bar shows that almost 30,000 observations have a `carat` value between 0.25 and 0.75, which are the left and right edges of the bar.
+Note that even though it's not possible to have a `carat` value that is smaller than 0 (since weights of diamonds, by definition, are positive values), the bins start at a negative value (-0.25) in order to create bins of equal width across the range of the data with the center of the first bin at 0.
+This behavior is also apparent in the histogram above, where the first bar ranges from -0.25 to 0.25.
+The tallest bar shows that almost 30,000 observations have a `carat` value between 0.25 and 0.75, which are the left and right edges of the bar centered at 0.5.
 You can set the width of the intervals in a histogram with the `binwidth` argument, which is measured in the units of the `x` variable.
 You should always explore a variety of binwidths when working with histograms, as different binwidths can reveal different patterns.
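
The bin arithmetic described in the added text can be sketched in base R. This is a simplified illustration, not ggplot2's actual implementation: it assumes bins are centered on multiples of the binwidth, and the `carat` values below are hypothetical numbers chosen for the example rather than values from `diamonds`.

```r
# Width of each bin, in the units of the x variable.
binwidth <- 0.5

# Hypothetical carat values, for illustration only (not from `diamonds`).
carat <- c(0.2, 0.23, 0.3, 0.7, 0.74, 1.01, 1.5, 2.2)

# Assumption: bin centers sit on multiples of binwidth, so the first
# center is the multiple closest to the smallest value (here, 0).
first_center <- round(min(carat) / binwidth) * binwidth
last_center <- round(max(carat) / binwidth) * binwidth

# Each bin extends half a binwidth on either side of its center, so the
# first edge lands at -0.25 even though every value is positive.
breaks <- seq(
  first_center - binwidth / 2,
  last_center + binwidth / 2,
  by = binwidth
)

# Count the observations in each bin; intervals are closed on the right.
counts <- table(cut(carat, breaks = breaks))

print(breaks)
print(counts)
```

Changing `binwidth` in this sketch moves every edge, which is one reason different binwidths can reveal different patterns. For a real plot, ggplot2's `layer_data()` should return the computed bin edges and counts for comparison.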