Variance and the Standard Deviation
Variance and standard deviation are among the most widely used measures of data variability.
The variance is calculated differently from the average deviation in two ways. Instead of taking absolute values to eliminate the negative signs, the deviations are squared. Then, instead of being divided by the number of data points, the sum of the squared deviations is divided by the number of data points minus one. For Sample 1 (1, 2, 3, 4, 5), whose average is 3, the deviations are -2, -1, 0, 1, and 2, so the variance is as follows:
$$\text{Variance} = \frac{(-2)^2 + (-1)^2 + (0)^2 + (1)^2 + (2)^2}{5 - 1} = \frac{4 + 1 + 0 + 1 + 4}{4} = 2.5$$
The variance for Sample 2 (2, 3, 3, 3, 4):
$$\text{Variance} = \frac{(-1)^2 + (0)^2 + (0)^2 + (0)^2 + (1)^2}{5 - 1} = \frac{1 + 0 + 0 + 0 + 1}{4} = 0.5$$
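The two variance calculations above can be checked with a short Python sketch. The function name `sample_variance` and the variable names are illustrative choices, not part of the original text; `statistics.variance` is the standard-library function that also divides by the number of data points minus one.

```python
import statistics

def sample_variance(data):
    """Sum of squared deviations from the average, divided by n - 1."""
    mean = sum(data) / len(data)
    squared_deviations = [(x - mean) ** 2 for x in data]
    return sum(squared_deviations) / (len(data) - 1)

sample_1 = [1, 2, 3, 4, 5]
sample_2 = [2, 3, 3, 3, 4]

print(sample_variance(sample_1))  # 2.5
print(sample_variance(sample_2))  # 0.5

# Cross-check against the standard library's sample variance.
print(statistics.variance(sample_1), statistics.variance(sample_2))
```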
The standard deviation is the positive square root of the variance. Therefore, for Sample 1 the standard deviation (SD) is:
$$\text{SD} = \sqrt{2.5} \approx 1.6$$
For Sample 2:
$$\text{SD} = \sqrt{0.5} \approx 0.71$$
Comparison:
Sample 1 - average deviation = 1.2, SD = 1.6
Sample 2 - average deviation = 0.4, SD = 0.71
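The comparison can be reproduced with a short Python sketch. It assumes the average deviation is the sum of the absolute deviations divided by the number of data points, as described earlier; the helper name `average_deviation` is hypothetical, and the standard deviation comes from the standard library's `statistics.stdev`.

```python
import statistics

def average_deviation(data):
    """Average of the absolute deviations from the average (divides by n)."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)

samples = {"Sample 1": [1, 2, 3, 4, 5], "Sample 2": [2, 3, 3, 3, 4]}

for name, data in samples.items():
    avg_dev = average_deviation(data)
    sd = statistics.stdev(data)  # sample SD, divides by n - 1
    print(f"{name}: average deviation = {avg_dev:.2f}, SD = {sd:.2f}")
```

The output is printed to two decimal places, so values may differ slightly from figures rounded to one decimal place.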
In both samples, the standard deviation is larger than the average deviation. While this may seem a strange and confusing way of measuring variability, it can be understood in the following way. Ignore the square root for a moment and consider just the variance: it is essentially the average of the squared differences between the values and the average (divided by the number of data points minus one), rather than the average of the absolute differences.
Squaring the differences causes the expression to be dominated by the largest differences, while the comparatively small ones become insignificant. This has the net effect of emphasizing large deviations from the average, while de-emphasizing small ones. For example, 2 squared is 4, while 5 squared is 25: the larger deviation is 2.5 times the smaller one, but its square is more than 6 times larger. As a result, the expression inside the square root is considerably more sensitive to large deviations than the average deviation is.
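To see this weighting effect numerically, the sketch below takes the two deviations from the example, 2 and 5, and compares how much of the total each one contributes before and after squaring. The helper name `shares` is made up for illustration.

```python
def shares(values):
    """Return each value's fraction of the total."""
    total = sum(values)
    return [v / total for v in values]

deviations = [2, 5]
absolute_shares = shares([abs(d) for d in deviations])
squared_shares = shares([d ** 2 for d in deviations])

for d, a, s in zip(deviations, absolute_shares, squared_shares):
    print(f"deviation {d}: {a:.0%} of the absolute total, {s:.0%} of the squared total")
```

The larger deviation accounts for about 71% of the absolute total but about 86% of the squared total.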
In summary, to calculate the standard deviation:
- Calculate the average of the data set.
- Subtract the average from each data point to find the difference.
- Square the differences, which also eliminates the negative signs.
- Add the squared differences together.
- Divide the sum by the number of data points minus one.
- Take the square root of the result.
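The steps above can be sketched in Python, one line per step. The function name `standard_deviation_steps` is an illustrative choice; the final line cross-checks the result against the standard library's `statistics.stdev`, which also divides by the number of data points minus one.

```python
import math
import statistics

def standard_deviation_steps(data):
    # Step 1: calculate the average of the data set.
    average = sum(data) / len(data)
    # Step 2: subtract the average from each data point.
    differences = [x - average for x in data]
    # Step 3: square the differences (this also removes negative signs).
    squared = [d ** 2 for d in differences]
    # Step 4: add the squared differences together.
    total = sum(squared)
    # Step 5: divide the sum by the number of data points minus one.
    variance = total / (len(data) - 1)
    # Step 6: take the square root of the result.
    return math.sqrt(variance)

data = [1, 2, 3, 4, 5]
print(round(standard_deviation_steps(data), 2))  # 1.58
print(math.isclose(standard_deviation_steps(data), statistics.stdev(data)))  # True
```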
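The numerical expression for this method is as follows:

$$\text{Variance} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

where $x_i$ are the data points, $\bar{x}$ is their average, and $n$ is the number of data points,

and standard deviation (SD) = the square root of the variance, therefore:

$$\text{SD} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$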