The standard deviation is a statistic that measures dispersion, or variability: how far the values in a set of sampled data lie from the mean of those same data. It is widely used in statistics to estimate the probability that a process or event will happen.
The larger this deviation, the greater the spread in the data. The measure is very sensitive to extreme values, that is, values very different from the mean. A standard deviation of zero indicates that there is no variation in the data.
What is Standard Deviation and what is it for?
This statistical tool has many applications in areas ranging from economics to biology and medicine. Some of them are listed below:
- It allows data from two different populations to be compared relative to the mean of each. This is done through a method known as standardization, which expresses how many standard deviations a value lies from its mean.
- It helps to assess risk in decision making. In investment, for example, a larger standard deviation of returns indicates greater uncertainty and therefore a greater risk of loss.
- It allows you to check whether a data set fits a certain theory. If the data deviate from the values the theory predicts by much more than their expected spread, they do not support the assumption.
- It helps to minimize errors when pursuing the objectives of an investigation, that is, to identify the conditions under which the objectives can be achieved with the least investment of time, money or other resources.
- It allows the trend of a data set or variable to be estimated, that is, whether the data tend over time toward greater dispersion or toward convergence around the mean.
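The standardization mentioned in the first point can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `z_scores` and the exam scores are invented for the example, and the sample standard deviation from Python's standard `statistics` module is assumed.

```python
from statistics import mean, stdev

def z_scores(data):
    """Return how many sample standard deviations each value lies from the mean."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    return [(x - m) / s for x in data]

# Hypothetical exam scores, invented for illustration only.
scores = [10, 12, 14, 16, 18]
print([round(z, 2) for z in z_scores(scores)])
# [-1.26, -0.63, 0.0, 0.63, 1.26]
```

Because each value is expressed in units of its own standard deviation, values from data sets with different scales can be placed side by side.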
Many statistical methods assume that the data follow a normal, or at least symmetric, distribution.
The standard deviation is most informative when the distribution is symmetric, that is, when the values are spread evenly on both sides of the center. In a normal distribution most values are also concentrated near the center rather than at the extremes.
What does the standard deviation measure?
The standard deviation measures dispersion: it indicates how spread out the data are with respect to the mean. The larger the standard deviation, the greater the spread of the data.
How is the Standard Deviation calculated?
The standard deviation is calculated using a statistical formula, applied to a sample.
A sample is a set of data or observations taken from a population or from a much larger data set. Sampling gives data that are easier to handle and whose behavior can be extrapolated to the entire population.
The formula is presented below, after a note on notation.
What is the symbol for the standard deviation?
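The symbol used to designate the standard deviation in statistics is the Greek letter sigma (σ). Strictly speaking, σ denotes the standard deviation of a population and the standard deviation of a sample is usually written s; this article uses σ for both.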
Standard Deviation Formula
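The sample standard deviation has the following formula:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
It is read as the square root of the sum of the squared differences between each observation ($X_i$) and the average ($\bar{X}$, written simply X elsewhere in this article), divided by the total number of observations in the sample ($n$) minus 1.
Where:
$$\sum_{i=1}^{n} (X_i - \bar{X})^2$$
represents the sum, over all observations, of the squared differences from the average.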
The way this formula is written is important. If an observation is less than the average, its difference from the average is negative.
If an observation is greater than the average, the difference is positive. Added together, these positive and negative differences always cancel out to exactly zero, so their plain sum says nothing about the spread.
Each difference is therefore squared, so that every term is non-negative and the total sum reflects the dispersion.
When the data are a sample rather than the whole population, the sum is divided by n - 1 instead of n. The mean is estimated from the same data, which imposes one restriction on the differences (they must add up to zero) and leaves n - 1 degrees of freedom. Dividing by n - 1 also reduces the error made when estimating the variability of the entire population.
Secondly:
$$\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
represents the variance of the data set, so the standard deviation can also be described as the square root of the variance.
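The formula can be written out step by step in Python. The following is a minimal sketch, assuming the sample version with n - 1 in the denominator; the function names `sample_variance` and `sample_std` and the small data set are hypothetical and chosen only for illustration.

```python
import math

def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    squared_diffs = [(x - mean) ** 2 for x in data]
    return sum(squared_diffs) / (n - 1)

def sample_std(data):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(data))

# Hypothetical data set, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_variance(values), 3))  # 4.571
print(round(sample_std(values), 3))       # 2.138
```

Python's standard `statistics.variance` and `statistics.stdev` functions compute the same sample quantities and are usually preferable in practice.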
Standard deviation and variance
Variance is a statistical index that measures the variability of a sampled data set. As noted above, it is closely related to the standard deviation and has a similar interpretation, but the two are different quantities.
The variance is designated by the Greek letter sigma squared (σ²), since it is the standard deviation squared.
They differ in their units: the standard deviation is expressed in the same units as the data, while the variance is expressed in those units squared, which makes the standard deviation easier to interpret directly.
The variance can never be negative, and its minimum possible value is zero.
How is the Standard Deviation interpreted?
The standard deviation of a set of observations tells us how closely they cluster around the mean, and therefore how well the mean represents them. This is why the measure can be read as an index of uncertainty.
The larger the deviation, the greater the dispersion in the data and the further the observations tend to lie from the mean.
The interpretation is easiest to see on the graph of the well-known Gaussian bell curve, which shows a symmetric distribution of the data.
In this distribution, about 68% of the observations lie within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.
With a larger standard deviation the curve appears flatter and wider, while with a smaller standard deviation it appears narrower and taller.
The standard deviation is always positive or, when all the sampled data are equal, zero.
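The 68%, 95% and 99.7% figures can be checked numerically. The sketch below assumes an exactly normal distribution and uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

These percentages hold only for normally distributed data; for skewed or discrete data the shares can be noticeably different.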
Examples of Standard Deviation
The standard deviation is usually calculated in three steps.
First, we calculate the mean or average of the data set.
Second, we calculate the variance.
Third, we take the square root of the variance to obtain the standard deviation, which has the same units as the sampled data.
Here are some examples:
Suppose we asked the ages of 14 students in a particular classroom. These students presented the following ages: 12, 13, 12, 12, 11, 13, 13, 13, 12, 12, 11, 11, 12, 13.
We first calculate the average (X).
To do so, we add up all the values, which gives 170.
This sum is divided by the total number of data collected (14), which gives X = 170 / 14 = 12.14.
Then we calculate the variance according to the formula, where $\bar{X}$ is the average obtained above and n = 14:
$$\sigma^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
This gives the following result:
σ² = 0.59.
Finally, we calculate the standard deviation by taking the square root of the variance:
σ = 0.77.
The way to express the results would be as follows:
12.14 ± 0.77.
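This gives an interval from 12.14 - 0.77 = 11.37 to 12.14 + 0.77 = 12.91, which describes the typical spread of ages around the mean. Because ages are recorded as whole years, only the six 12-year-olds fall inside it; the 11- and 13-year-olds lie just outside.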
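The classroom example can be reproduced with Python's standard `statistics` module. This is a minimal sketch that assumes the sample formulas (n - 1); the variable names are chosen only for illustration.

```python
from statistics import mean, variance, stdev

ages = [12, 13, 12, 12, 11, 13, 13, 13, 12, 12, 11, 11, 12, 13]

avg = mean(ages)      # step 1: the mean
var = variance(ages)  # step 2: the sample variance (n - 1)
sd = stdev(ages)      # step 3: the square root of the variance

# Interval of one standard deviation on either side of the mean.
low, high = avg - sd, avg + sd
inside = [a for a in ages if low <= a <= high]

print(round(avg, 2), round(var, 2), round(sd, 2))  # 12.14 0.59 0.77
print(round(low, 2), round(high, 2))               # 11.37 12.91
print(len(inside), "of", len(ages))                # 6 of 14
```

The last line counts how many of the recorded ages actually fall inside the interval, which is a quick way to see how well the interval summarizes the data.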
Now suppose we have the following grades from an exam in a particular subject: 18, 20, 14, 10, 18, 15, 16, 10, 19, 13.
The mean in this case is X = 15.3.
The calculated variance is σ² = 12.7.
Finally, the standard deviation is σ = 3.6.
The results are expressed as 15.3 ± 3.6, which gives an interval from 15.3 - 3.6 = 11.7 to 15.3 + 3.6 = 18.9.
Comparing these two examples, although they involve different variables, we can see that the ages of the students in the classroom are less dispersed than the grades obtained in the subject, since the standard deviation is higher for the grades.
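The comparison between the two examples can be sketched in Python as well. The helper `summarize` is hypothetical and simply pairs the mean with the sample standard deviation from the standard `statistics` module, both rounded to two decimals.

```python
from statistics import mean, stdev

def summarize(data):
    """Return (mean, sample standard deviation), rounded to two decimals."""
    return round(mean(data), 2), round(stdev(data), 2)

ages = [12, 13, 12, 12, 11, 13, 13, 13, 12, 12, 11, 11, 12, 13]
grades = [18, 20, 14, 10, 18, 15, 16, 10, 19, 13]

ages_summary = summarize(ages)
grades_summary = summarize(grades)

print(ages_summary)    # (12.14, 0.77)
print(grades_summary)  # (15.3, 3.56)
print(grades_summary[1] > ages_summary[1])  # True: the grades are more dispersed
```

The value 3.56 for the grades is the same result the text reports as 3.6 after rounding to one decimal.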
Conclusion
To conclude, the standard deviation, although precise and mainly linked to statistics, has everyday applications, since it allows us to analyze the data obtained in different investigations. For example, it can be used to analyze survey results, to determine probabilities, or to establish a reference value for estimating the behavior of a process.
In short, this calculation allows us to tell which values in a series of data fall within normal parameters and which do not.
Dr. Samantha Robson ( CRN: 0510146-5) is a nutritionist and website content reviewer related to her area of expertise. With a postgraduate degree in Nutrition from The University of Arizona, she is a specialist in Sports Nutrition from Oxford University and is also a member of the International Society of Sports Nutrition.