python - How to efficiently calculate a running standard deviation?
I have an array of lists of numbers, such as:
[0] (0.01, 0.02, 0.02, 0.03, 0.02)
...
[n] (0.01, 0.00, 0.01, 0.05, 0.03)
What I would like to do is efficiently calculate the mean and standard deviation at each index of a list, across all array elements.
To do the mean, I have been looping through the array and summing the value at a given index of each list. At the end, I divide each value in my "averages list" by n.
To do the standard deviation, I loop through again, now that I have the mean calculated.

I would like to avoid going through the array twice: once for the mean, and then once for the standard deviation (after I have the mean).

Is there an efficient method for calculating both values while only going through the array once? Any code in an interpreted language (e.g., Perl or Python), or pseudocode, is fine.
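For reference, a sketch of the two-pass approach I am currently using (the data and variable names here are illustrative):

    import math

    data = [(0.01, 0.02, 0.02, 0.03, 0.02),
            (0.01, 0.00, 0.01, 0.05, 0.03)]
    n = len(data)
    length = len(data[0])

    # Pass 1: mean at each index.
    means = [0.0] * length
    for row in data:
        for i, x in enumerate(row):
            means[i] += x
    means = [m / n for m in means]

    # Pass 2: population standard deviation at each index.
    std_devs = [0.0] * length
    for row in data:
        for i, x in enumerate(row):
            std_devs[i] += (x - means[i]) ** 2
    std_devs = [math.sqrt(s / n) for s in std_devs]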
The answer is to use Welford's algorithm, which is defined very clearly after the "naive methods" in:
- Wikipedia: Algorithms for calculating variance
It's more numerically stable than either the two-pass method or the online simple sum-of-squares collectors suggested in other responses. The stability only really matters when you have lots of values that are close to each other, because they lead to what is known as "catastrophic cancellation" in the floating-point literature.
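Here is a minimal Python sketch of Welford's algorithm; the class name RunningStat and its method names are illustrative, not taken from any particular library:

    class RunningStat:
        """Single-pass (online) mean and variance via Welford's algorithm."""

        def __init__(self):
            self.n = 0        # number of values seen so far
            self.mean = 0.0   # running mean
            self.m2 = 0.0     # running sum of squared deviations from the mean

        def push(self, x):
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            # Uses the *updated* mean, so delta * (x - self.mean)
            # accumulates squared deviations in a numerically stable way.
            self.m2 += delta * (x - self.mean)

        def variance(self):
            # Population variance: divide by n.
            return self.m2 / self.n if self.n > 0 else 0.0

        def sample_variance(self):
            # Unbiased sample variance: divide by n - 1.
            return self.m2 / (self.n - 1) if self.n > 1 else 0.0

        def std_dev(self):
            return self.variance() ** 0.5

For the per-index case in the question, you can keep one accumulator per index and update all of them in a single pass over the array:

    data = [(0.01, 0.02, 0.02, 0.03, 0.02),
            (0.01, 0.00, 0.01, 0.05, 0.03)]
    stats = [RunningStat() for _ in range(len(data[0]))]
    for row in data:
        for stat, x in zip(stats, row):
            stat.push(x)
    print([s.mean for s in stats])       # mean at each index
    print([s.std_dev() for s in stats])  # population std dev at each index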
You might also want to brush up on the difference between dividing by the number of samples (N) and by N-1 in the variance calculation (squared deviation). Dividing by N-1 leads to an unbiased estimate of variance from the sample, whereas dividing by N on average underestimates variance (because it doesn't take into account the variance between the sample mean and the true mean).
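For illustration, Python's standard statistics module exposes both conventions:

    import statistics

    xs = [0.01, 0.02, 0.02, 0.03, 0.02]
    statistics.pvariance(xs)  # population variance: divides by N
    statistics.variance(xs)   # sample variance: divides by N - 1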
I wrote two blog entries on the topic which go into more detail, including how to delete previous values online:
You can also take a look at my Java implementation; the javadoc, source, and unit tests are all online: