Last active
September 15, 2023 02:48
-
-
Save shane5ul/6289436ace018e6e9e79ee3489be2b8f to your computer and use it in GitHub Desktop.
This program computes the block average of a potentially correlated timeseries "x", and provides error bounds for the estimated mean <x>. As input provide a vector or timeseries "x", and the largest block size.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def blockAverage(datastream, isplot=True, maxBlockSize=0): | |
"""This program computes the block average of a potentially correlated timeseries "x", and | |
provides error bounds for the estimated mean <x>. | |
As input provide a vector or timeseries "x", and the largest block size. | |
Check out writeup in the following blog posts for more: | |
http://sachinashanbhag.blogspot.com/2013/08/block-averaging-estimating-uncertainty_14.html | |
http://sachinashanbhag.blogspot.com/2013/08/block-averaging-estimating-uncertainty.html | |
""" | |
Nobs = len(datastream) # total number of observations in datastream | |
minBlockSize = 1; # min: 1 observation/block | |
if maxBlockSize == 0: | |
maxBlockSize = int(Nobs/4); # max: 4 blocs (otherwise can't calc variance) | |
NumBlocks = maxBlockSize - minBlockSize # total number of block sizes | |
blockMean = np.zeros(NumBlocks) # mean (expect to be "nearly" constant) | |
blockVar = np.zeros(NumBlocks) # variance associated with each blockSize | |
blockCtr = 0 | |
# | |
# blockSize is # observations/block | |
# run them through all the possibilities | |
# | |
for blockSize in range(minBlockSize, maxBlockSize): | |
Nblock = int(Nobs/blockSize) # total number of such blocks in datastream | |
obsProp = np.zeros(Nblock) # container for parcelling block | |
# Loop to chop datastream into blocks | |
# and take average | |
for i in range(1,Nblock+1): | |
ibeg = (i-1) * blockSize | |
iend = ibeg + blockSize | |
obsProp[i-1] = np.mean(datastream[ibeg:iend]) | |
blockMean[blockCtr] = np.mean(obsProp) | |
blockVar[blockCtr] = np.var(obsProp)/(Nblock - 1) # later, Std. Err. Mean | |
blockCtr += 1 | |
v = np.arange(minBlockSize,maxBlockSize) | |
if isplot: | |
plt.subplot(2,1,1) | |
plt.plot(v, np.sqrt(blockVar),'ro-',lw=2) | |
plt.xlabel('block size') | |
plt.ylabel('sem') | |
plt.subplot(2,1,2) | |
plt.errorbar(v, blockMean, np.sqrt(blockVar)) | |
plt.ylabel('<x>') | |
plt.xlabel('block size') | |
print("<x> = {0:f} +/- {1:f}\n".format(blockMean[-1], np.sqrt(blockVar[-1]))) | |
plt.tight_layout() | |
plt.show() | |
return v, blockVar, blockMean |
Thanks for the comment.
Hello, thank you for this code!
I think the following line in line 43 is not correct:
blockVar[blockCtr] = np.var(obsProp)/(Nblock - 1)
see https://numpy.org/doc/stable/reference/generated/numpy.var.html
the np.var already divides by the number of samples N=Nblocks, but when we want to use -1 , specify ddof=1
So the line becomes:
blockVar[blockCtr] = np.var(obsProp, ddof=1)
@hjjonas is right about the variance. However, what you are calculating here is the standard error of the mean (SEM) which has to be divided by N (by sqrt(N) in fact, as it is later done), What is misleading is that in the upper plot the y label is std
and it should be SEM
.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hey thanks a lot for sharing this
I am getting a syntax error with line 60 because I am using python 3.6
Easy solve by putting things into brackets
print (" = {0:f} +/- {1:f}\n".format(blockMean[-1], np.sqrt(blockVar[-1])))