Last active
October 13, 2016 03:49
Find a good linear relationship between data using the 3 Median Method.
My 11-year-old son is learning about regression in his 9th grade math class.
For them, regression is two tables of data and a calculator button.
The graphing calculators also provide a button to find the best "median-fit" line, and
the students were asked to find it as well as the regression line. The regression line can
easily be found with numpy.polyfit(x, y, 1).
I did not know of a function to calculate the best "median-fit" line. I had to review a few
online videos to learn exactly what a best "median-fit" line is, and found the 3-median method
for determining it. It's sometimes called the median-median fit.
I wrote this implementation to be sure that we understood how the calculator's median-fit function
actually works. Someone else might find it useful.
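For comparison, numpy.polyfit(x, y, 1) returns the least-squares slope and intercept, in that order. The following is a minimal pure-Python sketch of that same calculation, so the regression line can be checked by hand; the function name least_squares_line is made up for this illustration and is not part of numpy.

```python
def least_squares_line(x, y):
    """Least-squares slope and intercept for paired data (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


print(least_squares_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0)
```

Unlike the median-fit line, this line uses means, so a single extreme point can pull it noticeably.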
def median(values):
    # Median of a sequence: the middle value, or the mean of the two middle values
    svalues = sorted(values)
    N = len(values)
    n, r = divmod(N, 2)
    if r == 1:
        return float(svalues[n])
    else:
        return (svalues[n-1]+svalues[n])/2.0


def medfit(x, y):
    # Create 3 groups by position, so the points must already be ordered by x
    N = len(x)
    assert N == len(y)
    n, r = divmod(N, 3)
    # Any leftover points go to the first group, then to the last group
    n1 = n + (1 if r > 0 else 0)
    n2 = n
    n3 = n + (1 if r > 1 else 0)
    assert (n1 + n2 + n3 == N)
    grp1 = x[:n1], y[:n1]
    grp2 = x[n1:n1+n2], y[n1:n1+n2]
    grp3 = x[N-n3:], y[N-n3:]

    # Summarize the three groups with 3 points
    # defined as the median of the x points and the y points
    # in each group.
    pt1 = median(grp1[0]), median(grp1[1])
    pt2 = median(grp2[0]), median(grp2[1])
    pt3 = median(grp3[0]), median(grp3[1])

    # Calculate the slope from the outer summary points
    m = (pt3[1] - pt1[1]) / (pt3[0] - pt1[0])
    # Intercept of that line
    b0 = pt1[1] - m * pt1[0]

    # Correct the intercept using point 2
    # by moving 1/3 of the way towards point 2
    y2 = m*pt2[0] + b0
    b = b0 + (pt2[1]-y2)/3.0

    # The slope and intercept of the best median-fit line to the data
    return m, b
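Because medfit splits the data by position, unsorted input would give meaningless groups. The sketch below is one way to guard against that: it sorts the (x, y) pairs by x first and then follows the same three steps (group, summarize with medians, correct the intercept by one third). The name median_median_line is hypothetical, and statistics.median from the standard library stands in for the hand-written median above; this is an illustration under those assumptions, not the calculator's exact algorithm.

```python
from statistics import median


def median_median_line(x, y):
    """Three-median fit that sorts the points by x before grouping (sketch)."""
    pts = sorted(zip(x, y))  # order the (x, y) pairs by x
    N = len(pts)
    n, r = divmod(N, 3)
    # Same group sizes as medfit: leftovers go to the first, then the last group
    n1 = n + (1 if r > 0 else 0)
    n3 = n + (1 if r > 1 else 0)
    groups = [pts[:n1], pts[n1:N - n3], pts[N - n3:]]
    # One summary point per group: (median of x values, median of y values)
    (x1, y1), (x2, y2), (x3, y3) = [
        (median(p[0] for p in g), median(p[1] for p in g)) for g in groups
    ]
    # Slope from the outer summary points, intercept through the first one
    m = (y3 - y1) / (x3 - x1)
    b0 = y1 - m * x1
    # Shift the line one third of the way towards the middle summary point
    b = b0 + (y2 - (m * x2 + b0)) / 3.0
    return m, b


# Points on y = 2x in shuffled order, with one outlier at x = 9
xs = [9, 1, 5, 3, 7, 2, 8, 4, 6]
ys = [100, 2, 10, 6, 14, 4, 16, 8, 12]
print(median_median_line(xs, ys))  # (2.0, 0.0)
```

In this example the outlier at x = 9 has no effect on the result, because each group contributes only its medians.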