Created
March 11, 2015 01:54
-
-
Save rawkintrevo/31cb13fc017a723ccf33 to your computer and use it in GitHub Desktop.
Single Variate Hierarchical- PyMC
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" | |
Bayesian Statistics and Marketing: Rossi, Allenby, McCullough | |
Section 3.7 | |
This is a remarkably poorly defined model. Almost better off going | |
straight for the deep wizardry on his code than dicking with the | |
text. (At least for model definition, because the code is crap too.) | |
D: | |
The 'cheese' data set has 5555 observations of 88 retailers. | |
For each observation there are VOLUME, DISP, PRICE, this is | |
all weekly and (theoretically) there are 65 weeks for each | |
store. | |
VOLUME: Presumable units of volume | |
DISP: The percentage of units on display | |
PRICE: The unit price | |
Model: | |
Y = ln(VOLUME) | |
X = ln(PRICE) and DISPLAY | |
Given X, Y | |
Y[i] = X_i * beta_i + err_i | |
err_i ~ N( 0, sigma2_i ) | |
sigma2_i = the std_dev of the ith group | |
I_n_i ~ Uniform(0, 15) (intercept?) | |
beta_i ~ delta[i]*X[i] + v_i | |
v_i ~ N( 0, Var( group ) ) | |
delta_i ~ Normal(delta_bar, delta_var) | |
delta_var ~ Uniform(0, 15) <- NO LOGIC TO THESE PRIORS RIGHT NOW | |
delta_bar ~ Uniform(0, 15) | |
In other words: You have a big regression, but each group has its own | |
unique beta estimate (variance is V). | |
""" | |
import pymc as pm | |
import numpy as np | |
import matplotlib.pyplot as plt | |
from pprint import pprint | |
import pandas as pd | |
import rpy2.robjects as robjects | |
r= robjects.r | |
#r("install.packages('bayesm')") | |
import pandas.rpy.common as com | |
cheese = com.load_data("cheese", package='bayesm') | |
X_data = np.log(cheese['PRICE']) | |
Y_data = np.log(cheese['VOLUME']) | |
stores = list(cheese['RETAILER'].unique()) | |
N = len(cheese['RETAILER']) | |
fixed_fx = pm.Normal('fixed_fx', mu=0, tau= X_data.var() ) | |
X = pm.Normal('X', mu= X_data.mean(), tau= X_data.var(), value=X_data.values, observed=True ) | |
random_fx = np.empty(N, dtype=object) | |
local_err = np.empty(N, dtype=object) | |
for store in stores: | |
index = (cheese['RETAILER']==store).values | |
local_var= X_data[index].var() | |
random_fx[index]= pm.Normal('random_fx_%s' % store, mu= 0, tau= local_var) | |
local_err[index]= pm.Uniform('local_err_%s' % store, lower= 0, upper=500, value= local_var) | |
@pm.deterministic | |
def beta(fixed_fx= fixed_fx, random_fx= random_fx): | |
return fixed_fx + random_fx | |
@pm.deterministic | |
def y_hat(beta= beta, X= X, local_err= local_err): | |
return beta * X + local_err | |
err = pm.Uniform('err', lower=0, upper=500) | |
Y = pm.Normal('Y', mu=y_hat, tau=err, value=Y_data.values, observed=True) # | |
model = pm.MCMC([Y, beta, local_err, random_fx, X, fixed_fx, err]) | |
model.sample(100) | |
binwidth=.005 | |
plot_data= model.trace('local_err_ROANOKE (NEW) - KROGER CO')[:] | |
plot_data= model.trace('random_fx_ROANOKE (NEW) - KROGER CO')[:] | |
plt.hist(plot_data, color='c', bins=np.arange(min(plot_data), max(plot_data) + binwidth, binwidth)) | |
plt.show() |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment