| xgb_best = H2OXGBoostEstimator(**params_best, | |
| monotone_constraints=mono_constraints) | |
| xgb_best.train(x=features, y=target, training_frame=training_frame, | |
| validation_frame=validation_frame) | |
| corr = pd.DataFrame(train[features + | |
| [target]].corr(method='spearman')[target]).iloc[:-1] | |
| corr.columns = ['Spearman Correlation Coefficient'] | |
| values = [int(i) for i in np.sign(corr.values.ravel())] | |
| mono_constraints = dict(zip(corr.index, values)) | |
| mono_constraints |
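The snippet above derives one monotonicity direction per feature from the sign of its Spearman correlation with the target. A minimal, self-contained sketch of that idea follows; the helper name `spearman_sign_constraints` and the toy columns (`rooms`, `distance`, `price`) are assumptions made for illustration and are not part of the original gist.

```python
import numpy as np
import pandas as pd


def spearman_sign_constraints(frame, features, target):
    """Map each feature to -1, 0 or +1 using the sign of its
    Spearman correlation with the target column."""
    corr = frame[features + [target]].corr(method="spearman")[target]
    # Drop the target's correlation with itself by label, not by position.
    corr = corr.drop(target)
    # A NaN correlation (for example a constant column) becomes 0,
    # which is treated here as "no constraint".
    signs = np.sign(corr.fillna(0.0)).astype(int)
    return signs.to_dict()


# Hypothetical toy data: price rises with rooms and falls with distance.
toy = pd.DataFrame({
    "rooms": [1, 2, 3, 4, 5],
    "distance": [9, 7, 6, 3, 1],
    "price": [100, 150, 210, 300, 420],
})

constraints = spearman_sign_constraints(toy, ["rooms", "distance"], "price")
print(constraints)
```

Dropping the target by label avoids relying on the target being the last column, which the positional `.iloc[:-1]` in the original assumes.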
| iter_ = 0 | |
| best_error = 0 | |
| best_iter = 0 | |
| best_model = None | |
| col_sample_rates = [0.1, 0.5, 0.9] | |
| subsamples = [0.1, 0.5, 0.9] | |
| etas = [0.01, 0.001] | |
| max_depths = [3, 6, 12, 15, 18] | |
| reg_alphas = [0.01, 0.001] |
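The lists above define a hyperparameter grid, but the loop that walks it is not shown in this snippet. One way such a search could be written is sketched below. The `evaluate` function is a hypothetical stand-in for training a model and returning a validation error, and the dictionary keys are illustrative names. The sketch also assumes that a lower error is better, which is why `best_error` starts at infinity here rather than at 0.

```python
from itertools import product

col_sample_rates = [0.1, 0.5, 0.9]
subsamples = [0.1, 0.5, 0.9]
etas = [0.01, 0.001]
max_depths = [3, 6, 12, 15, 18]
reg_alphas = [0.01, 0.001]


def evaluate(params):
    """Hypothetical scoring function (an assumption for this sketch).
    A real search would train a model and return its validation error."""
    return abs(params["eta"] - 0.01) + abs(params["max_depth"] - 6) / 100


names = ["col_sample_rate", "subsample", "eta", "max_depth", "reg_alpha"]
grid = [
    dict(zip(names, combo))
    for combo in product(col_sample_rates, subsamples, etas,
                         max_depths, reg_alphas)
]

best_error = float("inf")  # lower is better, so start from infinity
best_iter = None
best_params = None

for iter_, params in enumerate(grid):
    error = evaluate(params)
    # Strict comparison keeps the first combination that reaches the minimum.
    if error < best_error:
        best_error, best_iter, best_params = error, iter_, params

print(len(grid), best_iter, best_params)
```

If the tracked metric is one where higher is better (AUC, for instance), the comparison and the initial value would need to be flipped.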
| labels_dict = {0:'sadness', 1:'joy', 2:'love', 3:'anger', 4:'fear', 5:'surprise'} | |
| train['description'] = train['label'].map(labels_dict) | |
| train.head() |
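`Series.map` with a dictionary silently yields `NaN` for any label missing from the dictionary, so a quick check after the mapping can be useful. The sketch below uses a small made-up `train` frame (an assumption, not the gist's data) to show the mapping and the check.

```python
import pandas as pd

labels_dict = {0: 'sadness', 1: 'joy', 2: 'love',
               3: 'anger', 4: 'fear', 5: 'surprise'}

# Hypothetical stand-in for the real training frame.
train = pd.DataFrame({'label': [0, 3, 5, 1]})

# Translate numeric labels into readable emotion names.
train['description'] = train['label'].map(labels_dict)

# Labels absent from labels_dict would show up as NaN here.
unmapped = int(train['description'].isna().sum())
print(train)
print("unmapped labels:", unmapped)
```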
| import numpy as np | |
| import pandas as pd | |
| import string | |
| import matplotlib | |
| import matplotlib.pyplot as plt | |
| import seaborn as sns | |
| sns.set(style='whitegrid', palette='muted', font_scale=1.2) | |
| colors = ["#01BEFE", "#FFDD00", "#FF7D00", "#FF006D", "#ADFF02", "#8F00FF"] | |
| sns.set_palette(sns.color_palette(colors)) |
| !wget https://www.dropbox.com/s/607ptdakxuh5i4s/merged_training.pkl | |
| # Defining a helper function to load the data | |
| import pickle | |
| def load_from_pickle(directory): | |
|     # Open the file in a context manager so the handle is always closed | |
|     with open(directory, "rb") as f: | |
|         return pickle.load(f) | |
| # Loading the data | |
| data = load_from_pickle(directory="merged_training.pkl") |
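To try the loader without downloading anything, the round trip below writes a small object to a temporary file and reads it back. The sample dictionary and its keys (`text`, `emotions`) are assumptions for illustration and do not describe the contents of `merged_training.pkl`. Note that `pickle.load` can execute arbitrary code, so it should only be used on files from a trusted source.

```python
import pickle
import tempfile
from pathlib import Path


def load_from_pickle(path):
    """Load a pickled object, closing the file handle afterwards."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.pkl"
    # Hypothetical sample payload, not the real dataset.
    sample = {"text": ["i feel fine"], "emotions": ["joy"]}

    with open(sample_path, "wb") as handle:
        pickle.dump(sample, handle)

    restored = load_from_pickle(sample_path)

print(restored)
```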
| # Renaming the column | |
| se.rename(columns = {'Yearly bonus + stocks in EUR': 'Salary with Stocks'}, inplace=True) | |
| # Creating a dataframe salary_exp2 containing salary with stocks and bonuses | |
| se['Salary with Stocks'] = se['Salary with Stocks'].astype(int) | |
| salary_exp2 = se.groupby(['Total years of experience'])['Salary with Stocks'].median().to_frame().reset_index() | |
| salary_exp2[['Total years of experience','Salary with Stocks']] = salary_exp2[['Total years of experience','Salary with Stocks']].astype(int) | |
| salary_exp2.sort_values('Total years of experience',inplace=True) | |
| # Drawing a Radar Chart |
| chart = ctc.Scatter("Median compensation by years of experience") | |
| chart.set_options( | |
| x_label="Experience in Years", | |
| y_label="Salary in EUR", | |
| x_tick_count=4, | |
| y_tick_count=3, | |
| dot_size=1, | |
| colors=['#47B39C']) | |
| chart.add_series("Salary", list(zip(salary_exp['Total years of experience'], salary_exp['Salary']))) |
| chart = ctc.Line("Median compensation by years of experience") | |
| chart.set_options( | |
| labels=list(salary_exp['Total years of experience']), | |
| x_label="Experience in Years", | |
| y_label="Salary in EUR", | |
| colors=['#EA5F89']) | |
| chart.add_series("Salary", list(salary_exp['Salary'])) | |
| # Call load_javascript() when rendering a chart for the first time. | |
| chart.load_javascript() |
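| # Filtering salary and experience details for Software Engineers only | |
| # .copy() makes se an independent frame, so the in-place rename does not trigger SettingWithCopyWarning | |
| se = df[df['Position '] == 'Software Engineer'].copy() | |
| se.rename(columns = {'Yearly brutto salary (without bonus and stocks) in EUR': 'Salary'}, inplace=True) | |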
| salary_exp = se.groupby(['Total years of experience'])['Salary'].median().to_frame().reset_index() | |
| salary_exp[['Total years of experience','Salary']] = salary_exp[['Total years of experience','Salary']].astype(int) | |
| salary_exp.sort_values('Total years of experience',inplace=True) | |
| salary_exp[:5] |
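The aggregation above (median salary per year of experience, cast to integers and sorted) can be checked on a tiny frame. The numbers below are made up for illustration and are not taken from the survey data; the chained form is one possible way to write the same steps.

```python
import pandas as pd

# Hypothetical sample rows, not the real survey.
df = pd.DataFrame({
    "Total years of experience": [1, 1, 3, 3, 3, 2],
    "Salary": [50000, 54000, 70000, 65000, 72000, 60000],
})

median_by_exp = (
    df.groupby("Total years of experience", as_index=False)["Salary"]
      .median()                                  # one median per experience level
      .sort_values("Total years of experience")  # ascending experience
      .astype(int)                               # whole-number years and salaries
)

print(median_by_exp)
```

Casting with `astype(int)` raises an error if any value is missing, so a real dataset may need `dropna()` on the relevant columns first.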