@ocefpaf
Created June 16, 2022 19:23
scoring-gve-distribution
{
"cells": [
{
"metadata": {
"trusted": true
},
"id": "2bec0f14",
"cell_type": "code",
"source": "import pandas as pd\n\n\ndf = pd.read_csv(\"Brazil-06-16_14_09_22.csv\")",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"id": "c2fa68ba",
"cell_type": "code",
"source": "def score_programming(data):\n    \"\"\"https://github.com/oceanhackweek/admin/issues/41#issuecomment-1157692167\"\"\"\n    # Each checked box is on its own line of the answer.\n    nbox = len(data.split(\"\\n\"))\n    score = 0\n    if nbox <= 3:\n        score = 1\n    elif nbox >= 4 and nbox < 6:\n        score = 2\n    elif nbox >= 6:\n        score = 3\n    else:\n        raise ValueError(f\"Could not evaluate score for {nbox}.\")\n    return score\n\n\ncol = \"For the primary programming language you listed above, please check the boxes of the tasks you can execute. Check all that apply.\"\n\nscores = [score_programming(data) for data in df[col]]\n\ndf[\"score programming\"] = scores",
"execution_count": 2,
"outputs": []
},
{
"metadata": {},
"id": "9ef72a79",
"cell_type": "markdown",
"source": "### Oceanographic Subfields"
},
{
"metadata": {
"trusted": true
},
"id": "f7bda223",
"cell_type": "code",
"source": "import re\n\n\nsubfields = [re.split(\"\\n|, \", s.lower()) for s in df[\"In which field(s) does your research interest fall under? Select all that apply.\"]]\nflat_list = [answer for applicant in subfields for answer in applicant]\n\nunique = sorted(set(flat_list))\n\nunique",
"execution_count": 3,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 3,
"data": {
"text/plain": "['biological oceanography',\n 'chemical oceanography',\n 'data science',\n 'geology and geophysics',\n 'meteorology',\n 'ocean engineering',\n 'physical oceanography',\n 'resource management']"
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"id": "c11b1f03",
"cell_type": "code",
"source": "compose = {}\n\nfor k, applicant in enumerate(subfields):\n    # One boolean per known subfield: did this applicant select it?\n    compose[k] = [e in applicant for e in unique]\n\n\nsubfields = pd.DataFrame(compose).T\nsubfields.columns = unique",
"execution_count": 4,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"id": "deae8572",
"cell_type": "code",
"source": "subfields.sum().plot.bar();",
"execution_count": 5,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"metadata": {},
"id": "926aed42",
"cell_type": "markdown",
"source": "### Diversity"
},
{
"metadata": {
"scrolled": false,
"trusted": true
},
"id": "1b81a35d",
"cell_type": "code",
"source": "diversity = \"In terms of ethnic identity, do you consider yourself a minority with respect to your research field?\"\ngender = \"In terms of gender identity, do you consider yourself a minority with respect to your research field?\"\n\ndf[[diversity, gender]].describe()",
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 6,
"data": {
"text/plain": " In terms of ethnic identity, do you consider yourself a minority with respect to your research field? \\\ncount 21 \nunique 2 \ntop No \nfreq 17 \n\n In terms of gender identity, do you consider yourself a minority with respect to your research field? \ncount 21 \nunique 2 \ntop No \nfreq 13 ",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>In terms of ethnic identity, do you consider yourself a minority with respect to your research field?</th>\n <th>In terms of gender identity, do you consider yourself a minority with respect to your research field?</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>count</th>\n <td>21</td>\n <td>21</td>\n </tr>\n <tr>\n <th>unique</th>\n <td>2</td>\n <td>2</td>\n </tr>\n <tr>\n <th>top</th>\n <td>No</td>\n <td>No</td>\n </tr>\n <tr>\n <th>freq</th>\n <td>17</td>\n <td>13</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Judging by the applicants' names, we have 12 female and 9 male applicants, and 13 said they **do not** consider themselves a gender minority in their research field."
},
{
"metadata": {},
"id": "7b65b392",
"cell_type": "markdown",
"source": "### Language"
},
{
"metadata": {
"trusted": true
},
"id": "cf8050a6",
"cell_type": "code",
"source": "col = \"Please rank the programming languages, up to 3, that you are most familiar with.\"\ndf[col].describe()",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 7,
"data": {
"text/plain": "count 22\nunique 21\ntop 2\nfreq 2\nName: Please rank the programming languages, up to 3, that you are most familiar with., dtype: object"
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"id": "12e033a0",
"cell_type": "code",
"source": "ax = df[\"score programming\"].plot.hist(bins=4)\nax.set_xticks([1, 2, 3]);",
"execution_count": 8,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": "<Figure size 432x288 with 1 Axes>",
"image/png": "\n"
},
"metadata": {
"needs_background": "light"
}
}
]
}
],
"metadata": {
"_draft": {
"nbviewer_url": "https://gist.github.com/3100d7b04ca56ff99cc0431fc10ebb3d"
},
"gist": {
"id": "3100d7b04ca56ff99cc0431fc10ebb3d",
"data": {
"description": "scoring-gve-distribution",
"public": true
}
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3 (ipykernel)",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.10.5",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
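A minimal, pandas-free sketch of the two counting steps used in the notebook above: scoring the number of checked boxes and tallying research subfields. The sample answers are made up for illustration and are not taken from the survey CSV. Unlike the notebook, this version ignores blank lines and assumes a missing answer should get the lowest score; both choices are assumptions, not the rule agreed in the linked issue.

```python
import re
from collections import Counter


def score_programming(answer):
    """Map the number of checked boxes (one per line) to a score from 1 to 3."""
    # Assumption: a missing (NaN) or blank answer gets the lowest score.
    if not isinstance(answer, str) or not answer.strip():
        return 1
    # Count non-empty lines only, so a trailing newline does not add a box.
    nbox = len([line for line in answer.split("\n") if line.strip()])
    if nbox <= 3:
        return 1
    if nbox < 6:
        return 2
    return 3


def count_subfields(answers):
    """Count how many applicants selected each subfield (case-insensitive)."""
    counts = Counter()
    for answer in answers:
        # Same separators as the notebook: a newline or a comma plus space.
        parts = re.split(r"\n|, ", answer.lower())
        # A set keeps one applicant from being counted twice for a subfield.
        counts.update({part.strip() for part in parts if part.strip()})
    return counts


# Hypothetical answers, for illustration only.
sample_tasks = [
    "read a file\nmake a plot",
    "a\nb\nc\nd",
    "a\nb\nc\nd\ne\nf",
    float("nan"),
]
sample_scores = [score_programming(answer) for answer in sample_tasks]

sample_fields = [
    "Physical Oceanography, Data Science",
    "data science\nMeteorology",
]
sample_counts = count_subfields(sample_fields)
```

With real data, the same functions could be applied to the corresponding `df` columns; handling missing answers explicitly avoids the `AttributeError` that `data.split` would raise on a NaN cell.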