{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "# Data I/O\n\nSome musings in the current (Aug 2016) state of reading Excel files. (Not shown: writing Excel files currently only seems possible with package [`Taro.jl`](https://github.com/aviks/Taro.jl).) However, ExcelReaders' function `readxlsheet` works decently (though not impressingly fast) for reading several MB worth of mixed-type (string, numeric, NaN) spreadsheets."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "using DataFrames\nusing ExcelReaders",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "function read_excel(filename::AbstractString)\n xls = openxl(filename)\n \n # declare dictionary with fixed key and value types, because we can (but don't have to)\n data = Dict{AbstractString, DataFrames.DataFrame}(\n \"commodity\" => readxlsheet(DataFrame, xls, \"Commodity\"),\n \"process\" => readxlsheet(DataFrame, xls, \"Process\"),\n \"process_commodity\" => readxlsheet(DataFrame, xls, \"Process-Commodity\"),\n \"transmission\" => readxlsheet(DataFrame, xls, \"Transmission\"),\n \"storage\" => readxlsheet(DataFrame, xls, \"Storage\"),\n \"demand\" => readxlsheet(DataFrame, xls, \"Demand\"),\n \"supim\" => readxlsheet(DataFrame, xls, \"SupIm\"),\n \"price\" => readxlsheet(DataFrame, xls, \"Buy-Sell-Price\"))\n \n return data\nend",
"execution_count": 2,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "read_excel (generic function with 1 method)"
},
"metadata": {},
"execution_count": 2
}
]
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "@time data = read_excel(\"mimo-example.xlsx\")",
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": " 14.792251 seconds (12.92 M allocations: 437.697 MB, 1.91% gc time)\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": "Dict{AbstractString,DataFrames.DataFrame} with 8 entries:\n \"process_commodity\" => 26×5 DataFrames.DataFrame…\n \"price\" => 8761×3 DataFrames.DataFrame…\n \"supim\" => 8761×10 DataFrames.DataFrame…\n \"demand\" => 8761×4 DataFrames.DataFrame…\n \"storage\" => 6×20 DataFrames.DataFrame…\n \"commodity\" => 34×6 DataFrames.DataFrame…\n \"transmission\" => 6×13 DataFrames.DataFrame…\n \"process\" => 25×13 DataFrames.DataFrame…"
},
"metadata": {},
"execution_count": 3
}
]
},
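{
"metadata": {},
"cell_type": "markdown",
"source": "Since the parse above takes on the order of 15 seconds, one possible workaround (only a sketch, not part of the original workflow; the file name `data.jls` is arbitrary) is to cache the parsed result with Julia's built-in `serialize`/`deserialize`. The serialization format is not guaranteed to be stable across Julia versions, so this only makes sense as a local cache next to the spreadsheet."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# sketch only: cache the parsed spreadsheets to skip the slow Excel read next time;\n# the file name \"data.jls\" is arbitrary\nopen(\"data.jls\", \"w\") do f\n    serialize(f, data)\nend\n\n# ... and in a later session, restore the cached dictionary with:\n# data = open(deserialize, \"data.jls\")",
"execution_count": null,
"outputs": []
},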
{
"metadata": {},
"cell_type": "markdown",
"source": "Unpack to individual variables for shorter access further down:"
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "df_commodity = data[\"commodity\"]\ndf_process = data[\"process\"]\ndf_process_com= data[\"process_commodity\"]\ndf_transmission = data[\"transmission\"]\ndf_storage = data[\"storage\"]\ndf_demand = data[\"demand\"]\ndf_supim = data[\"supim\"]\ndf_price = data[\"price\"]\n;",
"execution_count": 4,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Time a representative access to a single element, similar to Pandas' more elegant `df_commodity.loc[('Mid', 'Coal'), 'price']` syntax. Here, the final array access `[1]` is needed to unwrap the single value from its array."
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"scrolled": true
},
"cell_type": "code",
"source": "@time df_commodity[(df_commodity[:Site] .== \"Mid\") & (df_commodity[:Commodity] .== \"Coal\"), :price][1]",
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"text": " 0.000153 seconds (50 allocations: 2.156 KB)\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": "7.0"
},
"metadata": {},
"execution_count": 11
}
]
},
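{
"metadata": {},
"cell_type": "markdown",
"source": "For repeated lookups, the boolean-indexing expression above can be wrapped in a small helper function. This is only a sketch; the name `lookup` and its argument list are not part of the original notebook."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# hypothetical helper (not in the original notebook): look up a single attribute\n# value for a given (site, commodity) pair in a commodity-style DataFrame\nfunction lookup(df, site, commodity, column)\n    rows = (df[:Site] .== site) & (df[:Commodity] .== commodity)\n    return df[rows, column][1]  # unwrap the single value from its array\nend\n\nlookup(df_commodity, \"Mid\", \"Coal\", :price)",
"execution_count": null,
"outputs": []
},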
{
"metadata": {},
"cell_type": "markdown",
"source": "# Model\n\nExperiment with multi-dimensional symbolic index sets for semantic variables, similar to mathematical notation. Note that page [Variables in the JuMP documentation](http://jump.readthedocs.io/en/latest/refvariable.html#constructors) states that arbitrary symbols as sets (like strings) are slower than integer"
},
{
"metadata": {
"collapsed": true,
"trusted": true
},
"cell_type": "code",
"source": "using JuMP",
"execution_count": 12,
"outputs": []
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "m = Model();",
"execution_count": 13,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Sets\n\n### Elementary sets (time, site, commodity, process, storage, transmission)\nElementary sets are simple sequences of identifiers (integers or strings) that enumerate the objects the model will be made of."
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "# only cast to integer if timestep labels actually are integers\ntime = round(Integer, unique(df_demand[:t]))\n\nsit = unique(df_commodity[:Site])\ncom = unique(df_commodity[:Commodity])\ncom_type = unique(df_commodity[:Type])\npro = unique(df_process[:Process])\nsto = unique(df_storage[:Storage])\ntra = unique(df_transmission[:Transmission])\n\nprintln(\"Set 'sit' contains: \", sit)\nprintln(\"Set 'pro' contains: \", pro)",
"execution_count": 52,
"outputs": [
{
"output_type": "stream",
"text": "Set 'sit' contains: UTF8String[\"Mid\",\"South\",\"North\"]\nSet 'pro' contains: UTF8String[\"Hydro plant\",\"Wind park\",\"Photovoltaics\",\"Gas plant\",\"Slack powerplant\",\"Lignite plant\",\"Biomass plant\",\"Coal plant\",\"Feed-in\",\"Purchase\"]\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Example for a simple acceptance test; this technique (`@assert`) could be used much more for thorough sanity/plausibility check before trying to construct the model. This cell will yield nothing if the commodity types in variable `com_type` are containted within the expected commodity types. If not, it will raise an AssertionError."
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "expected_commodity_types = Set([\"SupIm\", \"Demand\", \"Stock\", \"Env\", \"Sell\", \"Buy\"])\n@assert Set(com_type) ⊆ expected_commodity_types",
"execution_count": 48,
"outputs": []
},
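{
"metadata": {},
"cell_type": "markdown",
"source": "A slightly more informative variant of the same check (again only a sketch, not part of the original notebook) names the offending entries when it fails:"
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# sketch: same acceptance test, but with an error message that lists the\n# unexpected commodity types if the assertion fails\nunknown_types = setdiff(Set(com_type), expected_commodity_types)\n@assert isempty(unknown_types) \"unexpected commodity types: $(collect(unknown_types))\"",
"execution_count": null,
"outputs": []
},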
{
"metadata": {},
"cell_type": "markdown",
"source": "### Tuple sets\nTuple sets are combinations of elementary set elements that define connections or value assignments. Examples:\n * 'Electricity' is a 'demand' commodity in site 'Mid'.\n * There is a transmission called 'hvac' connecting sites 'North' and 'Mid'.\n * There is storage of type 'bat' in site 'Mid' that can store 'Electricity'.\n \nThese tuple sets then define the domains for the corresponding constraints. For example, only if there is a storage technology of the right type at a certain site, the corresponding capacity constraints will be generated."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# (site, commodity, type) tuples, e.g. \n# (\"Mid\", \"Elec\", \"Demand\") or (\"South\", \"Solar\", \"SupIm\")\ncom_tuples = [(row[:Site], \n row[:Commodity], \n row[:Type]) \n for row in eachrow(df_commodity)]\n\n# Dictionary of (site, commodity) tuples, accessed by type, e.g. \n# [\"Demand\"] => [(\"North\", Elec), (\"Mid\", Elec\"), (\"South\", \"Elec\")]\ncom_type_tuples = Dict()\nfor ct in com_type\n com_type_tuples[ct] = [(row[:Site], row[:Commodity]) \n for row in eachrow(df_commodity[df_commodity[:Type] .== ct, :])]\nend\n\n# (process, commodity) tuples of output commodities, e.g.\n# (\"Gas turbine\", \"CO2\") or (\"Photovoltaics\", \"Electricity\")\npro_output_tuples = [\n (row[:Process], row[:Commodity]) \n for row in eachrow(df_process_com[df_process_com[:Direction] .== \"Out\", :])]\n\n# (process, commodity) tuples of input commodities, e.g.\n# (\"Gas turbine\", \"Gas\") or (\"Photovoltaics\", \"Solar\")\npro_input_tuples = [\n (row[:Process], row[:Commodity]) \n for row in eachrow(df_process_com[df_process_com[:Direction] .== \"In\", :])]\n\n# (site in, site out, transmission) tuples of transmission processes, e.g.\n# (\"Mid\", \"South\", \"hvac\")\ntra_tuples = [\n (row[symbol(\"Site In\")], row[symbol(\"Site Out\")], row[:Transmission]) \n for row in eachrow(df_transmission)];\n\n# (site, storage, commodity) tuples of storage processes, e.g.\n# (\"Mid\", \"Battery\", \"Electricity\")\nsto_tuples = [\n (row[:Site], row[:Storage], row[:Commodity]) \n for row in eachrow(df_storage)]\n\n# there are some more for the more advanced features (buy-sell, dsm, startup & partial), but\n# the above are sufficient for the urbs model core\n;",
"execution_count": 58,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Variables\n\nFinally we use all these sets to define something tangible, the model variables. Here is a sample for the three main time-dependent process variables (input, output and throughput):"
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "@time begin\n @variable(m, tau_pro[t in time, p in pro], lowerbound=0);\n @variable(m, epsilon_in[t in time, p in pro_input_tuples] >= 0);\n @variable(m, epsilon_out[t in time, p in pro_output_tuples] >= 0);\nend\n;",
"execution_count": 88,
"outputs": [
{
"output_type": "stream",
"text": " 0.651274 seconds (2.53 M allocations: 57.172 MB)\n",
"name": "stdout"
}
]
}
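,
{
"metadata": {},
"cell_type": "markdown",
"source": "To illustrate how the tuple sets become constraint domains, here is a sketch of a process input constraint. It is not part of the original notebook: it assumes that the `Process-Commodity` sheet has a `ratio` column (as in the urbs input format) and that every process appearing there is also listed in the `Process` sheet; the constraint name `def_process_input` and the dictionary `r_in` are made up for this example."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# sketch only: generate one constraint per (time step, (process, commodity)) pair in\n# pro_input_tuples, linking process input to throughput via an input ratio.\n# Assumptions: df_process_com has a :ratio column (urbs input format) and every\n# process in pro_input_tuples also appears in the elementary set pro.\nr_in = Dict([((row[:Process], row[:Commodity]), row[:ratio])\n             for row in eachrow(df_process_com[df_process_com[:Direction] .== \"In\", :])])\n\n@constraint(m, def_process_input[t in time, pc in pro_input_tuples],\n            epsilon_in[t, pc] == r_in[pc] * tau_pro[t, pc[1]]);",
"execution_count": null,
"outputs": []
}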
],
"metadata": {
"language_info": {
"version": "0.4.5",
"mimetype": "application/julia",
"name": "julia",
"file_extension": ".jl"
},
"kernelspec": {
"name": "julia-0.4",
"display_name": "Julia 0.4.5",
"language": "julia"
},
"gist_id": "e6a76d76c2b15f0445204364aa8c1911"
},
"nbformat": 4,
"nbformat_minor": 0
}