{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "# Data I/O\n\nSome musings in the current (Aug 2016) state of reading Excel files. (Not shown: writing Excel files currently only seems possible with package [`Taro.jl`](https://github.com/aviks/Taro.jl).) However, ExcelReaders' function `readxlsheet` works decently (though not impressingly fast) for reading several MB worth of mixed-type (string, numeric, NaN) spreadsheets."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "using DataFrames\nusing ExcelReaders",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "function read_excel(filename::AbstractString)\n xls = openxl(filename)\n \n # declare dictionary with fixed key and value types, because we can (but don't have to)\n data = Dict{AbstractString, DataFrames.DataFrame}(\n \"commodity\" => readxlsheet(DataFrame, xls, \"Commodity\"),\n \"process\" => readxlsheet(DataFrame, xls, \"Process\"),\n \"process_commodity\" => readxlsheet(DataFrame, xls, \"Process-Commodity\"),\n \"transmission\" => readxlsheet(DataFrame, xls, \"Transmission\"),\n \"storage\" => readxlsheet(DataFrame, xls, \"Storage\"),\n \"demand\" => readxlsheet(DataFrame, xls, \"Demand\"),\n \"supim\" => readxlsheet(DataFrame, xls, \"SupIm\"),\n \"price\" => readxlsheet(DataFrame, xls, \"Buy-Sell-Price\"))\n \n return data\nend",
"execution_count": 2,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "read_excel (generic function with 1 method)"
},
"metadata": {},
"execution_count": 2
}
]
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "@time data = read_excel(\"mimo-example.xlsx\")",
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": " 14.792251 seconds (12.92 M allocations: 437.697 MB, 1.91% gc time)\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": "Dict{AbstractString,DataFrames.DataFrame} with 8 entries:\n \"process_commodity\" => 26×5 DataFrames.DataFrame…\n \"price\" => 8761×3 DataFrames.DataFrame…\n \"supim\" => 8761×10 DataFrames.DataFrame…\n \"demand\" => 8761×4 DataFrames.DataFrame…\n \"storage\" => 6×20 DataFrames.DataFrame…\n \"commodity\" => 34×6 DataFrames.DataFrame…\n \"transmission\" => 6×13 DataFrames.DataFrame…\n \"process\" => 25×13 DataFrames.DataFrame…"
},
"metadata": {},
"execution_count": 3
}
]
},
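{
"metadata": {},
"cell_type": "markdown",
"source": "Since the parse above takes on the order of 15 seconds, one possible workaround (only a sketch, not part of the original workflow; the file name `data.jls` is arbitrary) is to cache the parsed result with Julia's built-in `serialize`/`deserialize`. The serialization format is not guaranteed to be stable across Julia versions, so this only makes sense as a local cache next to the spreadsheet."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# sketch only: cache the parsed spreadsheets to skip the slow Excel read next time;\n# the file name \"data.jls\" is arbitrary\nopen(\"data.jls\", \"w\") do f\n    serialize(f, data)\nend\n\n# ... and in a later session, restore the cached dictionary with:\n# data = open(deserialize, \"data.jls\")",
"execution_count": null,
"outputs": []
},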
{
"metadata": {},
"cell_type": "markdown",
"source": "Unpack to individual variables for shorter access further down:"
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "df_commodity = data[\"commodity\"]\ndf_process = data[\"process\"]\ndf_process_com= data[\"process_commodity\"]\ndf_transmission = data[\"transmission\"]\ndf_storage = data[\"storage\"]\ndf_demand = data[\"demand\"]\ndf_supim = data[\"supim\"]\ndf_price = data[\"price\"]\n;",
"execution_count": 4,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Time a representative access to a single element, similar to Pandas' more elegant `df_commodity.loc[('Mid', 'Coal'), 'price']` syntax. Here, the final array access `[1]` is needed to unwrap the single value from its array."
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"scrolled": true
},
"cell_type": "code",
"source": "@time df_commodity[(df_commodity[:Site] .== \"Mid\") & (df_commodity[:Commodity] .== \"Coal\"), :price][1]",
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"text": " 0.000153 seconds (50 allocations: 2.156 KB)\n",
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": "7.0"
},
"metadata": {},
"execution_count": 11
}
]
},
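{
"metadata": {},
"cell_type": "markdown",
"source": "For repeated lookups, the boolean-indexing expression above can be wrapped in a small helper function. This is only a sketch; the name `lookup` and its argument list are not part of the original notebook."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# hypothetical helper (not in the original notebook): look up a single attribute\n# value for a given (site, commodity) pair in a commodity-style DataFrame\nfunction lookup(df, site, commodity, column)\n    rows = (df[:Site] .== site) & (df[:Commodity] .== commodity)\n    return df[rows, column][1]  # unwrap the single value from its array\nend\n\nlookup(df_commodity, \"Mid\", \"Coal\", :price)",
"execution_count": null,
"outputs": []
},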
{
"metadata": {},
"cell_type": "markdown",
"source": "# Model\n\nExperiment with multi-dimensional symbolic index sets for semantic variables, similar to mathematical notation. Note that page [Variables in the JuMP documentation](http://jump.readthedocs.io/en/latest/refvariable.html#constructors) states that arbitrary symbols as sets (like strings) are slower than integer"
},
{
"metadata": {
"collapsed": true,
"trusted": true
},
"cell_type": "code",
"source": "using JuMP",
"execution_count": 12,
"outputs": []
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "m = Model();",
"execution_count": 13,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Sets\n\n### Elementary sets (time, site, commodity, process, storage, transmission)\nElementary sets are simple sequences of identifiers (integers or strings) that enumerate the objects the model will be made of."
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "# only cast to integer if timestep labels actually are integers\ntime = round(Integer, unique(df_demand[:t]))\n\nsit = unique(df_commodity[:Site])\ncom = unique(df_commodity[:Commodity])\ncom_type = unique(df_commodity[:Type])\npro = unique(df_process[:Process])\nsto = unique(df_storage[:Storage])\ntra = unique(df_transmission[:Transmission])\n\nprintln(\"Set 'sit' contains: \", sit)\nprintln(\"Set 'pro' contains: \", pro)",
"execution_count": 52,
"outputs": [
{
"output_type": "stream",
"text": "Set 'sit' contains: UTF8String[\"Mid\",\"South\",\"North\"]\nSet 'pro' contains: UTF8String[\"Hydro plant\",\"Wind park\",\"Photovoltaics\",\"Gas plant\",\"Slack powerplant\",\"Lignite plant\",\"Biomass plant\",\"Coal plant\",\"Feed-in\",\"Purchase\"]\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Example for a simple acceptance test; this technique (`@assert`) could be used much more for thorough sanity/plausibility check before trying to construct the model. This cell will yield nothing if the commodity types in variable `com_type` are containted within the expected commodity types. If not, it will raise an AssertionError."
},
{
"metadata": {
"trusted": true,
"collapsed": true
},
"cell_type": "code",
"source": "expected_commodity_types = Set([\"SupIm\", \"Demand\", \"Stock\", \"Env\", \"Sell\", \"Buy\"])\n@assert Set(com_type) ⊆ expected_commodity_types",
"execution_count": 48,
"outputs": []
},
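{
"metadata": {},
"cell_type": "markdown",
"source": "A slightly more informative variant of the same check (again only a sketch, not part of the original notebook) names the offending entries when it fails:"
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# sketch: same acceptance test, but with an error message that lists the\n# unexpected commodity types if the assertion fails\nunknown_types = setdiff(Set(com_type), expected_commodity_types)\n@assert isempty(unknown_types) \"unexpected commodity types: $(collect(unknown_types))\"",
"execution_count": null,
"outputs": []
},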
{
"metadata": {},
"cell_type": "markdown",
"source": "### Tuple sets\nTuple sets are combinations of elementary set elements that define connections or value assignments. Examples:\n * 'Electricity' is a 'demand' commodity in site 'Mid'.\n * There is a transmission called 'hvac' connecting sites 'North' and 'Mid'.\n * There is storage of type 'bat' in site 'Mid' that can store 'Electricity'.\n \nThese tuple sets then define the domains for the corresponding constraints. For example, only if there is a storage technology of the right type at a certain site, the corresponding capacity constraints will be generated."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# (site, commodity, type) tuples, e.g. \n# (\"Mid\", \"Elec\", \"Demand\") or (\"South\", \"Solar\", \"SupIm\")\ncom_tuples = [(row[:Site], \n row[:Commodity], \n row[:Type]) \n for row in eachrow(df_commodity)]\n\n# Dictionary of (site, commodity) tuples, accessed by type, e.g. \n# [\"Demand\"] => [(\"North\", Elec), (\"Mid\", Elec\"), (\"South\", \"Elec\")]\ncom_type_tuples = Dict()\nfor ct in com_type\n com_type_tuples[ct] = [(row[:Site], row[:Commodity]) \n for row in eachrow(df_commodity[df_commodity[:Type] .== ct, :])]\nend\n\n# (process, commodity) tuples of output commodities, e.g.\n# (\"Gas turbine\", \"CO2\") or (\"Photovoltaics\", \"Electricity\")\npro_output_tuples = [\n (row[:Process], row[:Commodity]) \n for row in eachrow(df_process_com[df_process_com[:Direction] .== \"Out\", :])]\n\n# (process, commodity) tuples of input commodities, e.g.\n# (\"Gas turbine\", \"Gas\") or (\"Photovoltaics\", \"Solar\")\npro_input_tuples = [\n (row[:Process], row[:Commodity]) \n for row in eachrow(df_process_com[df_process_com[:Direction] .== \"In\", :])]\n\n# (site in, site out, transmission) tuples of transmission processes, e.g.\n# (\"Mid\", \"South\", \"hvac\")\ntra_tuples = [\n (row[symbol(\"Site In\")], row[symbol(\"Site Out\")], row[:Transmission]) \n for row in eachrow(df_transmission)];\n\n# (site, storage, commodity) tuples of storage processes, e.g.\n# (\"Mid\", \"Battery\", \"Electricity\")\nsto_tuples = [\n (row[:Site], row[:Storage], row[:Commodity]) \n for row in eachrow(df_storage)]\n\n# there are some more for the more advanced features (buy-sell, dsm, startup & partial), but\n# the above are sufficient for the urbs model core\n;",
"execution_count": 58,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Variables\n\nFinally we use all these sets to define something tangible, the model variables. Here is a sample for the three main time-dependent process variables (input, output and throughput):"
},
{
"metadata": {
"trusted": true,
"collapsed": false
},
"cell_type": "code",
"source": "@time begin\n @variable(m, tau_pro[t in time, p in pro], lowerbound=0);\n @variable(m, epsilon_in[t in time, p in pro_input_tuples] >= 0);\n @variable(m, epsilon_out[t in time, p in pro_output_tuples] >= 0);\nend\n;",
"execution_count": 88,
"outputs": [
{
"output_type": "stream",
"text": " 0.651274 seconds (2.53 M allocations: 57.172 MB)\n",
"name": "stdout"
}
]
}
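,
{
"metadata": {},
"cell_type": "markdown",
"source": "To illustrate how the tuple sets become constraint domains, here is a sketch of a process input constraint. It is not part of the original notebook: it assumes that the `Process-Commodity` sheet has a `ratio` column (as in the urbs input format) and that every process appearing there is also listed in the `Process` sheet; the constraint name `def_process_input` and the dictionary `r_in` are made up for this example."
},
{
"metadata": {
"collapsed": false,
"trusted": true
},
"cell_type": "code",
"source": "# sketch only: generate one constraint per (time step, (process, commodity)) pair in\n# pro_input_tuples, linking process input to throughput via an input ratio.\n# Assumptions: df_process_com has a :ratio column (urbs input format) and every\n# process in pro_input_tuples also appears in the elementary set pro.\nr_in = Dict([((row[:Process], row[:Commodity]), row[:ratio])\n             for row in eachrow(df_process_com[df_process_com[:Direction] .== \"In\", :])])\n\n@constraint(m, def_process_input[t in time, pc in pro_input_tuples],\n            epsilon_in[t, pc] == r_in[pc] * tau_pro[t, pc[1]]);",
"execution_count": null,
"outputs": []
}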
],
"metadata": {
"language_info": {
"version": "0.4.5",
"mimetype": "application/julia",
"name": "julia",
"file_extension": ".jl"
},
"kernelspec": {
"name": "julia-0.4",
"display_name": "Julia 0.4.5",
"language": "julia"
},
"gist_id": "e6a76d76c2b15f0445204364aa8c1911"
},
"nbformat": 4,
"nbformat_minor": 0
}