joschu · February 22, 2017 01:16 · May 3, 2016 · May 3, 2016
diff --git a/1-cem-v1-writeup.md b/1-cem-v1-writeup.md
@@ -12,7 +12,7 @@ Note that the same exact parameters were used for all tasks.
 The important parameters are:
 
 - ``hid_sizes=10,5``: hidden layer sizes of MLP
-- ``extra_std=0.01``: noise added to variance, see [1]
+- ``extra_std=0.001``: noise added to variance, see [1]
 - ``batch_size=200``: number of episodes per batch
 - ``seed=0`` random seed.
 

diff --git a/1-cem-v1-writeup.md b/1-cem-v1-writeup.md
@@ -0,0 +1,32 @@
+This is a tiny update to https://gist.github.com/joschu/a21ed1259d3f8c7bdff178fb47bc6fc1#file-1-cem-v0-writeup-md
+
+- I ran experiments on the v1 mujoco environments
+- I reduced the added noise `extra_std` parameter from `0.01` to `0.001`
+
+I used the cross-entropy method (an evolutionary algorithm / derivative-free optimization method) to optimize small two-layer neural networks.
+
+Code used to obtain these results can be found at the url
+https://github.com/joschu/modular_rl, commit ba42955b41d7f419470a95d875af1ab7e7ee66fc.
+The command line expression used for all the environments can be found in the text file below.
+Note that the same exact parameters were used for all tasks.
+The important parameters are:
+
+- ``hid_sizes=10,5``: hidden layer sizes of MLP
+- ``extra_std=0.01``: noise added to variance, see [1]
+- ``batch_size=200``: number of episodes per batch
+- ``seed=0`` random seed.
+
+The program is single-threaded and deterministic. I used ``float32`` precision, with ``THEANO_FLAGS=floatX=float32``.
+
+The following commands will let you conveniently run all of the experiments at once.
+
+1. Find a computer with many CPUs.
+2. If it's a headless computer, ``sudo apt-get install xvfb``. Then run ``xvfb-run -s "-screen 0 1400x900x24" /bin/bash`` to enter a shell where all your commands have access to a virtual display provided by xvfb.
+3. Navigate into the ``modular-rl`` directory.
+4. ``export THEANO_FLAGS=floatX=float32; export outdir=/YOUR/PATH/HERE; export NUM_CPUS=YOUR_NUMBER_OF_CPUS``
+5. Move `2-cem-scripts.txt` into the `modular-rl` directory.
+6. Run all experiments with the following command: ``cat 2-cem-scripts.txt | xargs -n 1 -P $NUM_CPUS bash -c``.
+
+You can also set `--video=0` in these scripts to disable video recording. If video is disabled, you won't need the xvfb commands.
+
+[1] Szita, István, and András Lörincz. "Learning Tetris using the noisy cross-entropy method." Neural computation 18.12 (2006): 2936-2941.
diff --git a/2-cem-scripts.txt b/2-cem-scripts.txt
@@ -0,0 +1,9 @@
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=Walker2d-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-walker"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=Swimmer-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-swimmer"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=Hopper-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-hopper"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=Ant-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-ant"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=InvertedPendulum-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-ip"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=InvertedDoublePendulum-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-idp"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=Reacher-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-reacher"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=HalfCheetah-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-hc"
+"python run_cem.py --n_iter=250 --batch_size=200 --agent=modular_rl.agentzoo.DeterministicAgent --hid_sizes=10,5 --env=Humanoid-v1 --extra_std=0.001 --seed=0 --outfile=$outdir/cem10-5-humanoid"
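
For readers unfamiliar with the `extra_std` parameter, here is a minimal sketch of the noisy cross-entropy method from [1]. It is not the code in `modular_rl`. The function name `noisy_cem`, the `elite_frac` argument, and the choice to add `extra_std**2` to the elite variance as a constant are assumptions made for illustration. The actual `run_cem.py` may apply the noise on a different schedule.

```python
import numpy as np


def noisy_cem(f, init_mean, init_std, n_iter=250, batch_size=200,
              elite_frac=0.2, extra_std=0.001, seed=0):
    """Maximize f(theta) with a diagonal-Gaussian cross-entropy method.

    Hypothetical helper for illustration only; not the modular_rl API.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(init_mean, dtype=np.float64).copy()
    std = np.asarray(init_std, dtype=np.float64).copy()
    n_elite = max(1, int(round(batch_size * elite_frac)))

    for _ in range(n_iter):
        # Sample a batch of parameter vectors from the current Gaussian.
        samples = mean + std * rng.standard_normal((batch_size, mean.size))
        scores = np.array([f(theta) for theta in samples])

        # Keep the highest-scoring fraction of the batch.
        elite = samples[np.argsort(scores)[-n_elite:]]

        # Refit the Gaussian to the elite set. The extra variance term
        # (assumed constant here) keeps the distribution from collapsing.
        mean = elite.mean(axis=0)
        std = np.sqrt(elite.var(axis=0) + extra_std ** 2)

    return mean
```

In the experiments above, `f` would correspond to the episode return of an MLP policy whose flattened weights are `theta`. A smaller `extra_std` lets the search distribution narrow further, at the cost of less exploration late in training.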