Flatland

RLlib Baselines on Colab!

We have taken the repo from https://gitlab.aicrowd.com/flatland/neurips2020-flatland-baselines and turned it into a simple Colab notebook.

All training scripts are included, so you can modify the configs and launch runs of your own. The notebook also runs evaluation and provides the script that calculates scores on an independent test set.

🚂 NeurIPS 2020 Flatland Challenge - PPO RLlib Baseline

This Colab notebook allows you to train a full Flatland agent using the provided PPO baseline.

Read the documentation to learn how to make your first submission in 10 minutes: https://flatland.aicrowd.com/getting-started/first-submission.html

🔗 Flatland documentation
🔗 NeurIPS 2020 Challenge

📦 Setup

In [ ]:

!pip install tensorboard

In [ ]:

%%bash
## Setting up Conda Environment
MINICONDA_INSTALLER_SCRIPT=Miniconda3-4.5.4-Linux-x86_64.sh
MINICONDA_PREFIX=/usr/local
wget https://repo.continuum.io/miniconda/$MINICONDA_INSTALLER_SCRIPT
chmod +x $MINICONDA_INSTALLER_SCRIPT
./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX

In [ ]:

%%bash
# Install all packages for training
git clone https://gitlab.aicrowd.com/flatland/neurips2020-flatland-baselines.git

Cloning into 'neurips2020-flatland-baselines'...

In [ ]:

%cd neurips2020-flatland-baselines

/content/neurips2020-flatland-baselines/neurips2020-flatland-baselines

In [ ]:

%%bash

conda env create -n flatland-paper -f environment-cpu.yml

In [ ]:

%%bash
source activate flatland-paper
conda install -y ipykernel

Solving environment: ...working... done

# All requested packages already installed.

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.4
  latest version: 4.8.5

Please update conda by running

    $ conda update -n base conda

🚂 Training

In [ ]:

%%bash
source activate flatland-paper
python train.py --help

-    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
-    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
-    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
-    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
-    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
-    Successfully Loaded Generator Config small_v0 from small_v0.yaml
-    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
-    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
-    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
-    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
-    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
-    Successfully Loaded Evaluation Config test_render from test_render.yaml
-    Successfully Loaded Evaluation Config default from default.yaml
-    Successfully Loaded Evaluation Config default_render from default_render.yaml
-    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
-    Successfully Loaded Observation class TreeObs from tree_obs.py
-    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
-    Successfully Loaded Observation class Utils from utils.py
-    Successfully Loaded Observation class CombinedObs from combined_obs.py
-    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
-    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
-    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
-    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
-    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
-    Successfully Loaded Observation class GlobalObs from global_obs.py
-    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
-    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
-    Successfully Loaded Environment class FlatlandRandomSparseSmall from flatland_random_sparse_small.py
-    Successfully Loaded Environment class FlatlandBase from flatland_base.py
-    Successfully Loaded Environment class FlatlandSingle from flatland_single.py
-    Successfully Loaded Environment class FlatlandSparse from flatland_sparse.py
-    Successfully Loaded Model class CustomLossModel from custom_loss_model.py
-    Successfully Loaded Model class GlobalDensObsModel from global_dens_obs_model.py
-    Successfully Loaded Model class CcTransformer from cc_transformer.py
-    Successfully Loaded Model class CcConcatenate from cc_concatenate.py
-    Successfully Loaded Model class FullyConnectedModel from fully_connected_model.py
-    Successfully Loaded Model class GlobalObsModel from global_obs_model.py
usage: train.py [-h] [--run RUN] [--stop STOP] [--config CONFIG]
                [--resources-per-trial RESOURCES_PER_TRIAL]
                [--num-samples NUM_SAMPLES]
                [--checkpoint-freq CHECKPOINT_FREQ] [--checkpoint-at-end]
                [--no-sync-on-checkpoint]
                [--keep-checkpoints-num KEEP_CHECKPOINTS_NUM]
                [--checkpoint-score-attr CHECKPOINT_SCORE_ATTR]
                [--export-formats EXPORT_FORMATS]
                [--max-failures MAX_FAILURES] [--scheduler SCHEDULER]
                [--scheduler-config SCHEDULER_CONFIG] [--restore RESTORE]
                [--ray-address RAY_ADDRESS] [--ray-num-cpus RAY_NUM_CPUS]
                [--ray-num-gpus RAY_NUM_GPUS] [--ray-num-nodes RAY_NUM_NODES]
                [--ray-redis-max-memory RAY_REDIS_MAX_MEMORY]
                [--ray-memory RAY_MEMORY]
                [--ray-object-store-memory RAY_OBJECT_STORE_MEMORY]
                [--experiment-name EXPERIMENT_NAME] [--local-dir LOCAL_DIR]
                [--upload-dir UPLOAD_DIR] [-v] [-vv] [--resume] [--torch]
                [--eager] [--trace] [--log-flatland-stats] [-e] [-i] [-r] [-s]
                [--bind-all] [--env ENV] [--queue-trials] [-f CONFIG_FILE]

Train a reinforcement learning agent.

optional arguments:
  -h, --help            show this help message and exit
  --run RUN             The algorithm or model to train. This may refer to the
                        name of a built-on algorithm (e.g. RLLib's DQN or
                        PPO), or a user-defined trainable function or class
                        registered in the tune registry.
  --stop STOP           The stopping criteria, specified in JSON. The keys may
                        be any field returned by 'train()' e.g.
                        '{"time_total_s": 600, "training_iteration": 100000}'
                        to stop after 600 seconds or 100k iterations,
                        whichever is reached first.
  --config CONFIG       Algorithm-specific configuration (e.g. env,
                        hyperparams), specified in JSON.
  --resources-per-trial RESOURCES_PER_TRIAL
                        Override the machine resources to allocate per trial,
                        e.g. '{"cpu": 64, "gpu": 8}'. Note that GPUs will not
                        be assigned unless you specify them here. For RLlib,
                        you probably want to leave this alone and use RLlib
                        configs to control parallelism.
  --num-samples NUM_SAMPLES
                        Number of times to repeat each trial.
  --checkpoint-freq CHECKPOINT_FREQ
                        How many training iterations between checkpoints. A
                        value of 0 (default) disables checkpointing.
  --checkpoint-at-end   Whether to checkpoint at the end of the experiment.
                        Default is False.
  --no-sync-on-checkpoint
                        Disable sync-down of trial checkpoint, which is
                        enabled by default to guarantee recoverability. If
                        set, checkpoint syncing from worker to driver is
                        asynchronous. Set this only if synchronous
                        checkpointing is too slow and trial restoration
                        failures can be tolerated
  --keep-checkpoints-num KEEP_CHECKPOINTS_NUM
                        Number of best checkpoints to keep. Others get
                        deleted. Default (None) keeps all checkpoints.
  --checkpoint-score-attr CHECKPOINT_SCORE_ATTR
                        Specifies by which attribute to rank the best
                        checkpoint. Default is increasing order. If attribute
                        starts with min- it will rank attribute in decreasing
                        order. Example: min-validation_loss
  --export-formats EXPORT_FORMATS
                        List of formats that exported at the end of the
                        experiment. Default is None. For RLlib, 'checkpoint'
                        and 'model' are supported for TensorFlow policy
                        graphs.
  --max-failures MAX_FAILURES
                        Try to recover a trial from its last checkpoint at
                        least this many times. Only applies if checkpointing
                        is enabled.
  --scheduler SCHEDULER
                        FIFO (default), MedianStopping, AsyncHyperBand,
                        HyperBand, or HyperOpt.
  --scheduler-config SCHEDULER_CONFIG
                        Config options to pass to the scheduler.
  --restore RESTORE     If specified, restore from this checkpoint.
  --ray-address RAY_ADDRESS
                        Connect to an existing Ray cluster at this address
                        instead of starting a new one.
  --ray-num-cpus RAY_NUM_CPUS
                        --num-cpus to use if starting a new cluster.
  --ray-num-gpus RAY_NUM_GPUS
                        --num-gpus to use if starting a new cluster.
  --ray-num-nodes RAY_NUM_NODES
                        Emulate multiple cluster nodes for debugging.
  --ray-redis-max-memory RAY_REDIS_MAX_MEMORY
                        --redis-max-memory to use if starting a new cluster.
  --ray-memory RAY_MEMORY
                        --memory to use if starting a new cluster.
  --ray-object-store-memory RAY_OBJECT_STORE_MEMORY
                        --object-store-memory to use if starting a new
                        cluster.
  --experiment-name EXPERIMENT_NAME
                        Name of the subdirectory under `local_dir` to put
                        results in.
  --local-dir LOCAL_DIR
                        Local dir to save training results to. Defaults to
                        '/root/ray_results'.
  --upload-dir UPLOAD_DIR
                        Optional URI to sync training results to (e.g.
                        s3://bucket).
  -v                    Whether to use INFO level logging.
  -vv                   Whether to use DEBUG level logging.
  --resume              Whether to attempt to resume previous Tune
                        experiments.
  --torch               Whether to use PyTorch (instead of tf) as the DL
                        framework.
  --eager               Whether to attempt to enable TF eager execution.
  --trace               Whether to attempt to enable tracing for eager mode.
  --log-flatland-stats  Whether to log additional flatland specfic metrics
                        such as percentage complete or normalized score.
  -e, --eval            Whether to run evaluation. Default evaluation config
                        is default.yaml to use custom evaluation config set
                        (eval_generator:test_eval) under configs
  -i, --custom-fn       Whether the experiment uses a custom function for
                        trainingDefault custom function is
                        imitation_ppo_train_fn
  -r, --record          Whether the experiment requires video recording during
                        evaluationDefault evaluation config is
                        default_render.yaml Can also be done via custom
                        evaluation config set (eval_generator:test_render)
                        under configs
  -s, --save-checkpoint
                        Whether the experiment will save the checkpoints to
                        weights and biases
  --bind-all            Whether to expose on network (binding on all network
                        interfaces).
  --env ENV             The gym environment to use.
  --queue-trials        Whether to queue trials when the cluster does not
                        currently have enough resources to launch one. This
                        should be set to True when running on an autoscaling
                        cluster to enable automatic scale-up.
  -f CONFIG_FILE, --config-file CONFIG_FILE
                        If specified, use config options from this file. Note
                        that this overrides any trial-specific options set via
                        flags above.

Training example:
    python ./train.py --run DQN --env CartPole-v0 --no-log-flatland-stats

Training with Config:
    python ./train.py -f experiments/flatland_random_sparse_small/global_obs/ppo.yaml

Note that -f overrides all other trial-specific command-line options.
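The help text above says that --stop takes JSON and that -f overrides trial-specific flags. The following is a minimal sketch of assembling such a command line from Python. The helper name build_train_command and its parameters are hypothetical and not part of the repository; only the flags themselves (--run, --env, --stop, -f, --bind-all) come from the help output.

```python
import json
import shlex


def build_train_command(config_file=None, run=None, env=None, stop=None, extra_flags=()):
    """Assemble an argument list for train.py (hypothetical helper).

    When config_file is given, trial-specific flags are left out on purpose,
    because the help text states that -f overrides them.
    """
    cmd = ["python", "./train.py"]
    if config_file is not None:
        cmd += ["-f", config_file]
    else:
        if run is not None:
            cmd += ["--run", run]
        if env is not None:
            cmd += ["--env", env]
        if stop is not None:
            # --stop expects a JSON object, so serialize the dict.
            cmd += ["--stop", json.dumps(stop)]
    # Flags that are not trial-specific (e.g. --bind-all) are always appended.
    cmd += list(extra_flags)
    return cmd


stop_criteria = {"time_total_s": 600, "training_iteration": 100000}
cmd = build_train_command(run="PPO", env="CartPole-v0", stop=stop_criteria)

# shlex.quote keeps the JSON intact when the line is pasted into a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to subprocess.run avoids shell quoting altogether; the quoted string is only useful for copying into a %%bash cell.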

In [ ]:

# Set num_workers to match the current system. The number of workers should be at most (number of CPU cores - 1).
# When running evaluation, reduce it further by the number of evaluation workers.
!sed 's/num_workers: 2/num_workers: 1/g' experiments/tests/global_obs_ppo.yaml > global_obs_ppo.yaml
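The rule in the comments above (at most CPU cores minus one, minus any evaluation workers) can be sketched in a few lines of Python. The function name pick_num_workers is hypothetical; the arithmetic is the only thing taken from the notebook.

```python
import os


def pick_num_workers(cpu_count, num_eval_workers=0):
    """Return the largest worker count allowed by the rule above.

    One core is reserved for the trainer process, and each evaluation
    worker takes one more. The result is never negative.
    """
    return max(cpu_count - 1 - num_eval_workers, 0)


# os.cpu_count() can return None, so fall back to a single core.
cpus = os.cpu_count() or 1
print(f"{cpus} CPU cores -> num_workers: {pick_num_workers(cpus)}")
print(f"with 1 evaluation worker -> num_workers: {pick_num_workers(cpus, num_eval_workers=1)}")
```

On a machine with 2 CPU cores this yields 1 worker, which matches the value the sed command writes into global_obs_ppo.yaml.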

In [ ]:

%%bash
# Note the argument --bind-all is needed whenever we are running in Google Colab
source activate flatland-paper
# Try to run a small test to see if rllib training is working
python ./train.py -f global_obs_ppo.yaml --bind-all

== Status ==
Memory usage on this node: 1.4/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.49 GiB objects
Result logdir: /root/ray_results/flatland-sparse-global-conv-ppo
Number of trials: 1 (1 RUNNING)
+---------------------------+----------+-------+
| Trial name                | status   | loc   |
|---------------------------+----------+-------|
| PPO_flatland_sparse_00000 | RUNNING  |       |
+---------------------------+----------+-------+


(pid=10684) 2020-10-10 10:27:52,884	INFO trainer.py:421 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
(pid=10684) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=10684) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=10684) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=10684) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=10684) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=10684) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=10684) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=10684) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=10684) 2020-10-10 10:27:53,565	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=10684) 2020-10-10 10:27:53,565	INFO trainer.py:580 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=10684) 2020-10-10 10:27:53,566	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=10684) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=10684)   "Could not set all required cities!")
(pid=10684) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=10684) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=10684) -    Successfully Loaded Observation class Utils from utils.py
(pid=10684) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=10684) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=10684) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=10684) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=10684) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=10684) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=10684) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=10684) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=10684) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=10684) ==================================================
(pid=10684) {'grid_mode': False,
(pid=10684)  'height': 25,
(pid=10684)  'max_num_cities': 4,
(pid=10684)  'max_rails_between_cities': 2,
(pid=10684)  'max_rails_in_city': 3,
(pid=10684)  'number_of_agents': 5,
(pid=10684)  'regenerate_rail_on_reset': True,
(pid=10684)  'regenerate_schedule_on_reset': True,
(pid=10684)  'seed': 0,
(pid=10684)  'width': 25}
(pid=10684) ==================================================
(pid=10684) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=10684)   "Could not set all required cities!")
(pid=10683) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=10683) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=10683) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=10683) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=10683) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=10683) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=10683) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=10683) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=10683) 2020-10-10 10:28:00,849	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=10683) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=10683) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=10683) -    Successfully Loaded Observation class Utils from utils.py
(pid=10683) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=10683) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=10683) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=10683) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=10683) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=10683) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=10683) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=10683) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=10683) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=10683) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=10683)   "Could not set all required cities!")
(pid=10683) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:647: UserWarning: [WARNING] Changing to Grid mode to place at least 2 cities.
(pid=10683)   warnings.warn("[WARNING] Changing to Grid mode to place at least 2 cities.")
(pid=10684) 2020-10-10 10:28:02,186	INFO trainable.py:217 -- Getting current IP.
(pid=10683) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=10683)   "Could not set all required cities!")
==================================================
Setting up new w&b logger
Experiment tag: 0
Experiment id: cf44a94683f74f69bac106522ff86af0
==================================================
== Status ==
Memory usage on this node: 2.4/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.49 GiB objects
Result logdir: /root/ray_results/flatland-sparse-global-conv-ppo
Number of trials: 1 (1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:10684 |      1 |          64.9726 | 1114 |      nan |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.4/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.49 GiB objects
Result logdir: /root/ray_results/flatland-sparse-global-conv-ppo
Number of trials: 1 (1 TERMINATED)
+---------------------------+------------+-------+--------+------------------+------+----------+
| Trial name                | status     | loc   |   iter |   total time (s) |   ts |   reward |
|---------------------------+------------+-------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |       |      1 |          64.9726 | 1114 |      nan |
+---------------------------+------------+-------+--------+------------------+------+----------+


Saving full experiment config: global_obs_ppo.yaml

2020-10-10 10:27:47,573	INFO resource_spec.py:212 -- Starting Ray with 7.32 GiB memory available for workers and up to 3.68 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-10-10 10:27:48,031	INFO services.py:1170 -- View the Ray dashboard at 172.28.0.2:8265
2020-10-10 10:27:48,381	WARNING logger.py:314 -- JsonLogger not provided. The ExperimentAnalysis tool is disabled.
wandb: W&B is a tool that helps track and visualize machine learning experiments
wandb: No credentials found.  Run "wandb login" to visualize your metrics
wandb: Tracking run with wandb version 0.9.2
wandb: Wandb version 0.10.5 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Run data is saved locally in wandb/run-20201010_102907-20la2orq

wandb: Program ended successfully.
wandb: You can sync this run to the cloud by running: 
wandb: wandb sync wandb/run-20201010_102907-20la2orq

📈 TensorBoard

In [ ]:

%load_ext tensorboard

In [ ]:

%tensorboard --logdir ~/ray_results

Output hidden; open in https://colab.research.google.com to view.
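If TensorBoard shows no runs, it can help to check that event files actually exist under the log directory. The sketch below assumes the results are written in the usual TensorBoard format, with file names starting with events.out.tfevents; the helper name find_event_files is hypothetical.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return all TensorBoard event files below logdir, sorted by path."""
    root = Path(logdir).expanduser()
    if not root.is_dir():
        # Nothing has been written yet, or the path is wrong.
        return []
    return sorted(root.rglob("events.out.tfevents.*"))


# ~/ray_results is the directory passed to %tensorboard above.
for path in find_event_files("~/ray_results"):
    print(path)
```

An empty listing usually means training has not yet completed its first iteration or that --local-dir points somewhere else.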

In [ ]:

%%bash
# Now we can run a full training after changing the number of workers.
# For demonstration purposes we also reduce the time steps to 15000.

sed 's/num_workers: 13/num_workers: 1/g' baselines/action_masking_and_skipping/ppo_tree_obs_small_v0.yaml \
| sed 's/num_envs_per_worker: 5/num_envs_per_worker: 2/g' \
| sed 's/timesteps_total: 15000000/timesteps_total: 15000/g' > ppo-tree-obs-small-v0.yaml
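One caveat with the sed pipeline is that a substitution silently does nothing if the expected value is not in the file, for example after the baseline config changes upstream. The following is a hedged Python sketch of the same three overrides that fails loudly instead. The function name override_yaml_values is hypothetical, and the sample text is a stand-in for illustration, not the actual contents of ppo_tree_obs_small_v0.yaml. It is also slightly stricter than sed because it matches whole keys and whole values only.

```python
import re


def override_yaml_values(text, overrides):
    """Replace 'key: old' with 'key: new' for every entry in overrides.

    overrides maps key -> (old_value, new_value). A ValueError is raised
    when 'key: old' does not occur, so a stale override cannot go unnoticed.
    """
    for key, (old, new) in overrides.items():
        # \b keeps 'num_workers: 13' from matching inside longer keys or numbers.
        pattern = rf"\b{re.escape(key)}: {re.escape(str(old))}\b"
        text, count = re.subn(pattern, f"{key}: {new}", text)
        if count == 0:
            raise ValueError(f"'{key}: {old}' not found; the config may have changed")
    return text


# Stand-in config text (assumption: the real file nests these keys deeper).
sample = "num_workers: 13\nnum_envs_per_worker: 5\ntimesteps_total: 15000000\n"

patched = override_yaml_values(
    sample,
    {
        "num_workers": (13, 1),
        "num_envs_per_worker": (5, 2),
        "timesteps_total": (15000000, 15000),
    },
)
print(patched)
```

To apply it to the real file, read the baseline YAML with pathlib.Path.read_text, pass the text through the function, and write the result to ppo-tree-obs-small-v0.yaml.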

In [ ]:

%%bash
# Note the argument --bind-all is needed whenever we are running in Google Colab
source activate flatland-paper
# Run rllib training
python ./train.py -f ppo-tree-obs-small-v0.yaml --bind-all

== Status ==
Memory usage on this node: 1.4/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+-------+
| Trial name                | status   | loc   |
|---------------------------+----------+-------|
| PPO_flatland_sparse_00000 | RUNNING  |       |
| PPO_flatland_sparse_00001 | PENDING  |       |
| PPO_flatland_sparse_00002 | PENDING  |       |
+---------------------------+----------+-------+


(pid=11266) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=11266) 2020-10-10 10:49:14,547	INFO trainer.py:421 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
(pid=11266) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=11266) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=11266) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=11266) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=11266) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=11266) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=11266) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=11266) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=11266) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=11266) -    Successfully Loaded Observation class Utils from utils.py
(pid=11266) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=11266) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=11266) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=11266) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=11266) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=11266) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=11266) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=11266) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=11266) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=11266) ==================================================
(pid=11266) {'grid_mode': False,
(pid=11266)  'height': 25,
(pid=11266)  'max_num_cities': 4,
(pid=11266)  'max_rails_between_cities': 2,
(pid=11266)  'max_rails_in_city': 3,
(pid=11266)  'number_of_agents': 5,
(pid=11266)  'regenerate_rail_on_reset': True,
(pid=11266)  'regenerate_schedule_on_reset': True,
(pid=11266)  'seed': 0,
(pid=11266)  'width': 25}
(pid=11266) ==================================================
(pid=11266) 2020-10-10 10:49:15,185	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11266) 2020-10-10 10:49:15,185	INFO trainer.py:580 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=11266) 2020-10-10 10:49:15,187	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11266) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11266)   "Could not set all required cities!")
(pid=11266) 2020-10-10 10:49:19,670	INFO trainable.py:217 -- Getting current IP.
(pid=11268) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=11268) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=11268) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=11268) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=11268) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=11268) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=11268) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=11268) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=11268) 2020-10-10 10:49:20,914	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11268) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11268)   "Could not set all required cities!")
(pid=11268) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:647: UserWarning: [WARNING] Changing to Grid mode to place at least 2 cities.
(pid=11268)   warnings.warn("[WARNING] Changing to Grid mode to place at least 2 cities.")
(pid=11268) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=11268) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=11268) -    Successfully Loaded Observation class Utils from utils.py
(pid=11268) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=11268) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=11268) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=11268) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=11268) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=11268) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=11268) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=11268) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=11268) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=11268) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11268)   "Could not set all required cities!")
==================================================
Setting up new w&b logger
Experiment tag: 0
Experiment id: 3f9c63a2f09e4e9dbbef0efcac0dca05
==================================================
== Status ==
Memory usage on this node: 1.9/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      1 |          14.1453 | 1003 |      nan |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      2 |          27.1881 | 2139 |      nan |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      3 |          38.2044 | 3336 |  -1263.5 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      4 |          49.9066 | 4436 |  -1263.5 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      5 |          58.5835 | 5586 |  -1263.5 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


(pid=11268) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:647: UserWarning: [WARNING] Changing to Grid mode to place at least 2 cities.
(pid=11268)   warnings.warn("[WARNING] Changing to Grid mode to place at least 2 cities.")
== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      6 |          70.4385 | 6753 |    -1573 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      7 |          77.9675 | 7812 |    -1573 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |   ts |   reward |
|---------------------------+----------+------------------+--------+------------------+------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      8 |          88.1596 | 8895 |    -1573 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |      |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |      |          |
+---------------------------+----------+------------------+--------+------------------+------+----------+

Saving full experiment config: ppo-tree-obs-small-v0.yaml

== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |      9 |          98.9992 | 10120 | -1586.83 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |     10 |          106.389 | 11120 | -1586.83 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |     11 |          113.736 | 12120 | -1586.83 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |     12 |          121.131 | 13120 | -1586.83 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |     13 |          133.423 | 14322 | -1710.75 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (2 PENDING, 1 RUNNING)
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |     14 |          145.959 | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+
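
Note: the status tables above are easier to compare across iterations once they are turned into records. The following is a minimal sketch, assuming the pipe-delimited layout printed in this log; `parse_status_table` is a hypothetical helper written for illustration and is not part of Ray or the baselines repository.

```python
def parse_status_table(text):
    """Parse the pipe-delimited trial table from a Tune status block.

    Returns one dict per trial row, keyed by the column headers.
    All values are kept as strings; empty cells become "".
    """
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip borders ("+---+") and the header separator ("|---+---|").
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line is the header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


# Sample copied from the iteration-14 status block in the log above.
sample = """
+---------------------------+----------+------------------+--------+------------------+-------+----------+
| Trial name                | status   | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+----------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | RUNNING  | 172.28.0.2:11266 |     14 |          145.959 | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | PENDING  |                  |        |                  |       |          |
| PPO_flatland_sparse_00002 | PENDING  |                  |        |                  |       |          |
+---------------------------+----------+------------------+--------+------------------+-------+----------+
"""

rows = parse_status_table(sample)
running = [r for r in rows if r["status"] == "RUNNING"]
for r in running:
    print(r["Trial name"], int(r["iter"]), float(r["reward"]))
```

Values stay as strings so that pending trials, whose cells are empty, do not need special handling; convert only the rows you care about.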


(pid=11622) 2020-10-10 10:51:51,815	INFO trainer.py:421 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
(pid=11622) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=11622) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=11622) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=11622) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=11622) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=11622) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=11622) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=11622) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=11622) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=11622) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=11622) -    Successfully Loaded Observation class Utils from utils.py
(pid=11622) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=11622) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=11622) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=11622) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=11622) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=11622) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=11622) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=11622) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=11622) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=11622) ==================================================
(pid=11622) {'grid_mode': False,
(pid=11622)  'height': 25,
(pid=11622)  'max_num_cities': 4,
(pid=11622)  'max_rails_between_cities': 2,
(pid=11622)  'max_rails_in_city': 3,
(pid=11622)  'number_of_agents': 5,
(pid=11622)  'regenerate_rail_on_reset': True,
(pid=11622)  'regenerate_schedule_on_reset': True,
(pid=11622)  'seed': 0,
(pid=11622)  'width': 25}
(pid=11622) ==================================================
(pid=11622) 2020-10-10 10:51:52,803	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11622) 2020-10-10 10:51:52,803	INFO trainer.py:580 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=11622) 2020-10-10 10:51:52,805	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11622) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11622)   "Could not set all required cities!")
(pid=11622) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11622)   "Could not set all required cities!")
(pid=11622) 2020-10-10 10:51:59,967	INFO trainable.py:217 -- Getting current IP.
(pid=11664) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=11664) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=11664) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=11664) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=11664) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=11664) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=11664) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=11664) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=11664) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=11664) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=11664) -    Successfully Loaded Observation class Utils from utils.py
(pid=11664) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=11664) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=11664) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=11664) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=11664) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=11664) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=11664) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=11664) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=11664) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=11664) 2020-10-10 10:52:02,606	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11664) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11664)   "Could not set all required cities!")
(pid=11664) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11664)   "Could not set all required cities!")
==================================================
Setting up new w&b logger
Experiment tag: 1
Experiment id: b81a7c36e77446a4bbfeca416dfe720c
==================================================
== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      1 |          20.9635 |  1179 |      nan |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      2 |          32.0034 |  2229 |      nan |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 |  -1571   |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      3 |          42.8717 |  3390 |  -1508.5 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 |  -1571   |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      4 |          50.5203 |  4390 |  -1508.5 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 |  -1571   |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      5 |          58.3617 |  5390 |  -1508.5 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 |  -1571   |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      6 |          66.0808 |  6390 |  -1508.5 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      7 |           77.469 |  7538 | -1786.75 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      8 |          88.3345 |  8646 | -1786.75 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+

Saving full experiment config: ppo-tree-obs-small-v0.yaml

== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |      9 |          101.621 |  9739 |    -1601 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |     10 |          111.18  | 10801 |    -1601 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 |    -1571 |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |     11 |          120.736 | 11901 |    -1601 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |     12 |          130.868 | 12992 | -1599.25 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |     13 |          139.139 | 14230 | -1599.25 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | RUNNING    | 172.28.0.2:11622 |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | PENDING    |                  |        |                  |       |          |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


(pid=11915) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=11915) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=11915) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=11915) 2020-10-10 10:54:32,805	INFO trainer.py:421 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
(pid=11915) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=11915) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=11915) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=11915) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=11915) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=11915) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=11915) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=11915) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=11915) -    Successfully Loaded Observation class Utils from utils.py
(pid=11915) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=11915) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=11915) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=11915) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=11915) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=11915) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=11915) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=11915) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=11915) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=11915) ==================================================
(pid=11915) {'grid_mode': False,
(pid=11915)  'height': 25,
(pid=11915)  'max_num_cities': 4,
(pid=11915)  'max_rails_between_cities': 2,
(pid=11915)  'max_rails_in_city': 3,
(pid=11915)  'number_of_agents': 5,
(pid=11915)  'regenerate_rail_on_reset': True,
(pid=11915)  'regenerate_schedule_on_reset': True,
(pid=11915)  'seed': 0,
(pid=11915)  'width': 25}
(pid=11915) ==================================================
(pid=11915) 2020-10-10 10:54:33,804	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11915) 2020-10-10 10:54:33,804	INFO trainer.py:580 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=11915) 2020-10-10 10:54:33,805	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11915) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11915)   "Could not set all required cities!")
(pid=11915) 2020-10-10 10:54:41,459	INFO trainable.py:217 -- Getting current IP.
(pid=11957) -    Successfully Loaded Generator Config small_stoch_v1 from small_stoch_v1.yaml
(pid=11957) -    Successfully Loaded Generator Config 32x32_v0 from 32x32_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config small_stoch_v0 from small_stoch_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config medium_stoch_v2 from medium_stoch_v2.yaml
(pid=11957) -    Successfully Loaded Generator Config large_stoch_v0 from large_stoch_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config small_v0 from small_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config small_single_v0 from small_single_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config small_double_v0 from small_double_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config adrian_v0 from adrian_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config small_triple_v0 from small_triple_v0.yaml
(pid=11957) -    Successfully Loaded Generator Config medium_stoch_v1 from medium_stoch_v1.yaml
(pid=11957) -    Successfully Loaded Evaluation Config test_render from test_render.yaml
(pid=11957) -    Successfully Loaded Evaluation Config default from default.yaml
(pid=11957) -    Successfully Loaded Evaluation Config default_render from default_render.yaml
(pid=11957) -    Successfully Loaded Evaluation Config enable_explore from enable_explore.yaml
(pid=11957) -    Successfully Loaded Observation class TreeObs from tree_obs.py
(pid=11957) -    Successfully Loaded Observation class LocalConflictObs from local_conflict_obs.py
(pid=11957) -    Successfully Loaded Observation class Utils from utils.py
(pid=11957) -    Successfully Loaded Observation class CombinedObs from combined_obs.py
(pid=11957) -    Successfully Loaded Observation class ForwardActionObs from forward_action_obs.py
(pid=11957) -    Successfully Loaded Observation class NewTreeObs from new_tree_obs.py
(pid=11957) -    Successfully Loaded Observation class NewTreeObsBuilder from new_tree_obs_builder.py
(pid=11957) -    Successfully Loaded Observation class RandomActionObs from random_action_obs.py
(pid=11957) -    Successfully Loaded Observation class ShortestPathObs from shortest_path_obs.py
(pid=11957) -    Successfully Loaded Observation class GlobalObs from global_obs.py
(pid=11957) -    Successfully Loaded Observation class ShortestPathActionObs from shortest_path_action_obs.py
(pid=11957) -    Successfully Loaded Observation class GlobalDensityObs from global_density_obs.py
(pid=11957) 2020-10-10 10:54:44,221	WARNING deprecation.py:30 -- DeprecationWarning: `callbacks dict interface` has been deprecated. Use `a class extending rllib.agents.callbacks.DefaultCallbacks` instead. This will raise an error in the future!
(pid=11957) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11957)   "Could not set all required cities!")
(pid=11957) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:725: UserWarning: Could not set all required cities!
(pid=11957)   "Could not set all required cities!")
==================================================
Setting up new w&b logger
Experiment tag: 2
Experiment id: 028ad485feb94fd89a88a95fee45c8c4
==================================================
== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      1 |          23.1527 |  1196 |   nan    |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      2 |          36.2727 |  2246 |   nan    |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


(pid=11957) /usr/local/envs/flatland-paper/lib/python3.7/site-packages/flatland/envs/rail_generators.py:647: UserWarning: [WARNING] Changing to Grid mode to place at least 2 cities.
(pid=11957)   warnings.warn("[WARNING] Changing to Grid mode to place at least 2 cities.")
== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      3 |          48.4435 |  3388 | -1527.5  |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      4 |          59.6103 |  4392 | -1527.5  |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      5 |           67.919 |  5392 | -1527.5  |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      6 |          79.5559 |  6412 | -1599.75 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      7 |          87.5625 |  7542 | -1599.75 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |         145.959  | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |         146.76   | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      8 |          98.2645 |  8642 | -1599.75 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+

Saving full experiment config: ppo-tree-obs-small-v0.yaml

== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |      9 |          106.312 |  9792 | -1599.75 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |     10 |          117.031 | 10868 | -1691.83 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |     11 |          124.654 | 11868 | -1691.83 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |     12 |          135     | 12868 | -1691.83 |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |     13 |          144.015 | 13952 | -1690.5  |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |     14 |          153.332 | 14952 | -1690.5  |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 2/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (1 RUNNING, 2 TERMINATED)
+---------------------------+------------+------------------+--------+------------------+-------+----------+
| Trial name                | status     | loc              |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+------------------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |                  |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |                  |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | RUNNING    | 172.28.0.2:11915 |     15 |          162.568 | 15952 | -1690.5  |
+---------------------------+------------+------------------+--------+------------------+-------+----------+


== Status ==
Memory usage on this node: 2.0/12.7 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/2 CPUs, 0/0 GPUs, 0.0/7.32 GiB heap, 0.0/2.54 GiB objects
Result logdir: /root/ray_results/ppo-tree-obs-small-v0
Number of trials: 3 (3 TERMINATED)
+---------------------------+------------+-------+--------+------------------+-------+----------+
| Trial name                | status     | loc   |   iter |   total time (s) |    ts |   reward |
|---------------------------+------------+-------+--------+------------------+-------+----------|
| PPO_flatland_sparse_00000 | TERMINATED |       |     14 |          145.959 | 15429 | -1571    |
| PPO_flatland_sparse_00001 | TERMINATED |       |     14 |          146.76  | 15230 | -1599.25 |
| PPO_flatland_sparse_00002 | TERMINATED |       |     15 |          162.568 | 15952 | -1690.5  |
+---------------------------+------------+-------+--------+------------------+-------+----------+

2020-10-10 10:49:09,241	INFO resource_spec.py:212 -- Starting Ray with 7.32 GiB memory available for workers and up to 3.68 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-10-10 10:49:09,709	INFO services.py:1170 -- View the Ray dashboard at 172.28.0.2:8265
2020-10-10 10:49:10,114	WARNING logger.py:314 -- JsonLogger not provided. The ExperimentAnalysis tool is disabled.
wandb: W&B is a tool that helps track and visualize machine learning experiments
wandb: No credentials found.  Run "wandb login" to visualize your metrics
wandb: Tracking run with wandb version 0.9.2
wandb: Wandb version 0.10.5 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Run data is saved locally in wandb/run-20201010_104934-1uv5lzig

wandb: Program ended successfully.
wandb: W&B is a tool that helps track and visualize machine learning experiments
wandb: No credentials found.  Run "wandb login" to visualize your metrics
wandb: Tracking run with wandb version 0.9.2
wandb: You can sync this run to the cloud by running: 
wandb: wandb sync wandb/run-20201010_104934-1uv5lzig
wandb: Wandb version 0.10.5 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Run data is saved locally in wandb/run-20201010_105221-i57y6ony

wandb: Program ended successfully.
wandb: W&B is a tool that helps track and visualize machine learning experiments
wandb: No credentials found.  Run "wandb login" to visualize your metrics
wandb: Tracking run with wandb version 0.9.2
wandb: You can sync this run to the cloud by running: 
wandb: wandb sync wandb/run-20201010_105221-i57y6ony
wandb: Wandb version 0.10.5 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Run data is saved locally in wandb/run-20201010_105504-2f0d572t

wandb: Program ended successfully.
wandb: You can sync this run to the cloud by running: 
wandb: wandb sync wandb/run-20201010_105504-2f0d572t

⛸ RLlib Training Lifecycle¶

We also officially support saving training metrics, graphs, checkpoints, system runtime, experiment code, etc. in the experiment tracking tool Weights & Biases (W&B). This also keeps all our experiments transparent and easily reproducible.

The Flatland metrics, such as mean percentage completion, normalised reward and reward, can be easily monitored there.

Evaluation can also run alongside training at a fixed interval. To use the default evaluation settings, just add the -e flag (combined here with -f as -ef):

python train.py -ef ppo_tree_obs_small_v0.yaml

A sample recording in W&B can be viewed here.

One can also specify a custom evaluation config in a yaml file similar to the training configs.
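As a rough illustration of what such an evaluation config might contain, the sketch below layers RLlib's common evaluation keys (`evaluation_interval`, `evaluation_num_episodes`, `evaluation_config`, as named in Ray releases of that period) on top of a training config. The values, the placeholder training config and the helper `merge_eval_config` are assumptions for illustration, not the repository's defaults.

```python
import copy

# Illustrative values only; the repository's default evaluation settings may differ.
EVAL_CONFIG = {
    "evaluation_interval": 50,                # evaluate every 50 training iterations
    "evaluation_num_episodes": 50,            # episodes per evaluation round
    "evaluation_config": {"explore": False},  # act greedily while evaluating
}


def merge_eval_config(train_config, eval_config):
    """Return a copy of train_config with the evaluation keys layered on top."""
    merged = copy.deepcopy(train_config)
    merged.update(copy.deepcopy(eval_config))
    return merged


# Placeholder training config; a real one would come from the YAML file.
train_config = {"env": "flatland_sparse", "num_workers": 1}
config = merge_eval_config(train_config, EVAL_CONFIG)
print(config)
```

Keeping the evaluation settings in their own mapping mirrors the idea of a separate evaluation file: the training config stays untouched and the evaluation keys can be swapped independently.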

The Flatland environment has also been adapted to support video recording using OpenAI Gym's monitor. This has been integrated into RLlib, and the saved videos can be uploaded directly to W&B during training. Video recording can slow down training considerably, so by default we only save videos of 5 episodes run during evaluation after every 50 training iterations. To use the default evaluation and recording settings, just add the -e and -r flags (combined here as -erf):

python train.py -erf ppo_tree_obs_small_v0.yaml

Just like evaluation, one can also specify a custom config for recording in a YAML file.

Once all of this is set up, it is very convenient to track the training process directly in W&B, which has separate sections for training and evaluation, plus a media section for recorded videos. To save the checkpoints to W&B, just add the -s flag as follows:

python train.py -ersf ppo_tree_obs_small_v0.yaml

The saved checkpoints can then be downloaded from weights and biases.
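Besides the web interface, the files of a run can also be fetched with the W&B public API. The following is a minimal sketch, assuming the checkpoint files were uploaded with "checkpoint" somewhere in their name; that naming, the `is_checkpoint_file` heuristic and the `entity/project/run_id` path are assumptions to adapt to the actual run.

```python
def is_checkpoint_file(name):
    """Heuristic (assumption): checkpoint files contain 'checkpoint' in their path."""
    return "checkpoint" in name


def download_checkpoints(run_path, root="wandb_checkpoints"):
    """Download checkpoint-like files from a W&B run.

    run_path has the form 'entity/project/run_id' (placeholder values).
    """
    import wandb  # imported lazily so the helper above works without wandb installed

    api = wandb.Api()
    run = api.run(run_path)
    downloaded = []
    for f in run.files():
        if is_checkpoint_file(f.name):
            f.download(root=root, replace=True)
            downloaded.append(f.name)
    return downloaded


# Example call (requires `wandb login` and a real run path):
# download_checkpoints("my-entity/my-project/1uv5lzig")
```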

Once training and evaluation are done, we can select the checkpoint with the best evaluation score and run inference on a sample of 50 independent Flatland environments:

    python rollout.py <path-to-checkpoint> --run PPO --episode=50 --no-render

Refer to the rollout scripts here for the different baseline RLlib runs.
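The checkpoint selection step can be sketched as follows. The checkpoint paths and scores are made-up placeholders (in practice they would be read from the evaluation metrics), and `best_checkpoint` and `rollout_command` are hypothetical helpers; only the command-line arguments are taken from the rollout command above.

```python
import shlex


def best_checkpoint(scores):
    """scores maps checkpoint path -> evaluation score (higher is better)."""
    return max(scores, key=scores.get)


def rollout_command(checkpoint, episodes=50):
    """Build the rollout command for one checkpoint as an argument list."""
    return ["python", "rollout.py", checkpoint, "--run", "PPO",
            f"--episode={episodes}", "--no-render"]


# Placeholder evaluation scores (normalised reward, so closer to 0 is better).
scores = {
    "checkpoint_50/checkpoint-50": -0.31,
    "checkpoint_100/checkpoint-100": -0.22,
    "checkpoint_150/checkpoint-150": -0.26,
}

best = best_checkpoint(scores)
cmd = rollout_command(best)
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the rollout.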

🚆 RLlib Training Scripts for Baselines¶

We run a range of baselines, covering APEX, PPO and imitation learning approaches, as explained here. The training script for each of them is shown below, and the training results are shown in the subsequent sections.

MODEL	Training Script
APEX FIXED IL(25%)	train.py -ef baselines/imitation_learning_tree_obs/apex_il_tree_obs_25.yaml
APEX FIXED IL(100%)	train.py -ef baselines/imitation_learning_tree_obs/apex_pure_il.yaml
APEX	train.py -ef baselines/action_masking_and_skipping/apex_tree_obs_small_v0.yaml
APEX SKIP	train.py -ef baselines/action_masking_and_skipping/apex_tree_obs_small_v0_skip.yaml
CCPPO	train.py -ef baselines/ccppo_tree_obs/ccppo.yaml
CCPPO BASE	train.py -ef baselines/ccppo_tree_obs/ccppo_base.yaml
MARWIL FIXED IL(100%)	train.py -ef baselines/imitation_learning_tree_obs/marwil_tree_obs_all_beta.yaml
PPO + Online IL(50%)	train.py -ief baselines/custom_imitation_learning_rllib_tree_obs/ppo_imitation_tree_obs.yaml --eager --trace
PPO	train.py -ef baselines/action_masking_and_skipping/ppo_tree_obs_small_v0.yaml
PPO MASKING	train.py -ef baselines/action_masking_and_skipping/ppo_tree_obs_small_v0_mask.yaml
PPO SKIP	train.py -ef baselines/action_masking_and_skipping/ppo_tree_obs_small_v0_skip.yaml
APEX Global density	train.py -ef baselines/global_density_obs/sparse_small_apex_expdecay_maxt1000.yaml
Online IL(100%)	train.py -ef baselines/custom_imitation_learning_rllib_tree_obs/pure_imitation_tree_obs.yaml --eager --trace
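To run several of these baselines in sequence, the commands can be assembled programmatically. This is only a sketch: the `BASELINES` mapping copies three rows from the table above, and `train_command` is a hypothetical helper, not part of the repository.

```python
import shlex

# Subset of the table above: model -> (flags, config path, extra arguments).
BASELINES = {
    "PPO": ("-ef",
            "baselines/action_masking_and_skipping/ppo_tree_obs_small_v0.yaml",
            []),
    "CCPPO": ("-ef", "baselines/ccppo_tree_obs/ccppo.yaml", []),
    "PPO + Online IL(50%)": (
        "-ief",
        "baselines/custom_imitation_learning_rllib_tree_obs/ppo_imitation_tree_obs.yaml",
        ["--eager", "--trace"],
    ),
}


def train_command(model):
    """Build the training command for one baseline as an argument list."""
    flags, config_path, extra = BASELINES[model]
    return ["python", "train.py", flags, config_path, *extra]


for model in BASELINES:
    print(model, "->", " ".join(shlex.quote(p) for p in train_command(model)))
```

Each list can be handed to `subprocess.run(..., check=True)` inside the activated conda environment; running them one after another avoids competing for the two CPUs available on Colab.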

⛳️ Results¶

All baseline run configs can be found here. More information on each of the runs can be found in the 🔗 Flatland RLlib Baselines documentation.

Checkpoints with the best evaluation normalized reward score for the various runs can be found here.

🚅 Train, Evaluation and Test Results¶

Note that these runs were based on the older Flatland version 2.2.1; the code for that version can be found in the flatland-paper-baselines branch. Test results were calculated using this rollout script.

The training and evaluation metrics and charts for all the runs can be found in the W&B link here.

MODEL	Train % Complete	Train Reward	Best Evaluation % Complete	Best Evaluation Reward	Test % Complete	Test Reward
APEX FIXED IL(25%)	90.45±0.4	-0.18±0	89.18±2.44	-0.18±0.02	86±1.44	-0.22±0.01
APEX FIXED IL(100%)			23.8±11.04	-0.79±0.09	22.93±11.83	-0.84±0.08
APEX	90.38±1.64	-0.2±0.02	85.33±6.11	-0.22±0.05	80.93±5.45	-0.32±0.04
APEX SKIP	89.51±1.09	-0.21±0.01	84±4	-0.22±0.04	79.73±0.92	-0.33±0.01
CCPPO	87.72±2.37	-0.2±0.02	84.67±4.01	-0.23±0.04	71.87±3.7	-0.35±0.03
CCPPO BASE	83.21±1.47	-0.25±0.01	83.2±0.8	-0.25±0.02	76.27±6.96	-0.31±0.06
MARWIL FIXED IL(100%)			100±0	-0.04±0.01	72.4±3.27	-0.35±0.02
PPO + Online IL(50%)	83.46±1.09	-0.23±0.01	100±0	-0.07±0	71.47±4.01	-0.35±0.04
PPO	94.78±0.29	-0.13±0.01	98.67±2.31	-0.09±0.02	81.33±5.86	-0.26±0.05
PPO MASKING	93.4±0.27	-0.15±0	90.67±8.33	-0.16±0.07	80.53±9.59	-0.28±0.09
PPO SKIP	93.48±0.66	-0.15±0.01	100±0	-0.08±0.01	82.67±5.79	-0.26±0.05
APEX Global density	57.87±1.85	-0.51±0.01	60±10.58	-0.45±0.11	34.4±9.23	-0.71±0.07
Online IL(100%)			100±0	-0.06±0.01	80±3.27	-0.27±0.03
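One way to read the table is to compare each model's best evaluation completion rate with its test completion rate. The sketch below parses the "mean±std" cells copied from the table above and ranks the models by test completion; the helpers `parse_pm` and `eval_test_gap` are illustrative names, and the gap is a simple difference of means that ignores the standard deviations.

```python
def parse_pm(cell):
    """Split a 'mean±std' cell into two floats."""
    mean, std = cell.split("±")
    return float(mean), float(std)


# (model, best evaluation % complete, test % complete), copied from the table above.
ROWS = [
    ("APEX FIXED IL(25%)", "89.18±2.44", "86±1.44"),
    ("APEX FIXED IL(100%)", "23.8±11.04", "22.93±11.83"),
    ("APEX", "85.33±6.11", "80.93±5.45"),
    ("APEX SKIP", "84±4", "79.73±0.92"),
    ("CCPPO", "84.67±4.01", "71.87±3.7"),
    ("CCPPO BASE", "83.2±0.8", "76.27±6.96"),
    ("MARWIL FIXED IL(100%)", "100±0", "72.4±3.27"),
    ("PPO + Online IL(50%)", "100±0", "71.47±4.01"),
    ("PPO", "98.67±2.31", "81.33±5.86"),
    ("PPO MASKING", "90.67±8.33", "80.53±9.59"),
    ("PPO SKIP", "100±0", "82.67±5.79"),
    ("APEX Global density", "60±10.58", "34.4±9.23"),
    ("Online IL(100%)", "100±0", "80±3.27"),
]


def eval_test_gap(rows):
    """Return (model, test mean, evaluation mean - test mean), best test score first."""
    result = []
    for model, eval_cell, test_cell in rows:
        eval_mean, _ = parse_pm(eval_cell)
        test_mean, _ = parse_pm(test_cell)
        result.append((model, test_mean, round(eval_mean - test_mean, 2)))
    return sorted(result, key=lambda row: row[1], reverse=True)


ranking = eval_test_gap(ROWS)
for model, test_mean, gap in ranking:
    print(f"{model:24s} test={test_mean:6.2f}  eval-test gap={gap:6.2f}")
```

A large gap, as for the runs that reach 100% completion in evaluation, suggests the best evaluation score overstates how well the policy generalises to the independent test environments.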

🚀 Submitting Solution to Challenge!!!¶

Thanks to the efforts of our partners Deutsche Bahn and Instadeep, you can now submit the CCPPO baseline out of the box: https://gitlab.aicrowd.com/GereonVienken/db_flatland_example

This RL method reaches a score of 76.232 on the leaderboard! 💪🏻

You should be able to use the same approach with the other RLlib baselines as well. Make sure to include your best performing checkpoints in the submission. Thanks to our partners, and especially to @GereonVienken, who contributed this baseline and submission repository!
