This is a fork of OTTO used for benchmarking solvers on the olfactory search POMDP in the paper "Deep reinforcement learning for the olfactory search POMDP: a quantitative benchmark", by Aurore Loisy and Robin A. Heinonen (The European Physical Journal E, 2023).
Refer to the original repository for tutorials and extensive documentation.
New features:
- a new "windy" setup has been added to the original "isotropic" one;
- alpha-vector policies obtained from solvers using point-based value iteration (PBVI), namely Sarsop and Perseus, can be loaded (see the sketch after this list);
- CNN and reward shaping have been implemented in the "windy" setup (not used in the benchmark).
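For background, an alpha-vector policy represents the POMDP value function as a set of vectors, with one action label attached to each; given a belief over source locations, the policy executes the action of the maximizing vector. The snippet below is a minimal sketch of this standard PBVI decision rule in NumPy; it is illustrative only and is not OTTO-benchmark's actual loader (the function and argument names are ours).

```python
import numpy as np

def alpha_vector_action(belief, alphas, actions):
    """Standard PBVI decision rule (illustrative; not OTTO-benchmark's loader).

    belief:  (N,) probability vector over hidden source locations
    alphas:  (K, N) array of alpha vectors spanning the value function
    actions: length-K sequence giving the action attached to each alpha vector
    """
    values = alphas @ belief            # expected value under each alpha vector
    return actions[int(np.argmax(values))]
```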
OTTO-benchmark requires Python 3.8 or greater. Dependencies are listed in requirements.txt; missing dependencies will be installed automatically.
If you use conda to manage your Python environments, you can install OTTO-benchmark in a dedicated otto-benchmark environment:
conda create --name otto-benchmark python=3.8
conda activate otto-benchmark
python3 setup.py install
The policies computed using DRL, Sarsop, and Perseus can be downloaded here. Decompress the file and place the zoo folder at the root of OTTO-benchmark (at the same level as the isotropic, windy, and converter-pbvi-to-otto folders).
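After decompression, the repository root should look like this (a sketch assembled from the folder names above):

```
OTTO-benchmark/
├── isotropic/
├── windy/
├── converter-pbvi-to-otto/
└── zoo/
```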
The software contains two main directories, "isotropic" and "windy", containing the two variants of the POMDP. They are organized in exactly the same way, with three subdirectories each:
- evaluate: for evaluating the performance of a policy
- learn: for learning a DRL policy for the task
- visualize: for visualizing a search episode
The code organization and usage are largely self-explanatory; both are illustrated through the examples below.
The four test cases used in the benchmark are:
- isotropic-19x19
- isotropic-53x53
- windy-medium-detections
- windy-low-detections
To visualize a search for the "windy-medium-detections" case, go to windy/visualize and run

python3 visualize.py -i windy-medium-detections

The policy is selected by modifying the variable POLICY in the file windy/visualize/parameters/windy-medium-detections.py and, if needed, setting the path to the desired policy.
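As an illustration, the relevant part of the parameters file might look like the excerpt below. POLICY is the variable named above; the path variable and the values shown are hypothetical, so check the actual file for the exact names and conventions.

```python
# windy/visualize/parameters/windy-medium-detections.py (illustrative excerpt)
POLICY = -1                    # selects the policy to visualize (value conventions follow OTTO)
MODEL_PATH = "../../zoo/..."   # hypothetical variable: path to the desired policy, if needed
```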
To evaluate a policy for the "isotropic-19x19" case, go to isotropic/evaluate and run

python3 evaluate.py -i isotropic-19x19

To change the policy, edit the isotropic/evaluate/parameters/isotropic-19x19.py file as described above for visualization.
To learn a DRL policy for the "isotropic-53x53" case, go to isotropic/learn and run

python3 learn.py -i isotropic-53x53

Change hyperparameters by editing the isotropic/learn/parameters/isotropic-53x53.py file.
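The excerpt below gives a flavor of what such an edit looks like; the hyperparameter names and values are hypothetical, so refer to the actual file for the settings OTTO-benchmark exposes.

```python
# isotropic/learn/parameters/isotropic-53x53.py (illustrative excerpt; names hypothetical)
LEARNING_RATE = 1e-3    # optimizer step size for the DRL updates
BATCH_SIZE = 64         # minibatch size used during training
```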
PBVI policies in the benchmark have been computed using Sarsop and a custom implementation of Perseus. The policies obtained from these solvers can be used by OTTO-benchmark after conversion. To convert a policy, go to the converter-pbvi-to-otto folder, edit the file convert_perseus_file.py or convert_sarsop_file.py to set the correct paths at the beginning of the file, and run it.
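For illustration, the edit at the top of the converter script might look like the excerpt below; the variable names and paths are hypothetical (only the need to set correct paths comes from this README).

```python
# convert_sarsop_file.py (illustrative excerpt; variable names are hypothetical)
INPUT_FILE = "path/to/policy.out"    # alpha-vector policy produced by the solver
OUTPUT_DIR = "path/to/otto-policy"   # where the converted, OTTO-readable policy is written
```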
Note that the downloadable policies are already converted; there is no need to run this step on them.
If you use this software in your publications, you can cite the package as follows:
Loisy, A. and Eloy, C. (2022). OTTO: A Python package to simulate, solve and visualize the source-tracking POMDP. Journal of Open Source Software, 7(74), 4266, https://doi.org/10.21105/joss.04266
or if you use LaTeX:
@article{otto,
doi = {10.21105/joss.04266},
year = {2022},
volume = {7},
number = {74},
pages = {4266},
author = {Loisy, A. and Eloy, C.},
title = {OTTO: A Python package to simulate, solve and visualize the source-tracking POMDP},
journal = {Journal of Open Source Software}
}