First published: 2018/11/14

Abstract: While current benchmark reinforcement learning (RL) tasks have been useful to
drive progress in the field, they are in many ways poor substitutes for
learning with real-world data. By testing increasingly complex RL algorithms on
low-complexity simulation environments, we often end up with brittle RL
policies that generalize poorly beyond the very specific domain. To combat
this, we propose three new families of benchmark RL domains that contain some
of the complexity of the natural world, while still supporting fast and
extensive data acquisition. The proposed domains also permit a characterization
of generalization through fair train/test separation, and easy comparison and
replication of results. Through this work, we challenge the RL research
community to develop more robust algorithms that meet high standards of
evaluation.
This paper proposes three new families of reinforcement learning tasks, all of which require processing natural images or video:
- Task 1: An agent crawls across a hidden image, revealing a small window of it at each step, and must classify the image in as few steps as possible; for example, classifying the image as a cat after choosing to travel across the ears (see the sketch after this list).
- Task 2: The agent crawls across a fully visible image and must come to rest on its target, for example a cat in a scene containing several pets.
- Task 3: The agent plays an Atari game where the background has been replaced with a distracting video.
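
To make the structure of Task 1 concrete, here is a minimal sketch of such an environment. This is an illustration, not the paper's actual benchmark code: the class name, the Gym-style API, the window size, and the reward values (a small step penalty plus a terminal classification reward) are all assumptions.

```python
# Hypothetical sketch of a Task-1-style environment. Assumes a dataset of
# (image, label) pairs, e.g. CIFAR-10 arrays. The agent observes only a
# small window around its position; move actions reveal new patches, and
# classification actions end the episode.
import numpy as np

class MaskedImageClassifyEnv:
    MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, images, labels, window=8, max_steps=50):
        self.images, self.labels = images, labels
        self.window, self.max_steps = window, max_steps
        self.n_classes = int(labels.max()) + 1
        # Actions 0-3 move the window; actions 4..(4+n_classes-1) classify.
        self.n_actions = 4 + self.n_classes

    def reset(self):
        i = np.random.randint(len(self.images))
        self.img, self.label = self.images[i], self.labels[i]
        self.pos = np.array(self.img.shape[:2]) // 2  # start at the center
        self.t = 0
        return self._obs()

    def _obs(self):
        # Reveal only the window around the current position; the rest is hidden.
        r, c, w = self.pos[0], self.pos[1], self.window
        obs = np.zeros_like(self.img)
        obs[r:r + w, c:c + w] = self.img[r:r + w, c:c + w]
        return obs

    def step(self, action):
        self.t += 1
        if action < 4:  # move the viewing window, clipped to the image bounds
            self.pos = np.clip(
                self.pos + self.MOVES[action],
                0, np.array(self.img.shape[:2]) - self.window)
            # Small step penalty encourages classifying quickly.
            return self._obs(), -0.01, self.t >= self.max_steps, {}
        # Classification action: +1 if correct, -1 otherwise; episode ends.
        reward = 1.0 if (action - 4) == self.label else -1.0
        return self._obs(), reward, True, {}
```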
These tasks are easy to construct, but solving them requires large-scale visual processing or attention, which typically calls for deep networks. To address them, popular RL agents (PPO, A2C, and ACKTR) were augmented with a deep image-processing backbone (ResNet-18), yet they still performed poorly.
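
For illustration, the augmentation described above amounts to swapping the small CNN these agents usually use for a ResNet-18 feature extractor feeding shared policy and value heads. This is a sketch of that kind of architecture, not the authors' exact code; the use of torchvision, the head sizes, and training from scratch are assumptions.

```python
# Hedged sketch: a ResNet-18 backbone wired into an actor-critic network,
# as one might do to augment PPO/A2C-style agents for image-heavy tasks.
import torch.nn as nn
from torchvision.models import resnet18

class ResNetActorCritic(nn.Module):
    def __init__(self, n_actions):
        super().__init__()
        backbone = resnet18(weights=None)  # assumed: trained from scratch
        backbone.fc = nn.Identity()        # expose the 512-d feature vector
        self.backbone = backbone
        self.policy_head = nn.Linear(512, n_actions)  # action logits
        self.value_head = nn.Linear(512, 1)           # state-value estimate

    def forward(self, obs):
        # obs: float tensor of shape (batch, 3, H, W), scaled to [0, 1]
        features = self.backbone(obs)
        return self.policy_head(features), self.value_head(features)
```

The design point is that only the feature extractor changes; the surrounding policy-gradient machinery (advantage estimation, clipping, etc.) stays the same, which is why the paper's negative result suggests the bottleneck is not merely network capacity.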