Storage and Transfer Mechanism in Reinforcement Learning

The idea of reusing information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of reinforcement learning agents. In this work, we describe an approach to concisely store and represent learned task knowledge, and reuse it by allowing it to guide the exploration of an agent while it learns new tasks. To do so, we use a measure of similarity that is defined directly in the space of parameterized representations of the value functions. This similarity measure also serves as the basis for a variant of the growing self-organizing map algorithm, which is simultaneously used to store previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment and discuss possible extensions to this approach, along with potential applications where it could be particularly useful.
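
To make the storage idea concrete, the following is a minimal Python sketch, not the authors' algorithm. The abstract does not name the similarity measure or the map's update rule, so everything here is an assumption made for illustration: cosine similarity stands in for the similarity measure, `GrowingKnowledgeStore`, `grow_threshold`, and `learning_rate` are hypothetical names, and the neighborhood topology of a real growing self-organizing map is omitted. Each node holds one flattened value-function parameter vector; a new task either grows the store or is merged into its most similar node.

```python
import numpy as np


def cosine_similarity(w_a, w_b):
    """Cosine similarity between two parameter arrays (assumed measure)."""
    a = np.ravel(w_a).astype(float)
    b = np.ravel(w_b).astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # convention chosen here for all-zero parameters
    return float(np.dot(a, b) / denom)


class GrowingKnowledgeStore:
    """Hypothetical, simplified stand-in for a growing SOM (no neighborhood)."""

    def __init__(self, grow_threshold=0.5, learning_rate=0.5):
        self.grow_threshold = grow_threshold  # below this, add a new node
        self.learning_rate = learning_rate    # step size when merging
        self.nodes = []                       # stored parameter vectors

    def most_similar(self, w):
        """Return (index, similarity) of the closest node, or (None, -1.0)."""
        if not self.nodes:
            return None, -1.0
        sims = [cosine_similarity(w, node) for node in self.nodes]
        best = int(np.argmax(sims))
        return best, sims[best]

    def store(self, w):
        """Store task knowledge; return the index of the node that holds it."""
        w = np.ravel(w).astype(float)
        best, sim = self.most_similar(w)
        if best is None or sim < self.grow_threshold:
            self.nodes.append(w.copy())       # dissimilar task: grow the store
            return len(self.nodes) - 1
        # Similar task: move the existing node toward the new parameters.
        self.nodes[best] += self.learning_rate * (w - self.nodes[best])
        return best


store = GrowingKnowledgeStore(grow_threshold=0.5)
first = store.store(np.array([1.0, 0.0]))   # empty store: becomes node 0
second = store.store(np.array([0.0, 1.0]))  # orthogonal to node 0: new node
third = store.store(np.array([0.9, 0.1]))   # close to node 0: merged into it
```

In this sketch, a target task would query `most_similar` with its current parameter estimate and could use the returned node's value function to bias exploration; how that guidance is weighted against the agent's own policy is not specified in the abstract and is left out.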

Self-Organizing Maps as a Storage and Transfer Mechanism in Reinforcement Learning
Karimpanal TG & Bouffanais R
Adaptive Learning Agents (ALA-AAMAS 2018), Stockholm, Sweden, 1-6, 2018.