The Options framework

Summary

In reinforcement learning, an agent learns a policy: a mapping from states of the environment to the actions it should take. Instead of a single flat policy, the agent can learn a policy over options. Each option is a temporally extended action with its own policy and its own termination function, which decides when control returns to the policy over options. Sutton and others proposed the framework as early as 2000; what is the current state of options?
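To make the structure concrete, here is a minimal sketch in Python. It is not taken from the paper: the corridor environment, the two hand-written options, and every name (`Option`, `run_episode`, `choose_option`, `DOOR`, `GOAL`) are assumptions made purely for illustration, and nothing is learned here. It only shows how a policy over options hands control to an option's own policy until that option's termination function fires.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int   # position in a 1-D corridor (hypothetical toy environment)
Action = int  # -1 = step left, +1 = step right

DOOR, GOAL = 5, 10  # illustrative landmarks, not from the source


@dataclass
class Option:
    name: str
    policy: Callable[[State], Action]     # intra-option policy: state -> action
    termination: Callable[[State], bool]  # True when the option should stop


def step(state: State, action: Action) -> State:
    """Deterministic corridor dynamics, clamped to [0, GOAL]."""
    return max(0, min(GOAL, state + action))


# Two hand-written options; in practice these would be learned.
go_to_door = Option("go_to_door",
                    policy=lambda s: 1 if s < DOOR else -1,
                    termination=lambda s: s == DOOR)
go_to_goal = Option("go_to_goal",
                    policy=lambda s: 1,
                    termination=lambda s: s == GOAL)


def choose_option(state: State) -> Option:
    """Policy over options: picks which option to run from this state."""
    return go_to_door if state < DOOR else go_to_goal


def run_episode(policy_over_options: Callable[[State], Option],
                start: State,
                max_steps: int = 100) -> Tuple[State, List[Tuple[str, Action]]]:
    state, option, trace = start, None, []
    for _ in range(max_steps):
        if state == GOAL:
            break
        if option is None:                 # no option running: pick a new one
            option = policy_over_options(state)
        action = option.policy(state)      # the option's own policy acts
        state = step(state, action)
        trace.append((option.name, action))
        if option.termination(state):      # option ends, control returns
            option = None
    return state, trace


final_state, trace = run_episode(choose_option, start=0)
print(final_state, [name for name, _ in trace])
```

Starting at position 0, the policy over options is consulted only twice: once at the start and once at the door. Every other step is decided by whichever option is currently running, which is the temporal abstraction the framework is built around.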

References

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies (2019)
Peng, Xue Bin and Chang, Michael and Zhang, Grace and Abbeel, Pieter and Levine, Sergey
Learning Options in Reinforcement Learning (2002)
Stolle, Martin and Precup, Doina