Dyna, an Integrated Architecture for Learning, Planning, and Reacting
Sutton, Richard S.
SIGART Bulletin - 1991 via Local Bibsonomy

Summary by Tianxiao Zhao 7 years ago

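Main idea: planning is 'trying things in your head' using an internal model of the world, so the same reinforcement learning method can be applied to both real and model-generated experience.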

#### Diagram

https://i.imgur.com/vDAobu1.png

#### Generic algorithm

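- steps 1-3: a standard reinforcement learning (RL) agent: observe the state and choose an action, observe the resulting reward and new state, then apply RL to this experience
- step 4: learning of domain knowledge: update the action model from the same real experience
- step 5: planning: apply RL to hypothetical experiences generated by the action model, repeated several times per real step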

https://i.imgur.com/G6dZ00F.png
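
The five steps above can be sketched in a few lines of tabular code. This is a minimal illustration, not the paper's implementation: it assumes Q-learning as the RL method (a Dyna-Q-style agent), a deterministic action model stored as a dictionary, and uniform random search control. The corridor environment `step`, the function name `dyna_q`, and all hyperparameters (`k`, `alpha`, `gamma`, `epsilon`) are hypothetical choices made for the example.

```python
import random
from collections import defaultdict


def step(state, action):
    """Hypothetical 5-state corridor: action 1 moves right, action 0 moves left.
    Reaching state 4 yields reward 1 and ends the episode."""
    next_state = max(0, min(4, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state == 4


def dyna_q(episodes=30, k=10, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = (0, 1)
    q = defaultdict(float)  # action values, keyed by (state, action)
    model = {}              # action model: (state, action) -> (next_state, reward, done)

    def choose(s):
        # Epsilon-greedy with random tie-breaking (an assumed exploration rule).
        if rng.random() < epsilon:
            return rng.choice(actions)
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def learn(s, a, r, s2, done):
        # One Q-learning backup, used for real and hypothetical experience alike.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = choose(s)                  # step 1: observe state, choose action
            s2, r, done = step(s, a)       # step 2: observe reward and new state
            learn(s, a, r, s2, done)       # step 3: RL on the real experience
            model[(s, a)] = (s2, r, done)  # step 4: update the action model
            for _ in range(k):             # step 5: planning, k hypothetical steps
                hs, ha = rng.choice(list(model))   # search control (uniform)
                hs2, hr, hdone = model[(hs, ha)]   # model-predicted outcome
                learn(hs, ha, hr, hs2, hdone)      # RL on hypothetical experience
            s = s2
    return q, model


if __name__ == "__main__":
    q, model = dyna_q()
    for s in range(4):
        print(s, round(q[(s, 0)], 3), round(q[(s, 1)], 3))
```

Setting `k=0` in this sketch reduces it to plain Q-learning, which makes it easy to compare learning with and without planning on the same environment.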

#### Action model

The action model takes a state and an action as input and predicts the immediate resulting state and reward.

Search control is the process that selects which hypothetical state and action to try during planning.
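
As a rough illustration of these two pieces, the sketch below stores the action model as a lookup table and implements the simplest possible search control: picking a previously observed state-action pair uniformly at random. The class name `TabularActionModel` and its methods are hypothetical, and the sketch assumes a deterministic world with hashable, sortable states and actions.

```python
import random


class TabularActionModel:
    """Hypothetical deterministic action model: remembers the most recent
    outcome observed for each (state, action) pair."""

    def __init__(self):
        self._table = {}

    def update(self, state, action, next_state, reward):
        # Learn from one real experience.
        self._table[(state, action)] = (next_state, reward)

    def predict(self, state, action):
        # Input: state and action. Output: predicted next state and reward.
        return self._table[(state, action)]

    def sample_visited(self, rng):
        # Search control: choose a hypothetical (state, action) uniformly
        # among the pairs seen so far.
        return rng.choice(sorted(self._table))


model = TabularActionModel()
model.update(state=0, action=1, next_state=1, reward=0.0)
state, action = model.sample_visited(random.Random(0))
print(model.predict(state, action))
```

Uniform sampling is only one option; any rule that decides which hypothetical experience to generate next plays the role of search control.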

#### Potential problems

1. reliance on supervised learning
2. hierarchical planning
3. ambiguous and hidden state
4. ensuring variety in action
5. taskability
6. incorporation of prior knowledge
