Literature Review - 2
Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents
This paper compares agent behavior and results under four settings: 1. agents learn independently, 2. agents share information at every step, 3. agents share policies every n steps, 4. agents share their entire experience after every episode.
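As a rough illustration of setting 3, the sketch below shares policies every n steps by averaging the agents' Q-tables. Averaging is an assumption made here for illustration, since these notes do not record how the paper exchanges policies. The names average_q_tables and maybe_share, and the dict-based table layout keyed by (state, action), are hypothetical.

```python
def average_q_tables(q_tables):
    # Collect every (state, action) key seen by any agent.
    keys = set().union(*(q.keys() for q in q_tables))
    # Entries an agent has never visited count as 0.0 in the average.
    return {k: sum(q.get(k, 0.0) for q in q_tables) / len(q_tables) for k in keys}


def maybe_share(step, n, q_tables):
    # Assumed sharing rule: only every n-th step do the agents exchange policies.
    if step % n != 0:
        return q_tables
    averaged = average_q_tables(q_tables)
    # Each agent continues learning from its own copy of the averaged table.
    return [dict(averaged) for _ in q_tables]
```

Between sharing steps each agent keeps updating its own table, so the tables drift apart and are pulled back together at the next exchange.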
The environment is a hunter vs. prey game with one or two hunters and one or two prey. Agents use the Q-learning algorithm. Two reward situations were explored: 1. solo capture, 2. teamwork capture. The state space may expand when agents cooperate, which means more states to explore and slower learning.
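For reference, a minimal sketch of the one-step tabular Q-learning update the agents rely on. The action set, the epsilon-greedy exploration rule, and the values of alpha, gamma, and epsilon are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict
import random

ACTIONS = ["up", "down", "left", "right"]  # hypothetical action set


def make_q_table():
    # Unseen (state, action) pairs default to a value of 0.0.
    return defaultdict(float)


def choose_action(q, state, epsilon=0.1, rng=random):
    # Epsilon-greedy (an assumption): explore with probability epsilon,
    # otherwise pick the action with the highest current Q-value.
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    # One-step Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a').
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]
```

A state can be any hashable value, for example a tuple holding the prey's position relative to the hunter. Appending a partner's observation to that tuple multiplies the number of distinct states, which is the state-space expansion noted above.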
Result: Cooperative agents generally outperform independent agents; however, performance is worse when the information the other agent provides is insufficient. The average number of steps in the independent and cooperative scenarios may eventually converge to similar values. The experiment showing the largest advantage is the one in which agents learn from episodes provided by an expert (an already trained agent).