Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning (2021-09-23)