Is it possible to use openai's gym environments for multi-agent games? Specifically, I would like to model a card game with four players (agents). The player scoring a turn starts the next turn. How would I model the necessary coordination between the players (e.g. who's turn it is next)? Ultimately, I would like to use reinforcement learning on four agents that play against each other.
By Ayoosh Kathuria. OpenAI Gym comes packed with a lot of awesome environments, ranging from environments featuring classic control tasks to ones that let you train your agents to play Atari games like Breakout, Pacman, and Seaquest.
Yes, it is possible to use OpenAI gym environments for multi-agent games. Although in the OpenAI gym community there is no standardized interface for multi-agent environments, it is easy enough to build an OpenAI gym that supports this. For instance, in OpenAI's recent work on multi-agent particle environments they make a multi-agent environment that inherits from gym.Env which takes the following form:
class MultiAgentEnv(gym.Env):      def step(self, action_n):         obs_n    = list()         reward_n = list()         done_n   = list()         info_n   = {'n': []}         # ...         return obs_n, reward_n, done_n, info_n We can see that the step function takes a list of actions (one for each agent) and returns a list of observations, list of rewards, list of dones, while stepping the environment forwards. This interface is representative of Markov Game, in which all agents take actions at the same time and each observe their own subsequent observation, reward.
However, this kind of Markov Game interface may not be suitable for all multi-agent environments. In particular, turn-based games (such as card games) might be better cast as an alternating Markov Game, in which agents take turns (i.e. actions) one at a time. For this kind of environment, you may need to include which agent's turn it is in the representation of state, and your step function would then just take a single action, and return a single observation, reward and done.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With