plangym.videogames

Module that contains environments representing video games.

Submodules

Package Contents

Classes

AtariEnv

Create an environment to play OpenAI gym Atari games that uses ALE (the Arcade Learning Environment) as the emulator.

MontezumaEnv

Plangym implementation of the MontezumaEnv environment optimized for planning.

MarioEnv

Interface for using gym-super-mario-bros in plangym.

RetroEnv

Environment for playing gym-retro games.

class plangym.videogames.AtariEnv(name, frameskip=5, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', mode=0, difficulty=0, repeat_action_probability=0.0, full_action_space=False, render_mode=None, possible_to_win=False, wrappers=None, array_state=True, clone_seeds=False, **kwargs)[source]

Bases: plangym.videogames.env.VideogameEnv

Create an environment to play OpenAI gym Atari games that uses ALE (the Arcade Learning Environment) as the emulator.

Parameters
  • name (str) – Name of the environment. Follows standard gym syntax conventions.

  • frameskip (int) – Number of times an action will be applied for each step in dt.

  • episodic_life (bool) – Return end = True when losing a life.

  • autoreset (bool) – Restart environment when reaching a terminal state.

  • delay_setup (bool) – If True do not initialize the gym.Environment and wait for setup to be called later.

  • remove_time_limit (bool) – If True, remove the time limit from the environment.

  • obs_type (str) – One of {“rgb”, “ram”, “grayscale”}.

  • mode (int) – Integer or string indicating the game mode, when available.

  • difficulty (int) – Difficulty level of the game, when available.

  • repeat_action_probability (float) – Repeat the last action with this probability.

  • full_action_space (bool) – Whether to use the full range of possible actions, or only those available in the game.

  • render_mode (Optional[str]) – One of {None, “human”, “rgb_array”}.

  • possible_to_win (bool) – Set to True when the game can be won, i.e. when it has a terminal state that does not come from losing a life or going out of bounds.

  • wrappers (Iterable[plangym.core.wrap_callable]) – Wrappers that will be applied to the underlying OpenAI env. Every element of the iterable can be either a gym.Wrapper or a tuple containing (gym.Wrapper, kwargs).

  • array_state (bool) – Whether to return the state of the environment as a numpy array.

  • clone_seeds (bool) – Clone the random seed of the ALE emulator when reading/setting the state. False makes the environment stochastic.

Example:

>>> env = plangym.make(name="ALE/MsPacman-v5", difficulty=2, mode=1)
>>> state, obs = env.reset()
>>>
>>> states = [state.copy() for _ in range(10)]
>>> actions = [env.action_space.sample() for _ in range(10)]
>>>
>>> data = env.step_batch(states=states, actions=actions)
>>> new_states, observs, rewards, ends, infos = data
STATE_IS_ARRAY = True
property ale(self)

Return the ale interface of the underlying gym.Env.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> type(env.ale)
<class 'ale_py._ale_py.ALEInterface'>
property mode(self)

Return the selected game mode for the current environment.

Return type

int

property difficulty(self)

Return the selected difficulty for the current environment.

Return type

int

property repeat_action_probability(self)

Probability that the previous action is repeated instead of the chosen one (sticky actions).

Return type

float

property full_action_space(self)

If True, the action space corresponds to all possible actions in the Atari emulator.

Return type

bool

property observation_space(self)

Return the observation_space of the environment.

Return type

gym.spaces.Space

static _get_default_obs_type(name, obs_type)[source]

Return the observation type of the internal Atari gym environment.

Return type

str

get_lifes_from_info(self, info)[source]

Return the number of lives remaining in the current game.

Parameters

info (Dict[str, Any]) –

Return type

int

get_image(self)[source]

Return a numpy array containing the rendered view of the environment.

The image is a three-dimensional array interpreted as an RGB image with shape (height, width, 3). Ignores wrappers, as it loads the screen directly from the emulator.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> img = env.get_image()
>>> img.shape
(210, 160, 3)
Return type

numpy.ndarray

get_ram(self)[source]

Return a numpy array containing the content of the emulator’s RAM.

The RAM is a vector array interpreted as the memory of the emulator.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="grayscale")
>>> ram = env.get_ram()
>>> ram.shape, ram.dtype
((128,), dtype('uint8'))
Return type

numpy.ndarray

init_gym_env(self)[source]

Initialize the gym.Env instance that the Environment is wrapping.

Return type

gym.Env

get_state(self)[source]

Recover the internal state of the simulation.

If clone_seeds is False the environment will be stochastic. Cloning the full state ensures the environment is deterministic.

Example:

>>> env = AtariEnv(name="Qbert-v0")
>>> env.get_state() 
array([<ale_py._ale_py.ALEState object at 0x...>, None],
      dtype=object)

>>> env = AtariEnv(name="Qbert-v0", array_state=False)
>>> env.get_state() 
<ale_py._ale_py.ALEState object at 0x...>
Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Return type

None

Example:

>>> env = AtariEnv(name="Qbert-v0")
>>> state, obs = env.reset()
>>> new_state, obs, reward, end, info = env.step(env.sample_action(), state=state)
>>> assert not (state == new_state).all()
>>> env.set_state(state)
>>> (state == env.get_state()).all()
True
step_with_dt(self, action, dt=1)[source]

Take dt simulation steps and make the environment evolve in multiples of self.frameskip for a total of dt * self.frameskip steps.

Parameters
  • action (Union[numpy.ndarray, int, float]) – Chosen action applied to the environment.

  • dt (int) – Consecutive number of times that the action will be applied.

Returns

If state is None, returns (observs, reward, terminal, info); otherwise returns (new_state, observs, reward, terminal, info).

Example:

>>> env = AtariEnv(name="Pong-v0")
>>> obs = env.reset(return_state=False)
>>> obs, reward, end, info = env.step_with_dt(env.sample_action(), dt=7)
>>> assert not end
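The dt semantics above can be illustrated without an emulator. This is a minimal sketch of the step_with_dt contract, not plangym's implementation: MinimalEnv and the standalone step_with_dt function are hypothetical stand-ins showing how the same action is applied dt times while rewards accumulate.

```python
class MinimalEnv:
    """Hypothetical toy environment: one unit of reward per step."""

    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        reward = 1.0  # constant reward per emulator step
        terminal = self.t >= 100
        return self.t, reward, terminal, {}


def step_with_dt(env, action, dt=1):
    """Apply the same action dt times, accumulating the reward."""
    total_reward, terminal, info, obs = 0.0, False, {}, None
    for _ in range(dt):
        obs, reward, terminal, info = env.step(action)
        total_reward += reward
        if terminal:
            break  # stop early on a terminal state
    return obs, total_reward, terminal, info


env = MinimalEnv()
obs, reward, end, info = step_with_dt(env, action=0, dt=7)
assert obs == 7 and reward == 7.0 and not end
```

In plangym the loop additionally multiplies through self.frameskip, so dt=7 with frameskip=5 advances the emulator 35 frames.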
clone(self, **kwargs)[source]

Return a copy of the environment.

Return type

plangym.videogames.env.VideogameEnv

class plangym.videogames.MontezumaEnv(name='PlanMontezuma-v0', frameskip=1, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', mode=0, difficulty=0, repeat_action_probability=0.0, full_action_space=False, render_mode=None, possible_to_win=True, wrappers=None, array_state=True, clone_seeds=True, **kwargs)[source]

Bases: plangym.videogames.atari.AtariEnv

Plangym implementation of the MontezumaEnv environment optimized for planning.

Parameters
  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • remove_time_limit (bool) –

  • obs_type (str) –

  • mode (int) –

  • difficulty (int) –

  • repeat_action_probability (float) –

  • full_action_space (bool) –

  • render_mode (Optional[str]) –

  • possible_to_win (bool) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

  • array_state (bool) –

  • clone_seeds (bool) –

AVAILABLE_OBS_TYPES
_get_default_obs_type(self, name, obs_type)[source]

Return the observation type of the internal Atari gym environment.

Return type

str

init_gym_env(self)[source]

Initialize the gym.Env instance that the current class is wrapping.

Return type

CustomMontezuma

get_state(self)[source]

Recover the internal state of the simulation.

If clone_seeds is False the environment will be stochastic. Cloning the full state ensures the environment is deterministic.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Returns

None
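The get_state/set_state pair must satisfy a determinism contract: restoring a saved state and replaying the same action reproduces the same outcome. The CounterEnv class below is a hypothetical stand-in used only to illustrate that contract; plangym environments follow the same pattern with real emulator states.

```python
import numpy as np


class CounterEnv:
    """Hypothetical environment whose whole state is a single counter."""

    def __init__(self):
        self.value = 0

    def get_state(self):
        # A state must completely describe the environment at this moment.
        return np.array([self.value])

    def set_state(self, state):
        self.value = int(state[0])

    def step(self, action):
        self.value += action
        return self.value


env = CounterEnv()
state = env.get_state()
first = env.step(3)
env.set_state(state)  # rewind to the saved state
second = env.step(3)
assert first == second  # deterministic replay from a restored state
```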

class plangym.videogames.MarioEnv(name, movement_type='simple', original_reward=False, **kwargs)[source]

Bases: NesEnv

Interface for using gym-super-mario-bros in plangym.

Parameters
  • name (str) –

  • movement_type (str) –

  • original_reward (bool) –

AVAILABLE_OBS_TYPES
MOVEMENTS
get_state(self, state=None)[source]

Recover the internal state of the simulation.

A state must completely describe the Environment at a given moment.

Parameters

state (Optional[numpy.ndarray]) –

Return type

numpy.ndarray

init_gym_env(self)[source]

Initialize the NESEnv instance that the current class is wrapping.

Return type

gym.Env

_update_info(self, info)[source]
Parameters

info (Dict[str, Any]) –

Return type

Dict[str, Any]

_get_info(self)[source]
get_coords_obs(self, obs, info=None, **kwargs)[source]

Return the information contained in info as an observation if obs_type == “info”.

Parameters
  • obs (numpy.ndarray) –

  • info (Dict[str, Any]) –

Return type

numpy.ndarray

process_reward(self, reward, info, **kwargs)[source]

Return a custom reward based on the x, y coordinates and the level Mario is in.

Return type

float
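The idea of a position-based reward can be sketched as follows. The weights and info keys (world, stage, x_pos) here are illustrative assumptions, not plangym's actual formula: the only point is that progressing to a later stage dominates any x-position gain within the current one.

```python
def position_reward(info):
    """Hypothetical shaping reward: later worlds/stages dominate x progress."""
    world = info.get("world", 1)
    stage = info.get("stage", 1)
    x_pos = info.get("x_pos", 0)
    # Illustrative weights only; chosen so stage > any in-stage x position.
    return world * 25000 + stage * 5000 + x_pos


info_a = {"world": 1, "stage": 1, "x_pos": 40}  # far into stage 1-1
info_b = {"world": 1, "stage": 2, "x_pos": 0}   # just entered stage 1-2
assert position_reward(info_b) > position_reward(info_a)
```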

process_terminal(self, terminal, info, **kwargs)[source]

Return True if the state is terminal or Mario is dying.

Return type

bool

process_info(self, info, **kwargs)[source]

Add additional data to the info dictionary.

Return type

Dict[str, Any]

class plangym.videogames.RetroEnv(name, frameskip=5, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', render_mode=None, wrappers=None, **kwargs)[source]

Bases: plangym.videogames.env.VideogameEnv

Environment for playing gym-retro games.

Parameters
  • name (str) –

  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • remove_time_limit (bool) –

  • obs_type (str) –

  • render_mode (Optional[str]) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

AVAILABLE_OBS_TYPES
SINGLETON = True
__getattr__(self, item)[source]

Forward getattr to self.gym_env.
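The forwarding behavior relies on a standard Python pattern: __getattr__ is only invoked when normal attribute lookup fails, so the wrapper's own attributes take precedence and everything else falls through to the wrapped environment. A minimal sketch with hypothetical Inner/Wrapper classes:

```python
class Inner:
    """Stand-in for the wrapped retro environment."""

    buttons = ["A", "B", "START"]


class Wrapper:
    def __init__(self, gym_env):
        self.gym_env = gym_env

    def __getattr__(self, item):
        # Called only when normal lookup fails, so wrapper attrs win.
        return getattr(self.gym_env, item)


env = Wrapper(Inner())
assert env.buttons == ["A", "B", "START"]  # resolved on the inner env
```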

static get_win_condition(info)[source]

Get win condition for games that have the end of the screen available.

Parameters

info (Dict[str, Any]) –

Return type

bool

get_ram(self)[source]

Return the RAM of the emulator as a numpy array.

Return type

numpy.ndarray

clone(self, **kwargs)[source]

Return a copy of the environment with its initialization delayed.

Return type

RetroEnv

init_gym_env(self)[source]

Initialize the retro environment.

Return type

gym.Env

get_state(self)[source]

Get the state of the retro environment.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the state of the retro environment.

Parameters

state (numpy.ndarray) –

close(self)[source]

Close the underlying gym.Env.