plangym.videogames

Module that contains environments representing video games.

Submodules

Package Contents

Classes

AtariEnv

Create an environment to play OpenAI gym Atari games that uses ALE (the Arcade Learning Environment) as the emulator.

MontezumaEnv

Plangym implementation of the MontezumaEnv environment optimized for planning.

MarioEnv

Interface for using gym-super-mario-bros in plangym.

RetroEnv

Environment for playing gym-retro games.

class plangym.videogames.AtariEnv(name, frameskip=5, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', mode=0, difficulty=0, repeat_action_probability=0.0, full_action_space=False, render_mode=None, possible_to_win=False, wrappers=None, array_state=True, clone_seeds=False, **kwargs)[source]

Bases: plangym.videogames.env.VideogameEnv

Create an environment to play OpenAI gym Atari games that uses ALE (the Arcade Learning Environment) as the emulator.

Parameters
  • name (str) – Name of the environment. Follows standard gym syntax conventions.

  • frameskip (int) – Number of times an action will be applied for each step in dt.

  • episodic_life (bool) – Return end = True when losing a life.

  • autoreset (bool) – Restart environment when reaching a terminal state.

  • delay_setup (bool) – If True do not initialize the gym.Environment and wait for setup to be called later.

  • remove_time_limit (bool) – If True, remove the time limit from the environment.

  • obs_type (str) – One of {“rgb”, “ram”, “grayscale”}.

  • mode (int) – Integer or string indicating the game mode, when available.

  • difficulty (int) – Difficulty level of the game, when available.

  • repeat_action_probability (float) – Repeat the last action with this probability.

  • full_action_space (bool) – Whether to use the full range of possible actions, or only those available in the game.

  • render_mode (Optional[str]) – One of {None, “human”, “rgb_array”}.

  • possible_to_win (bool) – Set to True when the game can be won, i.e. when it has a terminal state that does not come from losing a life or going out of bounds.

  • wrappers (Iterable[plangym.core.wrap_callable]) – Wrappers that will be applied to the underlying OpenAI env. Every element of the iterable can be either a gym.Wrapper or a tuple containing (gym.Wrapper, kwargs).

  • array_state (bool) – Whether to return the state of the environment as a numpy array.

  • clone_seeds (bool) – Clone the random seed of the ALE emulator when reading/setting the state. False makes the environment stochastic.

Example:

>>> env = plangym.make(name="ALE/MsPacman-v5", difficulty=2, mode=1)
>>> state, obs = env.reset()
>>>
>>> states = [state.copy() for _ in range(10)]
>>> actions = [env.action_space.sample() for _ in range(10)]
>>>
>>> data = env.step_batch(states=states, actions=actions)
>>> new_states, observs, rewards, ends, infos = data
STATE_IS_ARRAY = True
property ale(self)

Return the ale interface of the underlying gym.Env.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> type(env.ale)
<class 'ale_py._ale_py.ALEInterface'>
property mode(self)

Return the selected game mode for the current environment.

Return type

int

property difficulty(self)

Return the selected difficulty for the current environment.

Return type

int

property repeat_action_probability(self)

Probability that the previous action is repeated instead of the chosen one (sticky actions).

Return type

float

property full_action_space(self)

If True, the action space corresponds to all possible actions in the Atari emulator.

Return type

bool

property observation_space(self)

Return the observation_space of the environment.

Return type

gym.spaces.Space

static _get_default_obs_type(name, obs_type)[source]

Return the observation type of the internal Atari gym environment.

Return type

str

get_lifes_from_info(self, info)[source]

Return the number of lives remaining in the current game.

Parameters

info (Dict[str, Any]) –

Return type

int

get_image(self)[source]

Return a numpy array containing the rendered view of the environment.

The image is a three-dimensional array interpreted as an RGB image with shape (height, width, 3). Ignores wrappers, as it loads the screen directly from the emulator.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> img = env.get_image()
>>> img.shape
(210, 160, 3)
Return type

numpy.ndarray

get_ram(self)[source]

Return a numpy array containing the content of the emulator’s RAM.

The RAM is a vector array interpreted as the memory of the emulator.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="grayscale")
>>> ram = env.get_ram()
>>> ram.shape, ram.dtype
((128,), dtype('uint8'))
Return type

numpy.ndarray

init_gym_env(self)[source]

Initialize the gym.Env instance that the Environment is wrapping.

Return type

gym.Env

get_state(self)[source]

Recover the internal state of the simulation.

If clone_seeds is False the environment will be stochastic. Cloning the full state ensures the environment is deterministic.

Example:

>>> env = AtariEnv(name="Qbert-v0")
>>> env.get_state() 
array([<ale_py._ale_py.ALEState object at 0x...>, None],
      dtype=object)

>>> env = AtariEnv(name="Qbert-v0", array_state=False)
>>> env.get_state() 
<ale_py._ale_py.ALEState object at 0x...>
Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Return type

None

Example:

>>> env = AtariEnv(name="Qbert-v0")
>>> state, obs = env.reset()
>>> new_state, obs, reward, end, info = env.step(env.sample_action(), state=state)
>>> assert not (state == new_state).all()
>>> env.set_state(state)
>>> (state == env.get_state()).all()
True
step_with_dt(self, action, dt=1)[source]

Take dt simulation steps and make the environment evolve in multiples of self.frameskip for a total of dt * self.frameskip steps.

Parameters
  • action (Union[numpy.ndarray, int, float]) – Chosen action applied to the environment.

  • dt (int) – Consecutive number of times that the action will be applied.

Returns

If state is None, returns (observs, reward, terminal, info); otherwise returns (new_state, observs, reward, terminal, info).

Example:

>>> env = AtariEnv(name="Pong-v0")
>>> obs = env.reset(return_state=False)
>>> obs, reward, end, info = env.step_with_dt(env.sample_action(), dt=7)
>>> assert not end
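The dt semantics above can be illustrated without an emulator. This is a minimal sketch of the step_with_dt contract, not plangym's implementation: MinimalEnv and the standalone step_with_dt function are hypothetical stand-ins showing how the same action is applied dt times while rewards accumulate.

```python
class MinimalEnv:
    """Hypothetical toy environment: one unit of reward per step."""

    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        reward = 1.0  # constant reward per emulator step
        terminal = self.t >= 100
        return self.t, reward, terminal, {}


def step_with_dt(env, action, dt=1):
    """Apply the same action dt times, accumulating the reward."""
    total_reward, terminal, info, obs = 0.0, False, {}, None
    for _ in range(dt):
        obs, reward, terminal, info = env.step(action)
        total_reward += reward
        if terminal:
            break  # stop early on a terminal state
    return obs, total_reward, terminal, info


env = MinimalEnv()
obs, reward, end, info = step_with_dt(env, action=0, dt=7)
assert obs == 7 and reward == 7.0 and not end
```

In plangym the loop additionally multiplies through self.frameskip, so dt=7 with frameskip=5 advances the emulator 35 frames.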
clone(self, **kwargs)[source]

Return a copy of the environment.

Return type

plangym.videogames.env.VideogameEnv

class plangym.videogames.MontezumaEnv(name='PlanMontezuma-v0', frameskip=1, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', mode=0, difficulty=0, repeat_action_probability=0.0, full_action_space=False, render_mode=None, possible_to_win=True, wrappers=None, array_state=True, clone_seeds=True, **kwargs)[source]

Bases: plangym.videogames.atari.AtariEnv

Plangym implementation of the MontezumaEnv environment optimized for planning.

Parameters
  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • remove_time_limit (bool) –

  • obs_type (str) –

  • mode (int) –

  • difficulty (int) –

  • repeat_action_probability (float) –

  • full_action_space (bool) –

  • render_mode (Optional[str]) –

  • possible_to_win (bool) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

  • array_state (bool) –

  • clone_seeds (bool) –

AVAILABLE_OBS_TYPES
_get_default_obs_type(self, name, obs_type)[source]

Return the observation type of the internal Atari gym environment.

Return type

str

init_gym_env(self)[source]

Initialize the gym.Env instance that the current class is wrapping.

Return type

CustomMontezuma

get_state(self)[source]

Recover the internal state of the simulation.

If clone_seeds is False the environment will be stochastic. Cloning the full state ensures the environment is deterministic.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Returns

None
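The get_state/set_state pair must satisfy a determinism contract: restoring a saved state and replaying the same action reproduces the same outcome. The CounterEnv class below is a hypothetical stand-in used only to illustrate that contract; plangym environments follow the same pattern with real emulator states.

```python
import numpy as np


class CounterEnv:
    """Hypothetical environment whose whole state is a single counter."""

    def __init__(self):
        self.value = 0

    def get_state(self):
        # A state must completely describe the environment at this moment.
        return np.array([self.value])

    def set_state(self, state):
        self.value = int(state[0])

    def step(self, action):
        self.value += action
        return self.value


env = CounterEnv()
state = env.get_state()
first = env.step(3)
env.set_state(state)  # rewind to the saved state
second = env.step(3)
assert first == second  # deterministic replay from a restored state
```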

class plangym.videogames.MarioEnv(name, movement_type='simple', original_reward=False, **kwargs)[source]

Bases: NesEnv

Interface for using gym-super-mario-bros in plangym.

Parameters
  • name (str) –

  • movement_type (str) –

  • original_reward (bool) –

AVAILABLE_OBS_TYPES
MOVEMENTS
get_state(self, state=None)[source]

Recover the internal state of the simulation.

A state must completely describe the Environment at a given moment.

Parameters

state (Optional[numpy.ndarray]) –

Return type

numpy.ndarray

init_gym_env(self)[source]

Initialize the NESEnv instance that the current class is wrapping.

Return type

gym.Env

_update_info(self, info)[source]
Parameters

info (Dict[str, Any]) –

Return type

Dict[str, Any]

_get_info(self)[source]
get_coords_obs(self, obs, info=None, **kwargs)[source]

Return the information contained in info as an observation if obs_type == “info”.

Parameters
  • obs (numpy.ndarray) –

  • info (Dict[str, Any]) –

Return type

numpy.ndarray

process_reward(self, reward, info, **kwargs)[source]

Return a custom reward based on the x, y coordinates and the level Mario is in.

Return type

float
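The idea of a position-based reward can be sketched as follows. The weights and info keys (world, stage, x_pos) here are illustrative assumptions, not plangym's actual formula: the only point is that progressing to a later stage dominates any x-position gain within the current one.

```python
def position_reward(info):
    """Hypothetical shaping reward: later worlds/stages dominate x progress."""
    world = info.get("world", 1)
    stage = info.get("stage", 1)
    x_pos = info.get("x_pos", 0)
    # Illustrative weights only; chosen so stage > any in-stage x position.
    return world * 25000 + stage * 5000 + x_pos


info_a = {"world": 1, "stage": 1, "x_pos": 40}  # far into stage 1-1
info_b = {"world": 1, "stage": 2, "x_pos": 0}   # just entered stage 1-2
assert position_reward(info_b) > position_reward(info_a)
```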

process_terminal(self, terminal, info, **kwargs)[source]

Return True if the state is terminal or Mario is dying.

Return type

bool

process_info(self, info, **kwargs)[source]

Add additional data to the info dictionary.

Return type

Dict[str, Any]

class plangym.videogames.RetroEnv(name, frameskip=5, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', render_mode=None, wrappers=None, **kwargs)[source]

Bases: plangym.videogames.env.VideogameEnv

Environment for playing gym-retro games.

Parameters
  • name (str) –

  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • remove_time_limit (bool) –

  • obs_type (str) –

  • render_mode (Optional[str]) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

AVAILABLE_OBS_TYPES
SINGLETON = True
__getattr__(self, item)[source]

Forward getattr to self.gym_env.
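The forwarding behavior relies on a standard Python pattern: __getattr__ is only invoked when normal attribute lookup fails, so the wrapper's own attributes take precedence and everything else falls through to the wrapped environment. A minimal sketch with hypothetical Inner/Wrapper classes:

```python
class Inner:
    """Stand-in for the wrapped retro environment."""

    buttons = ["A", "B", "START"]


class Wrapper:
    def __init__(self, gym_env):
        self.gym_env = gym_env

    def __getattr__(self, item):
        # Called only when normal lookup fails, so wrapper attrs win.
        return getattr(self.gym_env, item)


env = Wrapper(Inner())
assert env.buttons == ["A", "B", "START"]  # resolved on the inner env
```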

static get_win_condition(info)[source]

Get win condition for games that have the end of the screen available.

Parameters

info (Dict[str, Any]) –

Return type

bool

get_ram(self)[source]

Return the RAM of the emulator as a numpy array.

Return type

numpy.ndarray

clone(self, **kwargs)[source]

Return a copy of the environment with its initialization delayed.

Return type

RetroEnv

init_gym_env(self)[source]

Initialize the retro environment.

Return type

gym.Env

get_state(self)[source]

Get the state of the retro environment.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the state of the retro environment.

Parameters

state (numpy.ndarray) –

close(self)[source]

Close the underlying gym.Env.