plangym.videogames.atari

Implement the plangym API for Atari environments.

Module Contents

Classes

AtariEnv

Create an environment to play OpenAI gym Atari Games that uses AtariALE as the emulator.

AtariPyEnvironment

Create an environment to play OpenAI gym Atari Games that uses AtariPy as the emulator.

Functions

ale_to_ram(ale)

Return the RAM of the ALE emulator.

plangym.videogames.atari.ale_to_ram(ale)[source]

Return the RAM of the ALE emulator.

Return type

numpy.ndarray
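
Example (a minimal sketch; it assumes an AtariEnv from this module so that env.ale exposes the ALE interface, and that the Atari 2600 RAM is 128 bytes):

>>> env = AtariEnv(name="ALE/MsPacman-v5")
>>> state, obs = env.reset()
>>> ram = ale_to_ram(env.ale)
>>> ram.shape
(128,)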

class plangym.videogames.atari.AtariEnv(name, frameskip=5, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', mode=0, difficulty=0, repeat_action_probability=0.0, full_action_space=False, render_mode=None, possible_to_win=False, wrappers=None, array_state=True, clone_seeds=False, **kwargs)[source]

Bases: plangym.videogames.env.VideogameEnv

Create an environment to play OpenAI gym Atari Games that uses AtariALE as the emulator.

Parameters
  • name (str) – Name of the environment. Follows standard gym syntax conventions.

  • frameskip (int) – Number of times an action will be applied for each step in dt.

  • episodic_life (bool) – Return end = True when losing a life.

  • autoreset (bool) – Restart environment when reaching a terminal state.

  • delay_setup (bool) – If True, do not initialize the gym.Environment and wait for setup to be called later.

  • remove_time_limit (bool) – If True, remove the time limit from the environment.

  • obs_type (str) – One of {“rgb”, “ram”, “grayscale”}.

  • mode (int) – Integer or string indicating the game mode, when available.

  • difficulty (int) – Difficulty level of the game, when available.

  • repeat_action_probability (float) – Repeat the last action with this probability.

  • full_action_space (bool) – Whether to use the full range of possible actions or only those available in the game.

  • render_mode (Optional[str]) – One of {None, “human”, “rgb_array”}.

  • possible_to_win (bool) – Whether the game can be won, i.e. it has a terminal state reached by finishing the game rather than by losing a life or going out of bounds.

  • wrappers (Iterable[plangym.core.wrap_callable]) – Wrappers that will be applied to the underlying OpenAI env. Every element of the iterable can be either a gym.Wrapper or a tuple containing (gym.Wrapper, kwargs).

  • array_state (bool) – Whether to return the state of the environment as a numpy array.

  • clone_seeds (bool) – Clone the random seed of the ALE emulator when reading/setting the state. False makes the environment stochastic.

Example:

>>> import plangym
>>> env = plangym.make(name="ALE/MsPacman-v5", difficulty=2, mode=1)
>>> state, obs = env.reset()
>>>
>>> states = [state.copy() for _ in range(10)]
>>> actions = [env.action_space.sample() for _ in range(10)]
>>>
>>> data = env.step_batch(states=states, actions=actions)
>>> new_states, observs, rewards, ends, infos = data
STATE_IS_ARRAY = True
property ale(self)

Return the ALE interface of the underlying gym.Env.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> type(env.ale)
<class 'ale_py._ale_py.ALEInterface'>
property mode(self)

Return the selected game mode for the current environment.

Return type

int

property difficulty(self)

Return the selected difficulty for the current environment.

Return type

int
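
Example (a minimal sketch for the mode and difficulty properties above; it assumes the values passed to the constructor are reported back unchanged):

>>> env = AtariEnv(name="ALE/MsPacman-v5", mode=1, difficulty=2)
>>> env.mode, env.difficulty
(1, 2)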

property repeat_action_probability(self)

Probability that the emulator repeats the previous action instead of the one supplied (sticky actions).

Return type

float

property full_action_space(self)

If True, the action space corresponds to all possible actions in the Atari emulator.

Return type

bool
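
Example (a minimal sketch for repeat_action_probability and full_action_space; it assumes the constructor arguments are echoed back by these properties):

>>> env = AtariEnv(name="ALE/MsPacman-v5", repeat_action_probability=0.25, full_action_space=True)
>>> env.repeat_action_probability
0.25
>>> env.full_action_space
True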

property observation_space(self)

Return the observation_space of the environment.

Return type

gym.spaces.Space
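
Example (a minimal sketch; the (128,) shape is an assumption based on obs_type="ram" exposing the 128-byte Atari RAM as the observation):

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> env.observation_space.shape
(128,)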

static _get_default_obs_type(name, obs_type)[source]

Return the observation type of the internal Atari gym environment.

Return type

str

get_lifes_from_info(self, info)[source]

Return the number of lives remaining in the current game.

Parameters

info (Dict[str, Any]) –

Return type

int
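
Example (a rough sketch; the keys plangym stores in info are not documented here, so the returned value is only bound, not checked):

>>> env = AtariEnv(name="ALE/MsPacman-v5")
>>> state, obs = env.reset()
>>> new_state, obs, reward, end, info = env.step(env.sample_action(), state=state)
>>> lives = env.get_lifes_from_info(info)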

get_image(self)[source]

Return a numpy array containing the rendered view of the environment.

Image is a three-dimensional array interpreted as an RGB image with dimensions (Height, Width, RGB channels). Ignores wrappers, as it loads the screen directly from the emulator.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="ram")
>>> img = env.get_image()
>>> img.shape
(210, 160, 3)
Return type

numpy.ndarray

get_ram(self)[source]

Return a numpy array containing the content of the emulator’s RAM.

The RAM is a vector array interpreted as the memory of the emulator.

Example:

>>> env = AtariEnv(name="ALE/MsPacman-v5", obs_type="grayscale")
>>> ram = env.get_ram()
>>> ram.shape, ram.dtype
((128,), dtype('uint8'))
Return type

numpy.ndarray

init_gym_env(self)[source]

Initialize the gym.Env instance that the Environment is wrapping.

Return type

gym.Env
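
Example (a speculative sketch; it assumes the method can be called directly to build a fresh underlying gym.Env):

>>> env = AtariEnv(name="ALE/MsPacman-v5")
>>> gym_env = env.init_gym_env()
>>> hasattr(gym_env, "step")
True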

get_state(self)[source]

Recover the internal state of the simulation.

If clone_seeds is False, the environment will be stochastic. Cloning the full state ensures the environment is deterministic.

Example:

>>> env = AtariEnv(name="Qbert-v0")
>>> env.get_state() 
array([<ale_py._ale_py.ALEState object at 0x...>, None],
      dtype=object)

>>> env = AtariEnv(name="Qbert-v0", array_state=False)
>>> env.get_state() 
<ale_py._ale_py.ALEState object at 0x...>
Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Return type

None

Example:

>>> env = AtariEnv(name="Qbert-v0")
>>> state, obs = env.reset()
>>> new_state, obs, reward, end, info = env.step(env.sample_action(), state=state)
>>> assert not (state == new_state).all()
>>> env.set_state(state)
>>> (state == env.get_state()).all()
True
step_with_dt(self, action, dt=1)[source]

Take dt simulation steps and make the environment evolve in multiples of self.frameskip for a total of dt * self.frameskip steps.

Parameters
  • action (Union[numpy.ndarray, int, float]) – Chosen action applied to the environment.

  • dt (int) – Consecutive number of times that the action will be applied.

Returns

If state is None, returns (observs, reward, terminal, info); otherwise returns (new_state, observs, reward, terminal, info).

Example:

>>> env = AtariEnv(name="Pong-v0")
>>> obs = env.reset(return_state=False)
>>> obs, reward, end, info = env.step_with_dt(env.sample_action(), dt=7)
>>> assert not end
clone(self, **kwargs)[source]

Return a copy of the environment.

Return type

plangym.videogames.env.VideogameEnv
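
Example (a minimal sketch; it assumes the copy preserves the constructor arguments of the original environment):

>>> env = AtariEnv(name="ALE/MsPacman-v5", frameskip=4)
>>> env_copy = env.clone()
>>> env_copy.frameskip
4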

class plangym.videogames.atari.AtariPyEnvironment(name, frameskip=5, episodic_life=False, autoreset=True, delay_setup=False, remove_time_limit=True, obs_type='rgb', mode=0, difficulty=0, repeat_action_probability=0.0, full_action_space=False, render_mode=None, possible_to_win=False, wrappers=None, array_state=True, clone_seeds=False, **kwargs)[source]

Bases: AtariEnv

Create an environment to play OpenAI gym Atari Games that uses AtariPy as the emulator.

Parameters
  • name (str) –

  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • remove_time_limit (bool) –

  • obs_type (str) –

  • mode (int) –

  • difficulty (int) –

  • repeat_action_probability (float) –

  • full_action_space (bool) –

  • render_mode (Optional[str]) –

  • possible_to_win (bool) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

  • array_state (bool) –

  • clone_seeds (bool) –

get_state(self)[source]

Recover the internal state of the simulation.

If clone_seeds is False, the environment will be stochastic. Cloning the full state ensures the environment is deterministic.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Returns

None

Return type

None

get_ram(self)[source]

Return a numpy array containing the content of the emulator’s RAM.

The RAM is a vector array interpreted as the memory of the emulator.

Return type

numpy.ndarray
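
Example (a rough sketch; it assumes an atari-py installation and an old-style environment id, which is what this legacy class targets):

>>> env = AtariPyEnvironment(name="MsPacman-v0")
>>> state, obs = env.reset()
>>> ram = env.get_ram()
>>> ram.shape
(128,)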