plangym.control

Module that contains environments representing control tasks.

Package Contents

Classes

BalloonEnv

This class implements the 'BalloonLearningEnvironment-v0' environment released by Google in the balloon_learning_environment package.

Box2DEnv

Common interface for working with Box2D environments released by gym.

ClassicControl

Environment for OpenAI gym classic control environments.

DMControlEnv

Wrap the dm_control library, allowing its use in planning problems.

LunarLander

Fast LunarLander that follows the plangym API.

class plangym.control.BalloonEnv(name='BalloonLearningEnvironment-v0', renderer=None, array_state=True, **kwargs)[source]

Bases: plangym.core.PlangymEnv

This class implements the 'BalloonLearningEnvironment-v0' environment released by Google in the balloon_learning_environment package.

For more information about the environment, please refer to https://github.com/google/balloon-learning-environment.

Parameters
  • name (str) –

  • array_state (bool) –

AVAILABLE_RENDER_MODES
AVAILABLE_OBS_TYPES
STATE_IS_ARRAY = False
get_state(self)[source]

Get the state of the environment.

Return type

Any

set_state(self, state)[source]

Set the state of the environment.

Parameters

state (Any) –

Return type

None

seed(self, seed=None)[source]

Ignore seeding until next release.

Parameters

seed (int) –
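
A minimal usage sketch. The reset and step return signatures follow the general plangym convention of returning the environment state alongside the observation; treat the exact tuples as assumptions rather than guarantees::

    from plangym.control import BalloonEnv

    # Requires the balloon_learning_environment package to be installed.
    env = BalloonEnv(name="BalloonLearningEnvironment-v0")
    state, obs = env.reset(return_state=True)

    action = env.action_space.sample()
    # Passing a state asks step() to return the resulting state as well.
    state, obs, reward, end, info = env.step(action, state=state)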

class plangym.control.Box2DEnv(name, frameskip=1, autoreset=True, wrappers=None, delay_setup=False, remove_time_limit=True, render_mode=None, episodic_life=False, obs_type=None, return_image=False, **kwargs)[source]

Bases: plangym.core.PlangymEnv

Common interface for working with Box2D environments released by gym.

Parameters
  • name (str) –

  • frameskip (int) –

  • autoreset (bool) –

  • wrappers (Iterable[wrap_callable]) –

  • delay_setup (bool) –

  • render_mode (Optional[str]) –

get_state(self)[source]

Recover the internal state of the simulation.

A state must completely describe the environment at a given moment.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Returns

None

Return type

None
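
The get_state/set_state pair enables the snapshot-and-rewind pattern that planning algorithms rely on. A sketch, assuming 'BipedalWalker-v3' is available in the installed gym version::

    from plangym.control import Box2DEnv

    env = Box2DEnv(name="BipedalWalker-v3")
    env.reset(return_state=False)

    snapshot = env.get_state()              # full simulation state
    obs, reward, end, info = env.step(env.action_space.sample())
    env.set_state(snapshot)                 # rewind to the snapshot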

class plangym.control.ClassicControl(name, frameskip=1, autoreset=True, wrappers=None, delay_setup=False, remove_time_limit=True, render_mode=None, episodic_life=False, obs_type=None, return_image=False, **kwargs)[source]

Bases: plangym.core.PlangymEnv

Environment for OpenAI gym classic control environments.

Parameters
  • name (str) –

  • frameskip (int) –

  • autoreset (bool) –

  • wrappers (Iterable[wrap_callable]) –

  • delay_setup (bool) –

  • render_mode (Optional[str]) –

get_state(self)[source]

Recover the internal state of the environment.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the environment.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Returns

None
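
A short sketch for classic control tasks; the frameskip semantics (repeating the chosen action frameskip times per step) are assumed from the shared PlangymEnv interface::

    from plangym.control import ClassicControl

    env = ClassicControl(name="CartPole-v0", frameskip=4)
    env.reset(return_state=False)

    state = env.get_state()                 # numpy.ndarray snapshot
    obs, reward, end, info = env.step(env.action_space.sample())
    env.set_state(state)                    # restore the saved state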

class plangym.control.DMControlEnv(name='cartpole-balance', frameskip=1, episodic_life=False, autoreset=True, wrappers=None, delay_setup=False, visualize_reward=True, domain_name=None, task_name=None, render_mode=None, obs_type=None, remove_time_limit=None)[source]

Bases: plangym.core.PlangymEnv

Wrap the dm_control library, allowing its use in planning problems.

The dm_control library is DeepMind's software stack for physics-based simulation and reinforcement learning environments, using MuJoCo physics.

For more information about the environment, please refer to https://github.com/deepmind/dm_control

This class allows using dm_control environments in planning problems, and supports parallel and vectorized execution of the environments.

Parameters
  • name (str) –

  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

  • delay_setup (bool) –

  • visualize_reward (bool) –

  • obs_type (Optional[str]) –

DEFAULT_OBS_TYPE = coords
property physics(self)

Alias for gym_env.physics.

property domain_name(self)

Return the name of the domain in the current simulation.

Return type

str

property task_name(self)

Return the name of the task in the current simulation.

Return type

str
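
A sketch of how the name relates to domain_name and task_name; the 'domain-task' split is assumed from the default name 'cartpole-balance'::

    from plangym.control import DMControlEnv

    env = DMControlEnv(name="cartpole-balance")
    assert env.domain_name == "cartpole"
    assert env.task_name == "balance"

    # Equivalent explicit form using the constructor arguments.
    env = DMControlEnv(domain_name="cartpole", task_name="balance")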

static _parse_names(name, domain_name, task_name)[source]

Return the name, domain name, and task name of the environment.

init_gym_env(self)[source]

Initialize the environment instance (dm_control) that the current class is wrapping.

setup(self)[source]

Initialize the target gym.Env instance.

_init_action_space(self)[source]

Define the action space of the environment.

This method determines the spectrum of possible actions the agent can perform. The action space consists of a grid representing the Cartesian product of the closed intervals defined by the user.

_init_obs_space_coords(self)[source]

Define the observation space of the environment.

action_spec(self)[source]

Alias for the environment’s action_spec.

get_image(self)[source]

Return a numpy array containing the rendered view of the environment.

Square matrices are interpreted as greyscale images. Three-dimensional arrays are interpreted as RGB images with shape (height, width, RGB).

Return type

numpy.ndarray

render(self, mode='human')[source]

Store all the RGB images rendered to be shown when the show_game function is called.

Parameters

mode – 'rgb_array' returns an RGB image stored in a numpy array. 'human' stores the rendered image in a viewer to be shown when show_game is called.

Returns

numpy.ndarray when mode == 'rgb_array'. True when mode == 'human'.

show_game(self, sleep=0.05)[source]

Render the collected RGB images.

When 'human' is passed to the render method, it stores a collection of RGB images inside the self.viewer attribute. This method displays those collected images.

Parameters

sleep (float) –
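
A sketch tying render and show_game together, as described above::

    from plangym.control import DMControlEnv

    env = DMControlEnv(name="cartpole-balance")
    env.reset(return_state=False)

    frame = env.render(mode="rgb_array")    # single RGB frame as an array
    for _ in range(10):
        env.step(env.action_space.sample())
        env.render(mode="human")            # frames accumulate in env.viewer
    env.show_game(sleep=0.05)               # replay the collected frames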

get_coords_obs(self, obs, **kwargs)[source]

Get the environment observation from a time_step object.

Parameters
  • obs – Time step object returned after stepping the environment.

  • **kwargs – Ignored

Returns

Numpy array containing the environment observation.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the state of the simulator to the target State.

Parameters

state (numpy.ndarray) – numpy.ndarray containing the information about the state to be set.

Returns

None

Return type

None

get_state(self)[source]

Return a tuple containing the three arrays that characterize the state of the system.

The tuple contains the position of the robot, its velocity, and the control variables currently being applied.

Returns

Tuple of numpy arrays containing all the information needed to describe the current state of the simulation.

Return type

numpy.ndarray
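
Together with set_state, this allows rewinding the MuJoCo physics to an earlier point. A sketch (the environment name is illustrative)::

    from plangym.control import DMControlEnv

    env = DMControlEnv(name="walker-walk")
    env.reset(return_state=False)

    state = env.get_state()                 # position, velocity, control
    env.step(env.action_space.sample())
    env.set_state(state)                    # physics rewinds to the snapshot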

apply_action(self, action)[source]

Transform the returned time_step object into a gym-compatible tuple.

static _time_step_to_obs(time_step)[source]

Stack observation values as a horizontal sequence.

Concatenate the observations into a single array, making it easier to compute distances.

Return type

numpy.ndarray

close(self)[source]

Tear down the environment and close rendering.

class plangym.control.LunarLander(name=None, frameskip=1, episodic_life=True, autoreset=True, wrappers=None, delay_setup=False, deterministic=False, continuous=False, render_mode=None, remove_time_limit=None, **kwargs)[source]

Bases: plangym.core.PlangymEnv

Fast LunarLander that follows the plangym API.

Parameters
  • name (str) –

  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

  • delay_setup (bool) –

  • deterministic (bool) –

  • continuous (bool) –

  • render_mode (Optional[str]) –

property deterministic(self)

Return True if the LunarLander simulation is deterministic.

Return type

bool

property continuous(self)

Return True if the LunarLander agent takes continuous actions as input.

Return type

bool

init_gym_env(self)[source]

Initialize the target gym.Env instance.

Return type

FastGymLunarLander

get_state(self)[source]

Recover the internal state of the simulation.

A state must completely describe the environment at a given moment.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state (numpy.ndarray) – Target state to be set in the environment.

Returns

None

Return type

None

process_terminal(self, terminal, obs=None, **kwargs)[source]

Return the terminal condition considering the lunar lander state.

Return type

bool
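
A closing sketch for LunarLander; the effect of the deterministic flag is assumed from the property documented above::

    from plangym.control import LunarLander

    env = LunarLander(deterministic=True, continuous=False)
    env.reset(return_state=False)
    assert env.deterministic and not env.continuous

    obs, reward, end, info = env.step(env.action_space.sample())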