plangym.control.dm_control

Implement the plangym API for dm_control environments.

Module Contents

Classes

DMControlEnv

Wrap the dm_control library, allowing its use in planning problems.

Attributes

novideo_mode

plangym.control.dm_control.novideo_mode = False
class plangym.control.dm_control.DMControlEnv(name='cartpole-balance', frameskip=1, episodic_life=False, autoreset=True, wrappers=None, delay_setup=False, visualize_reward=True, domain_name=None, task_name=None, render_mode=None, obs_type=None, remove_time_limit=None)[source]

Bases: plangym.core.PlangymEnv

Wrap the dm_control library, allowing its use in planning problems.

The dm_control library is DeepMind's software stack for physics-based simulation and reinforcement learning environments, built on MuJoCo physics.

For more information about the environment, please refer to https://github.com/deepmind/dm_control

This class makes dm_control environments usable in planning problems, and supports parallel and vectorized execution of the environments.
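A minimal usage sketch (the exact reset/step return layout can vary across plangym versions, and the availability of sample_action on the wrapper is an assumption):

    from plangym.control.dm_control import DMControlEnv

    # "cartpole-balance" follows dm_control's "<domain>-<task>" naming convention.
    env = DMControlEnv(name="cartpole-balance", frameskip=1)
    state, obs = env.reset(return_state=True)  # also return the simulator state
    action = env.sample_action()
    # Passing `state` makes the transition plannable: the simulator is first
    # set to `state`, then stepped, and the successor state is returned.
    state, obs, reward, end, info = env.step(action, state=state)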

Parameters
  • name (str) –

  • frameskip (int) –

  • episodic_life (bool) –

  • autoreset (bool) –

  • wrappers (Iterable[plangym.core.wrap_callable]) –

  • delay_setup (bool) –

  • visualize_reward (bool) –

  • domain_name (Optional[str]) –

  • task_name (Optional[str]) –

  • render_mode (Optional[str]) –

  • obs_type (Optional[str]) –

  • remove_time_limit (Optional[bool]) –

DEFAULT_OBS_TYPE = 'coords'
property physics(self)

Alias for gym_env.physics.
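Because physics aliases the wrapped dm_control Physics object, the raw MuJoCo buffers are directly accessible. A small sketch (attribute paths follow the standard dm_control Physics API):

    from plangym.control.dm_control import DMControlEnv

    env = DMControlEnv(name="cartpole-balance")
    env.reset()
    qpos = env.physics.data.qpos  # generalized positions
    qvel = env.physics.data.qvel  # generalized velocities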

property domain_name(self)

Return the name of the domain (the simulated agent/model) in the current simulation.

Return type

str

property task_name(self)

Return the name of the task in the current simulation.

Return type

str

static _parse_names(name, domain_name, task_name)[source]

Return the name, domain name, and task name of the environment.
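The resolution follows the "<domain>-<task>" convention visible in the default name. An illustrative sketch (the return order follows the docstring above; the exact precedence between name and the explicit arguments is an assumption):

    # "walker-walk" is split into domain "walker" and task "walk";
    # explicit domain_name/task_name arguments are assumed to take precedence.
    name, domain, task = DMControlEnv._parse_names("walker-walk", None, None)
    assert (domain, task) == ("walker", "walk")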

init_gym_env(self)[source]

Initialize the environment instance (dm_control) that the current class is wrapping.

setup(self)[source]

Initialize the target gym.Env instance.

_init_action_space(self)[source]

Define the action space of the environment.

This method determines the range of possible actions the agent can perform. The action space consists of a grid representing the Cartesian product of the closed intervals defined by the user.
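A sketch of how such a space can be derived from a dm_control action spec (using gym.spaces.Box here is an assumption about the implementation, not a guarantee):

    import numpy as np
    from gym import spaces

    def box_from_action_spec(action_spec) -> spaces.Box:
        # dm_control action specs are BoundedArray objects that carry the
        # closed interval [minimum, maximum] for every actuator.
        return spaces.Box(
            low=np.asarray(action_spec.minimum, dtype=np.float32),
            high=np.asarray(action_spec.maximum, dtype=np.float32),
            dtype=np.float32,
        )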

_init_obs_space_coords(self)[source]

Define the observation space of the environment.

action_spec(self)[source]

Alias for the environment’s action_spec.

get_image(self)[source]

Return a numpy array containing the rendered view of the environment.

Square matrices are interpreted as greyscale images. Three-dimensional arrays are interpreted as RGB images with axes (height, width, RGB channels).

Return type

numpy.ndarray
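For example, the returned array can be inspected or displayed directly (the matplotlib usage is illustrative):

    from plangym.control.dm_control import DMControlEnv

    env = DMControlEnv(name="cartpole-balance")
    env.reset()
    img = env.get_image()
    print(img.shape)  # e.g. (height, width, 3) for an RGB frame

    import matplotlib.pyplot as plt
    plt.imshow(img)
    plt.show()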

render(self, mode='human')[source]

Store the rendered RGB images so they can be shown when the show_game function is called.

Parameters

mode – 'rgb_array' returns an RGB image stored in a numpy array. 'human' stores the rendered image in a viewer to be shown when show_game is called.

Returns

numpy.ndarray when mode == 'rgb_array'; True when mode == 'human'.

show_game(self, sleep=0.05)[source]

Render the collected RGB images.

When 'human' is selected as the mode argument of the render method, the rendered RGB images are stored in the self.viewer attribute. This method displays those collected images.

Parameters

sleep (float) –
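A typical pattern renders in 'human' mode while stepping and replays the frames afterwards (a sketch; the step return values are ignored here, and sample_action is assumed):

    from plangym.control.dm_control import DMControlEnv

    env = DMControlEnv(name="cartpole-balance")
    env.reset()
    for _ in range(100):
        env.step(env.sample_action())
        env.render(mode="human")  # store the current RGB frame in env.viewer
    env.show_game(sleep=0.05)     # replay the stored frames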

get_coords_obs(self, obs, **kwargs)[source]

Get the environment observation from a time_step object.

Parameters
  • obs – Time step object returned after stepping the environment.

  • **kwargs – Ignored

Returns

Numpy array containing the environment observation.

Return type

numpy.ndarray

set_state(self, state)[source]

Set the state of the simulator to the target state.

Parameters

state (numpy.ndarray) – numpy.ndarray containing the information about the state to be set.

Returns

None

Return type

None

get_state(self)[source]

Return a tuple containing the three arrays that characterize the state of the system.

The tuple contains the position of the robot, its velocity, and the control variables currently being applied.

Returns

Tuple of numpy arrays containing all the information needed to describe the current state of the simulation.

Return type

numpy.ndarray
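Together, get_state and set_state enable the clone/restore pattern at the heart of planning: snapshot the simulator, explore a branch, then rewind. A minimal sketch (sample_action is assumed):

    from plangym.control.dm_control import DMControlEnv

    env = DMControlEnv(name="cartpole-balance")
    env.reset()
    snapshot = env.get_state()   # capture positions, velocities, and controls

    for _ in range(10):          # explore a trajectory from the snapshot
        env.step(env.sample_action())

    env.set_state(snapshot)      # rewind the simulator to the snapshot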

apply_action(self, action)[source]

Transform the returned time_step object into a gym-compatible tuple.

static _time_step_to_obs(time_step)[source]

Stack observation values as a horizontal sequence.

Concatenate the observations into a single array, making it easier to compute distances between observations.

Return type

numpy.ndarray
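A sketch of the flattening idea (dm_control's TimeStep.observation is a dict of named arrays; the exact stacking used internally is an assumption):

    import numpy as np

    def flatten_time_step_obs(time_step) -> np.ndarray:
        # Ravel each named observation (scalar or array) and concatenate
        # them into one flat vector, so distances between observations
        # reduce to simple vector norms.
        return np.hstack([np.ravel(v) for v in time_step.observation.values()])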

close(self)[source]

Tear down the environment and close rendering.