src.plangym.control.dm_control#

Implement the plangym API for dm_control environments.

Attributes#

novideo_mode

Classes#

DMControlEnv

Wrap the dm_control library so that it can be used in planning problems.

Module Contents#

src.plangym.control.dm_control.novideo_mode = False#

class src.plangym.control.dm_control.DMControlEnv(name='cartpole-balance', frameskip=1, episodic_life=False, autoreset=True, wrappers=None, delay_setup=False, visualize_reward=True, domain_name=None, task_name=None, render_mode='rgb_array', obs_type=None, remove_time_limit=None, return_image=False)[source]#

Bases: plangym.core.PlangymEnv

Wrap the dm_control library so that it can be used in planning problems.

The dm_control library is DeepMind's software stack for physics-based simulation and reinforcement learning environments, using MuJoCo physics.

For more information about the environment, please refer to the deepmind/dm_control repository.

This class makes dm_control environments usable in planning problems. It supports parallel and vectorized execution of the environments.

Parameters:

name (str)
frameskip (int)
episodic_life (bool)
autoreset (bool)
wrappers (Iterable[plangym.core.wrap_callable] | None)
delay_setup (bool)
visualize_reward (bool)
obs_type (str | None)
return_image (bool)

DEFAULT_OBS_TYPE = 'coords'#

_visualize_reward#

viewer = []#

_viewer = None#

property physics#
Alias for gym_env.physics.

property domain_name: str#

Return the name of the agent in the current simulation.

Return type:: str

property task_name: str#

Return the name of the task in the current simulation.

Return type:: str

static _parse_names(name, domain_name, task_name)[source]#: Return the name, domain name, and task name of the project.
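
The default `name='cartpole-balance'` together with `domain_name=None` and `task_name=None` suggests that the combined name can be split into its two parts. The following is a minimal sketch of that idea, assuming the format `<domain>-<task>`; `parse_names` is a hypothetical stand-in and not necessarily how `_parse_names` is implemented.

```python
def parse_names(name, domain_name=None, task_name=None):
    """Hypothetical stand-in for _parse_names; not plangym's actual code."""
    if domain_name is not None and task_name is not None:
        # Explicit names win; rebuild the combined name from them.
        return f"{domain_name}-{task_name}", domain_name, task_name
    # Assumption: the combined name has the form "<domain>-<task>".
    domain, sep, task = name.partition("-")
    if not sep or not domain or not task:
        raise ValueError(f"expected '<domain>-<task>', got {name!r}")
    return name, domain, task


print(parse_names("cartpole-balance"))
```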

init_gym_env()[source]#: Initialize the environment instance (dm_control) that the current class is wrapping.

setup()[source]#: Initialize the target gym.Env instance.

_init_action_space()[source]#

Define the action space of the environment.

This method determines the range of actions that the agent can perform. The action space consists of a grid representing the Cartesian product of the closed intervals defined by the user.
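
To make the "Cartesian product of closed intervals" concrete, here is a small plain-Python sketch of such a box. The bounds `lows` and `highs` are made up for illustration; in the wrapped environment they would presumably come from `action_spec()`, which is an assumption.

```python
from itertools import product

# Hypothetical bounds for a two-dimensional action: one closed interval per dimension.
lows = [-1.0, 0.0]
highs = [1.0, 2.0]


def contains(action):
    """True when every component lies inside its closed interval."""
    return all(lo <= a <= hi for a, lo, hi in zip(action, lows, highs))


def clip(action):
    """Project an action onto the box by clamping each component."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lows, highs)]


# The corners of the box are the Cartesian product of the interval endpoints.
corners = list(product(*zip(lows, highs)))
```

Clamping is one common way to keep a proposed action valid; whether the wrapper clips or rejects out-of-range actions is not stated here.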

_init_obs_space_coords()[source]#: Define the observation space of the environment.

action_spec()[source]#: Alias for the environment’s action_spec.

get_image()[source]#

Return a numpy array containing the rendered view of the environment.

Square matrices are interpreted as a greyscale image. Three-dimensional arrays are interpreted as RGB images with channels (Height, Width, RGB).

Return type:: numpy.ndarray
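
The shape convention above can be checked with a few lines of code. This sketch works on a shape tuple (such as the `.shape` of the returned array); `describe_image` is a hypothetical helper, not part of plangym.

```python
def describe_image(shape):
    """Classify an array shape using the convention described above."""
    if len(shape) == 2:
        return "greyscale"  # (height, width)
    if len(shape) == 3 and shape[2] == 3:
        return "rgb"  # (height, width, 3)
    raise ValueError(f"unexpected image shape: {shape}")
```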

render(mode=None)[source]#

Render the environment.

Store all the RGB images rendered to be shown when the show_game function is called.

Parameters:: mode – With rgb_array, return an RGB image stored in a numpy array. With human, store the rendered image in a viewer to be shown when show_game is called.
Returns:: numpy.ndarray when mode == rgb_array. True when mode == human.

show_game(sleep=0.05)[source]#

Render the collected RGB images.

When the ‘human’ option is passed to the render method, render stores each RGB image in the self.viewer attribute. This method then iterates over that collection to display the stored images.

Parameters:: sleep (float)
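
The interplay between render and show_game follows a record-then-replay pattern. The toy class below sketches that pattern under simplifying assumptions: frames are arbitrary Python objects, and the `display` callback is a hypothetical parameter added so that the example runs without a window. It is not plangym's implementation.

```python
import time


class FrameRecorder:
    """Toy stand-in for the render/show_game pair; not plangym's code."""

    def __init__(self):
        self.viewer = []  # frames collected in "human" mode

    def render(self, frame, mode="rgb_array"):
        if mode == "rgb_array":
            return frame  # hand the image straight back
        if mode == "human":
            self.viewer.append(frame)  # keep it for later playback
            return True
        raise ValueError(f"unsupported mode: {mode!r}")

    def show_game(self, sleep=0.05, display=print):
        # Replay the stored frames in order, pausing between them.
        for frame in self.viewer:
            display(frame)
            time.sleep(sleep)


recorder = FrameRecorder()
for step in range(3):
    recorder.render(f"frame-{step}", mode="human")
shown = []
recorder.show_game(sleep=0.0, display=shown.append)
```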

get_coords_obs(obs, **kwargs)[source]#

Get the environment observation from a time_step object.

Parameters:

obs – Time step object returned after stepping the environment.
**kwargs – Ignored

Returns:

Numpy array containing the environment observation.

Return type:

numpy.ndarray

set_state(state)[source]#

Set the state of the simulator to the target State.

Parameters:: state (numpy.ndarray) – numpy.ndarray containing the information about the state to be set.
Returns:: None
Return type:: None

get_state()[source]#

Return the state of the environment.

Return a tuple containing the three arrays that characterize the state of the system.

The tuple contains the position of the robot, its velocity, and the control variables currently being applied.
Returns:: Tuple of numpy arrays containing all the information needed to describe the current state of the simulation.

Return type:: numpy.ndarray
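
In planning, get_state and set_state are typically used together: save a checkpoint, try several actions from it, and restore it afterwards. The sketch below illustrates that loop with a hypothetical one-dimensional `ToySimulator`; its dynamics and its `step` method are invented for the example and do not reflect dm_control or plangym.

```python
class ToySimulator:
    """Hypothetical one-dimensional point mass; stands in for the wrapped physics."""

    def __init__(self):
        self.position, self.velocity, self.control = 0.0, 0.0, 0.0

    def get_state(self):
        return (self.position, self.velocity, self.control)

    def set_state(self, state):
        self.position, self.velocity, self.control = state

    def step(self, action, dt=0.1):
        self.control = action
        self.velocity += action * dt
        self.position += self.velocity * dt
        return self.position


sim = ToySimulator()
sim.step(1.0)
checkpoint = sim.get_state()

# Try each candidate action from the same checkpoint and keep the best one.
outcomes = {}
for action in (-1.0, 0.0, 1.0):
    sim.set_state(checkpoint)
    outcomes[action] = sim.step(action)
best_action = max(outcomes, key=outcomes.get)
sim.set_state(checkpoint)  # leave the simulator where the search started
```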

apply_action(action)[source]#: Transform the returned time_step object to a compatible gym tuple.
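
As a rough illustration of such a conversion, the sketch below maps a simplified time step to the classic four-element gym tuple `(obs, reward, terminal, info)`. The `TimeStep` namedtuple is a reduced stand-in for the dm_env object, `to_gym_tuple` is a hypothetical name, and the exact tuple that plangym returns is not stated here.

```python
from collections import namedtuple

# Stand-in for dm_env's TimeStep, reduced to the fields used here.
TimeStep = namedtuple("TimeStep", ["observation", "reward", "is_last"])


def to_gym_tuple(time_step):
    """Hypothetical conversion to (obs, reward, terminal, info)."""
    # The first time step of an episode carries no reward, so treat None as 0.
    reward = 0.0 if time_step.reward is None else float(time_step.reward)
    return time_step.observation, reward, bool(time_step.is_last), {}
```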

static _time_step_to_obs(time_step)[source]#

Stack observation values as a horizontal sequence.

Concatenate the observations into a single array, which makes calculating distances easier.

Return type:: numpy.ndarray
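
dm_control tasks typically return observations as an ordered mapping of named arrays. The sketch below shows, in plain Python lists rather than numpy arrays, why flattening that mapping into one sequence makes distances straightforward to compute. `flatten_observation` and the example observations are hypothetical.

```python
import math
from collections import OrderedDict


def flatten_observation(observation):
    """Concatenate every value of the mapping into one flat list of floats."""
    flat = []
    for value in observation.values():
        if isinstance(value, (int, float)):
            flat.append(float(value))  # scalar entry
        else:
            flat.extend(float(v) for v in value)  # one-dimensional entry
    return flat


obs_a = OrderedDict(position=[0.0, 1.0], velocity=[0.5])
obs_b = OrderedDict(position=[3.0, 5.0], velocity=[0.5])

# With both observations flattened, the Euclidean distance is a single call.
distance = math.dist(flatten_observation(obs_a), flatten_observation(obs_b))
```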

close()[source]#: Tear down the environment and close rendering.