plangym.vectorization.env

Plangym API implementation.

Module Contents

Classes

VectorizedEnv

Base class that defines the API for working with vectorized environments.

class plangym.vectorization.env.VectorizedEnv(env_class, name, frameskip=1, autoreset=True, delay_setup=False, n_workers=8, **kwargs)[source]

Bases: plangym.core.PlangymEnv, abc.ABC

Base class that defines the API for working with vectorized environments.

A vectorized environment allows stepping several copies of the environment in parallel when calling step_batch.

It creates a local copy of the environment that is the target of all the other methods of PlanEnv. In practice, a VectorizedEnv acts as a wrapper around an environment initialized with the parameters provided when calling __init__ (a construction sketch follows the parameter list below).

Parameters
  • name (str) –

  • frameskip (int) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • n_workers (int) –
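VectorizedEnv is abstract, so in practice a concrete subclass is instantiated. A minimal construction sketch; the ParallelEnv and ClassicControl import paths are assumptions about plangym's layout, not guaranteed by this page:

    # Hedged sketch: the module paths below are assumptions, not documented here.
    from plangym.vectorization.parallel import ParallelEnv  # assumed concrete subclass
    from plangym.classic_control import ClassicControl      # assumed env_class

    env = ParallelEnv(
        env_class=ClassicControl,  # class each worker instantiates
        name="CartPole-v0",        # forwarded to env_class together with **kwargs
        n_workers=4,               # parallel copies used by step_batch
    )
    state, obs = env.reset(return_state=True)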

property n_workers(self)

Return the number of parallel processes used to run step_batch.

Return type

int

property plan_env(self)

Environment that is wrapped by the current instance.

Return type

plangym.core.PlanEnv

property obs_shape(self)

Tuple containing the shape of the observations returned by the Environment.

Return type

Tuple[int]

property action_shape(self)

Tuple containing the shape of the actions applied to the Environment.

Return type

Tuple[int]

property action_space(self)

Return the action_space of the environment.

Return type

gym.spaces.Space

property observation_space(self)

Return the observation_space of the environment.

Return type

gym.spaces.Space

property gym_env(self)

Return the instance of the environment that is being wrapped by plangym.

__getattr__(self, item)[source]

Forward attributes to the wrapped environment.

static split_similar_chunks(vector, n_chunks)[source]

Split an indexable object into chunks of similar size.

Parameters
  • vector (Union[list, numpy.ndarray]) – Target indexable object to be split.

  • n_chunks (int) – Number of similar chunks.

Returns

Generator that returns the chunks created after splitting the target object.

Return type

Generator[Union[list, numpy.ndarray], None, None]
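For example, splitting ten items into three chunks yields pieces of near-equal size (the exact sizes are an implementation detail):

    import numpy

    from plangym.vectorization.env import VectorizedEnv

    vector = numpy.arange(10)
    for chunk in VectorizedEnv.split_similar_chunks(vector, n_chunks=3):
        print(chunk)  # three arrays of roughly len(vector) / n_chunks items each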

classmethod batch_step_data(cls, actions, states, dt, batch_size)[source]

Make batches of step data to distribute across workers.
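The return layout is not documented on this page, but the idea is to split each per-transition input into one chunk per worker. A conceptual sketch of that idea, not plangym's exact implementation:

    import numpy

    from plangym.vectorization.env import VectorizedEnv

    def sketch_batch_step_data(actions, states, dt, batch_size):
        # Conceptual only: chunk actions, states, and dt per worker.
        states = [None] * len(actions) if states is None else states
        if not isinstance(dt, numpy.ndarray):
            dt = numpy.full(len(actions), dt, dtype=int)
        split = VectorizedEnv.split_similar_chunks
        return (
            split(states, n_chunks=batch_size),
            split(actions, n_chunks=batch_size),
            split(dt, n_chunks=batch_size),
        )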

static unpack_transitions(results, return_states)[source]

Aggregate the results of stepping across different workers.

Parameters
  • results (list) –

  • return_states (bool) –

create_env_callable(self, **kwargs)[source]

Return a callable that initializes the environment that is being vectorized.

Return type

Callable[..., plangym.core.PlanEnv]

setup(self)[source]

Initialize the target environment with the parameters provided at __init__.

Return type

None

step(self, action, state=None, dt=1, return_state=None)[source]

Step the environment applying a given action from an arbitrary state.

If state is not provided, the signature matches the step method from OpenAI gym.

Parameters
  • action (numpy.ndarray) – Array containing the action to be applied.

  • state (numpy.ndarray) – State to be set before stepping the environment.

  • dt (int) – Consecutive number of times to apply the given action.

  • return_state (bool) – Whether to return the state in the returned tuple. If None, step will return the state only if a state was passed as a parameter.

Returns

If state is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
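For example, assuming env is an initialized VectorizedEnv subclass (see the construction sketch above):

    # Plain gym-style call: no state passed, so no state returned.
    obs, reward, end, info = env.step(env.sample_action())

    # Stepping from an explicit state also returns the resulting state.
    state, obs = env.reset(return_state=True)
    new_state, obs, reward, end, info = env.step(env.sample_action(), state=state, dt=2)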

reset(self, return_state=True)[source]

Reset the environment and return the first observation, or the first (state, obs) tuple.

Parameters

return_state (bool) – If True, also return the initial state of the env.

Returns

The observation of the environment if return_state is False; otherwise, the (state, obs) tuple after reset.
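The two signatures, assuming env is initialized as in the construction sketch above:

    state, obs = env.reset(return_state=True)  # planning-style reset
    obs = env.reset(return_state=False)        # plain gym-style reset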

get_state(self)[source]

Recover the internal state of the simulation.

A state completely describes the Environment at a given moment.

Returns

State of the simulation.

set_state(self, state)[source]

Set the internal state of the simulation.

Parameters

state – Target state to be set in the environment.
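Together, get_state and set_state allow snapshotting and restoring the simulation, which is the core of planning workflows. A sketch, assuming env is initialized as above:

    saved = env.get_state()
    for _ in range(10):               # explore from the snapshot
        env.step(env.sample_action())
    env.set_state(saved)              # rewind to the saved state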

render(self, mode='human')[source]

Render the environment using OpenGL. This wraps the OpenAI render method.

get_image(self)[source]

Return a numpy array containing the rendered view of the environment.

Square matrices are interpreted as greyscale images. Three-dimensional arrays are interpreted as RGB images with shape (height, width, RGB).

Return type

numpy.ndarray
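For instance, grabbing a frame for logging, assuming env is initialized as above:

    frame = env.get_image()
    # RGB frames have shape (height, width, 3); 2-D arrays are greyscale.
    print(frame.shape, frame.dtype)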

step_with_dt(self, action, dt=1)[source]

Take dt simulation steps and make the environment evolve in multiples of self.frameskip for a total of dt * self.frameskip steps.

Parameters
  • action (Union[numpy.ndarray, int, float]) – Chosen action applied to the environment.

  • dt (int) – Consecutive number of times that the action will be applied.

Returns

If state is None, returns (observs, reward, terminal, info); otherwise returns (new_state, observs, reward, terminal, info).

Return type

tuple
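For example, an instance created with frameskip=4 (a hedged sketch, assuming env was built that way) advances the underlying simulation by dt * frameskip = 12 frames here:

    obs, reward, terminal, info = env.step_with_dt(env.sample_action(), dt=3)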

sample_action(self)[source]

Return a valid action that can be used to step the Environment.

Implementing this method is optional, and it’s only intended to make the testing process of the Environment easier.

step_batch(self, actions, states=None, dt=1, return_state=None)[source]

Vectorized version of the step method.

It allows stepping a vector of states and actions. The signature and behaviour are the same as step, but it takes lists of states, actions, and dts as input.

Parameters
  • actions (numpy.ndarray) – Iterable containing the different actions to be applied.

  • states (numpy.ndarray) – Iterable containing the different states to be set.

  • dt (Union[numpy.ndarray, int]) – int or array containing the frameskips that will be applied.

  • return_state (bool) – Whether to return the states in the returned tuple. If None, step_batch will return the states only if states were passed as a parameter.

Returns

If states is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
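For example, branching several copies from the same starting state, assuming env is initialized as above:

    import copy

    state, obs = env.reset(return_state=True)
    actions = [env.sample_action() for _ in range(8)]

    # Without states: gym-style batched results.
    observs, rewards, ends, infos = env.step_batch(actions)

    # With states: the resulting states are returned as well. Copying the
    # state per transition is a conservative assumption, not a documented
    # requirement.
    states = [copy.deepcopy(state) for _ in range(8)]
    new_states, observs, rewards, ends, infos = env.step_batch(actions, states=states)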

clone(self, **kwargs)[source]

Return a copy of the environment.

Return type

plangym.core.PlanEnv

abstract sync_states(self, state)[source]

Synchronize the workers’ states with the state of self.gym_env.

Set all the states of the different workers of the internal BatchEnv to the same state as the internal Environment used to apply the non-vectorized steps.

Parameters

state (None) –

abstract make_transitions(self, actions, states, dt, return_state=None)[source]

Implement the logic for stepping the environment in parallel.

Parameters

return_state (bool) –
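A hedged sketch of what a concrete subclass could look like. LoopEnv is hypothetical: it "vectorizes" by looping serially over plain environment copies, it assumes one transition per worker, and it returns raw per-copy tuples rather than the aggregated layout plangym's real subclasses produce:

    from plangym.vectorization.env import VectorizedEnv

    class LoopEnv(VectorizedEnv):
        # Hypothetical subclass, for illustration only.

        def setup(self):
            super().setup()  # initialize the local target environment first
            create = self.create_env_callable()
            self._envs = [create() for _ in range(self.n_workers)]

        def sync_states(self, state=None):
            # Copy the local plan_env state into every worker copy.
            state = self.plan_env.get_state() if state is None else state
            for env in self._envs:
                env.set_state(state)

        def make_transitions(self, actions, states=None, dt=1, return_state=None):
            # Step each copy with its own action/state/dt; assumes
            # len(actions) == self.n_workers for simplicity.
            states = [None] * len(actions) if states is None else states
            dts = dt if hasattr(dt, "__len__") else [dt] * len(actions)
            return [
                env.step(a, state=s, dt=d, return_state=return_state)
                for env, a, s, d in zip(self._envs, actions, states, dts)
            ]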