src.plangym.vectorization.env#

Plangym API implementation.

Classes#

VectorizedEnv

Base class that defines the API for working with vectorized environments.

Module Contents#

class src.plangym.vectorization.env.VectorizedEnv(env_class, name, frameskip=1, autoreset=True, delay_setup=False, n_workers=8, **kwargs)[source]#

Bases: plangym.core.PlangymEnv, abc.ABC

Base class that defines the API for working with vectorized environments.

A vectorized environment steps several copies of the environment in parallel when step_batch is called.

It creates a local copy of the environment that is the target of all the other PlanEnv methods. In practice, a VectorizedEnv acts as a wrapper around an environment initialized with the parameters passed to __init__.

Parameters:

name (str)
frameskip (int)
autoreset (bool)
delay_setup (bool)
n_workers (int)

_n_workers#

_env_class#

_env_kwargs#

_plangym_env: plangym.core.PlangymEnv | plangym.core.PlanEnv | None = None#

SINGLETON#

STATE_IS_ARRAY#

property n_workers: int#

Return the number of processes that run step_batch in parallel.

Return type:: int

property plan_env: plangym.core.PlanEnv#

Environment that is wrapped by the current instance.

Return type:: plangym.core.PlanEnv

property obs_shape: tuple[int]#

Tuple containing the shape of the observations returned by the Environment.

Return type:: tuple[int]

property action_shape: tuple[int]#

Tuple containing the shape of the actions applied to the Environment.

Return type:: tuple[int]

property action_space: gymnasium.spaces.Space#

Return the action_space of the environment.

Return type:: gymnasium.spaces.Space

property observation_space: gymnasium.spaces.Space#

Return the observation_space of the environment.

Return type:: gymnasium.spaces.Space

property gym_env#

Return the instance of the environment that is being wrapped by plangym.

__getattr__(item)[source]#: Forward attributes to the wrapped environment.
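The forwarding behaviour can be illustrated with a minimal sketch. Inner and Wrapper are hypothetical stand-ins written for this example, not plangym classes, and the real VectorizedEnv may resolve attributes differently.

```python
class Inner:
    """Hypothetical wrapped environment."""

    def __init__(self):
        self.frameskip = 4

    def describe(self):
        return "inner environment"


class Wrapper:
    """Hypothetical wrapper that forwards unknown attributes."""

    def __init__(self, env):
        self._env = env

    def __getattr__(self, item):
        # Python calls __getattr__ only when normal lookup fails, so
        # attributes defined on the wrapper itself take precedence.
        return getattr(self._env, item)


wrapped = Wrapper(Inner())
print(wrapped.frameskip)   # resolved on the inner environment
print(wrapped.describe())  # method resolved on the inner environment
```

Because the fallback only runs on failed lookups, the wrapper can override any attribute simply by defining it.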

static split_similar_chunks(vector, n_chunks)[source]#

Split an indexable object into similar chunks.

Parameters:

vector (list | numpy.ndarray) – Target indexable object to be split.
n_chunks (int) – Number of similar chunks.

Returns:

Generator that returns the chunks created after splitting the target object.

Return type:

Generator[list | numpy.ndarray, None, None]
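One way to produce similar chunks is sketched below. It assumes the chunk size is the ceiling of len(vector) / n_chunks; the library's actual rule may differ, and this standalone function is an illustration rather than the plangym implementation.

```python
import math


def split_similar_chunks(vector, n_chunks):
    """Yield consecutive slices of `vector` with (almost) equal length."""
    # Assumption: chunk size is ceil(len / n_chunks), so only the last
    # chunk can be shorter. max(1, ...) keeps range() valid for empty input.
    chunk_size = max(1, math.ceil(len(vector) / n_chunks))
    for start in range(0, len(vector), chunk_size):
        yield vector[start:start + chunk_size]


chunks = list(split_similar_chunks(list(range(10)), n_chunks=3))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Under this rule the generator can yield fewer than n_chunks chunks, for example two chunks of two items when four items are split with n_chunks=3.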

classmethod batch_step_data(actions, states, dt, batch_size)[source]#: Make batches of step data to distribute across workers.

static unpack_transitions(results, return_states)[source]#

Aggregate the results of stepping across different workers.

Parameters:

results (list)
return_states (bool)
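The aggregation might look like the following sketch. It assumes each worker returns one tuple of per-field sequences, with the states field present only when return_states is True; that layout is an assumption made for this example, not something the documentation above specifies.

```python
def unpack_transitions(results, return_states):
    """Concatenate per-worker results field by field."""
    # Assumed layout: (states?, observs, rewards, ends, infos) per worker.
    n_fields = 5 if return_states else 4
    merged = [[] for _ in range(n_fields)]
    for worker_result in results:
        for field, values in zip(merged, worker_result):
            field.extend(values)
    return tuple(merged)


# Two hypothetical workers that stepped two and one environments.
worker_a = (["o0", "o1"], [1.0, 0.0], [False, False], [{}, {}])
worker_b = (["o2"], [0.5], [True], [{}])
observs, rewards, ends, infos = unpack_transitions(
    [worker_a, worker_b], return_states=False
)
print(observs, rewards, ends)
```

Worker order is preserved, so the merged lists line up with the order in which the batches were distributed.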

create_env_callable(**kwargs)[source]#

Return a callable that initializes the environment that is being vectorized.

Return type:: Callable[Ellipsis, plangym.core.PlanEnv]

setup()[source]#

Initialize the target environment with the parameters provided at __init__.

Return type:: None

step(action, state=None, dt=1, return_state=None)[source]#

Step the environment applying a given action from an arbitrary state.

If state is not provided, the signature matches the step method from OpenAI gym.

Parameters:

action (numpy.ndarray) – Array containing the action to be applied.
state (numpy.ndarray) – State to be set before stepping the environment.
dt (int) – Consecutive number of times to apply the given action.
return_state (bool | None) – Whether to return the state in the returned tuple. If None, step will return the state if state was passed as a parameter.

Returns:

If state is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
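The return convention can be sketched with a toy environment. CounterEnv is a hypothetical class written for this example; it mimics the documented signature but is not part of plangym, and its reward and termination rules are arbitrary.

```python
class CounterEnv:
    """Hypothetical toy environment whose state is a single integer."""

    def __init__(self):
        self._state = 0

    def step(self, action, state=None, dt=1, return_state=None):
        if state is not None:
            self._state = state  # start from the given state
        if return_state is None:
            # Default: return the state only if one was passed in.
            return_state = state is not None
        reward = 0.0
        for _ in range(dt):  # apply the same action dt times
            self._state += action
            reward += float(action)
        data = (self._state, reward, self._state >= 10, {})
        return (self._state, *data) if return_state else data


env = CounterEnv()
without_state = env.step(action=1)
with_state = env.step(action=2, state=5, dt=2)
print(len(without_state), len(with_state))  # 4 5
```

Passing return_state explicitly overrides the default, so a caller can request the state even when stepping from the current one.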

reset(return_state=True)[source]#

Reset the environment.

Reset the environment and return the first observation, or the first (state, obs, info) tuple.

Parameters:: return_state (bool) – If True, also return the initial state of the environment.
Returns:: Observation of the environment if return_state is False. Otherwise, return (state, obs) after reset.

get_state()[source]#

Recover the internal state of the simulation.

A state completely describes the Environment at a given moment.

Returns: State of the simulation.

set_state(state)[source]#

Set the internal state of the simulation.

Parameters:: state – Target state to be set in the environment.
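Together, get_state and set_state allow a planner to checkpoint a simulation and replay from it. The sketch below uses a hypothetical WalkEnv with a dictionary state; real environments may represent their state differently, for example as an array.

```python
class WalkEnv:
    """Hypothetical deterministic environment used to show checkpointing."""

    def __init__(self):
        self._position = 0
        self._steps = 0

    def get_state(self):
        # Build a fresh dict so later steps cannot mutate the checkpoint.
        return {"position": self._position, "steps": self._steps}

    def set_state(self, state):
        self._position = state["position"]
        self._steps = state["steps"]

    def step(self, action):
        self._position += action
        self._steps += 1
        return self._position


env = WalkEnv()
env.step(3)
checkpoint = env.get_state()
first = env.step(2)
env.set_state(checkpoint)   # rewind to the checkpoint
second = env.step(2)        # same action from the same state
print(first == second)
```

The round trip only reproduces the same result when the state captures everything that influences the next transition.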

render(mode='human')[source]#: Render the environment using OpenGL. This wraps the OpenAI render method.

get_image()[source]#

Return a numpy array containing the rendered view of the environment.

Square matrices are interpreted as greyscale images. Three-dimensional arrays are interpreted as RGB images with channels (Height, Width, RGB).

Return type:: numpy.ndarray

step_with_dt(action, dt=1)[source]#

Step the environment dt times with the same action.

Take dt simulation steps and make the environment evolve in multiples of self.frameskip for a total of dt * self.frameskip steps.

Parameters:

action (numpy.ndarray | int | float) – Chosen action applied to the environment.
dt (int) – Consecutive number of times that the action will be applied.

Returns:

If state is None, returns (observs, reward, terminal, info); otherwise returns (new_state, observs, reward, terminal, info).

Return type:

tuple
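The relationship between dt and frameskip can be sketched as two nested loops. FrameskipEnv and its _simulate_frame helper are hypothetical, and summing the per-frame rewards is an assumption of this example rather than documented behaviour.

```python
class FrameskipEnv:
    """Hypothetical environment that counts simulated frames."""

    def __init__(self, frameskip=1):
        self.frameskip = frameskip
        self.frames = 0

    def _simulate_frame(self, action):
        # Placeholder for one low-level simulation step.
        self.frames += 1
        return float(action)

    def step_with_dt(self, action, dt=1):
        total_reward = 0.0
        for _ in range(dt):                  # dt environment steps
            for _ in range(self.frameskip):  # frameskip frames per step
                total_reward += self._simulate_frame(action)
        info = {"n_frames": dt * self.frameskip}
        return self.frames, total_reward, False, info


env = FrameskipEnv(frameskip=4)
obs, reward, terminal, info = env.step_with_dt(action=1, dt=3)
print(info["n_frames"])  # 12
```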

sample_action()[source]#

Return a valid action that can be used to step the Environment.

Implementing this method is optional, and it’s only intended to make the testing process of the Environment easier.

step_batch(actions, states=None, dt=1, return_state=None)[source]#

Vectorized version of the step method.

It steps a vector of states and actions. The signature and behaviour are the same as step, but it takes a list of states, actions and dts as input.

Parameters:

actions (numpy.ndarray) – Iterable containing the different actions to be applied.
states (numpy.ndarray) – Iterable containing the different states to be set.
dt (numpy.ndarray | int) – int or array containing the frameskips that will be applied.
return_state (bool | None) – Whether to return the state in the returned tuple. If None, step will return the state if state was passed as a parameter.

Returns:

If states is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
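A batched step can be sketched as: split the inputs into chunks, step each chunk on a worker, then merge the results. The example below uses a thread pool for brevity, whereas the text above describes parallel processes; toy_step, step_chunk and this step_batch function are hypothetical and assume a non-empty batch with explicit states.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def toy_step(action, state, dt):
    """Hypothetical pure transition: (new_state, obs, reward, end, info)."""
    new_state = state + action * dt
    return new_state, new_state, float(action * dt), new_state >= 10, {}


def step_chunk(chunk):
    """Step every (action, state, dt) triple assigned to one worker."""
    return [toy_step(*item) for item in chunk]


def step_batch(actions, states, dt=1, n_workers=2):
    # Broadcast a scalar dt to one value per action.
    dts = [dt] * len(actions) if isinstance(dt, int) else list(dt)
    items = list(zip(actions, states, dts))
    size = max(1, math.ceil(len(items) / n_workers))
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map returns results in the order of the input chunks.
        results = list(pool.map(step_chunk, chunks))
    transitions = [t for chunk in results for t in chunk]
    # Transpose the list of transitions into one list per field.
    new_states, observs, rewards, ends, infos = map(list, zip(*transitions))
    return new_states, observs, rewards, ends, infos


new_states, observs, rewards, ends, infos = step_batch(
    actions=[1, 2, 3], states=[0, 5, 9], dt=2
)
print(new_states, ends)
```

Keeping the transition function free of shared mutable state is what makes it safe to run the chunks concurrently in this sketch.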

clone(**kwargs)[source]#

Return a copy of the environment.

Return type:: plangym.core.PlanEnv

abstract sync_states(state)[source]#

Synchronize the workers’ states with the state of self.gym_env.

Set all the states of the different workers of the internal BatchEnv to the same state as the internal Environment used to apply the non-vectorized steps.

Parameters:: state (None)

abstract make_transitions(actions, states, dt, return_state=None)[source]#

Implement the logic for stepping the environment in parallel.

Parameters:: return_state (bool | None)
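Because sync_states and make_transitions are abstract, a concrete subclass must implement both before it can be instantiated. The sketch below shows the pattern with hypothetical classes; BaseVectorized stands in for the abstract base, and the sequential make_transitions is only a placeholder for real parallel logic.

```python
from abc import ABC, abstractmethod


class BaseVectorized(ABC):
    """Hypothetical stand-in for an abstract vectorized base class."""

    @abstractmethod
    def sync_states(self, state):
        """Copy `state` to every worker."""

    @abstractmethod
    def make_transitions(self, actions, states, dt, return_state=None):
        """Step all (action, state) pairs."""


class SequentialVectorized(BaseVectorized):
    """Toy subclass that keeps worker states in a plain list."""

    def __init__(self, n_workers=2):
        self.worker_states = [0] * n_workers

    def sync_states(self, state):
        self.worker_states = [state] * len(self.worker_states)

    def make_transitions(self, actions, states, dt, return_state=None):
        # Sequential placeholder: a real subclass would dispatch to workers.
        return [s + a * dt for a, s in zip(actions, states)]


try:
    BaseVectorized()
    base_is_abstract = False
except TypeError:
    # Instantiating a class with unimplemented abstract methods fails.
    base_is_abstract = True

env = SequentialVectorized(n_workers=3)
env.sync_states(7)
print(base_is_abstract, env.worker_states)
```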