plangym.vectorization.env
Plangym API implementation.
Module Contents
Classes
VectorizedEnv: Base class that defines the API for working with vectorized environments.
- class plangym.vectorization.env.VectorizedEnv(env_class, name, frameskip=1, autoreset=True, delay_setup=False, n_workers=8, **kwargs)[source]
Bases: plangym.core.PlangymEnv, abc.ABC
Base class that defines the API for working with vectorized environments.
A vectorized environment steps several copies of the environment in parallel when calling step_batch. It creates a local copy of the environment that is the target of all the other methods of PlanEnv. In practice, a VectorizedEnv acts as a wrapper around an environment initialized with the provided parameters when calling __init__.
- Parameters
name (str) –
frameskip (int) –
autoreset (bool) –
delay_setup (bool) –
n_workers (int) –
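The idea behind the class can be illustrated with a toy wrapper that steps several copies of a trivial environment and batches the results. The names ToyEnv and ToyVectorized are hypothetical and not part of plangym; a real VectorizedEnv distributes the work across processes rather than stepping sequentially.

```python
class ToyEnv:
    """A trivial stand-in environment: the state is a running counter."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # Advance the counter and return a gym-style transition tuple.
        self.state += action
        return self.state, float(action), False, {}


class ToyVectorized:
    """Sequential sketch of the vectorized-environment idea."""

    def __init__(self, n_workers=4):
        # One environment copy per "worker".
        self.envs = [ToyEnv() for _ in range(n_workers)]

    def step_batch(self, actions):
        # Step every copy with its own action and regroup the
        # per-environment tuples into per-field batches.
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        observs, rewards, ends, infos = map(list, zip(*results))
        return observs, rewards, ends, infos


venv = ToyVectorized(n_workers=3)
observs, rewards, ends, infos = venv.step_batch([1, 2, 3])
print(observs)  # [1, 2, 3]
```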
- property n_workers(self)
Return the number of parallel processes that run step_batch.
- Return type
int
- property plan_env(self)
Environment that is wrapped by the current instance.
- property obs_shape(self)
Tuple containing the shape of the observations returned by the Environment.
- Return type
Tuple[int]
- property action_shape(self)
Tuple containing the shape of the actions applied to the Environment.
- Return type
Tuple[int]
- property action_space(self)
Return the action_space of the environment.
- Return type
gym.spaces.Space
- property observation_space(self)
Return the observation_space of the environment.
- Return type
gym.spaces.Space
- property gym_env(self)
Return the instance of the environment that is being wrapped by plangym.
- static split_similar_chunks(vector, n_chunks)[source]
Split an indexable object into similar chunks.
- Parameters
vector (Union[list, numpy.ndarray]) – Target indexable object to be split.
n_chunks (int) – Number of similar chunks.
- Returns
Generator that returns the chunks created after splitting the target object.
- Return type
Generator[Union[list, numpy.ndarray], None, None]
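The chunking behaviour can be sketched in pure Python. This is a simplified stand-in for the static method, not plangym's actual implementation:

```python
def split_similar_chunks(vector, n_chunks):
    """Yield n_chunks similarly sized slices of an indexable object."""
    # Ceiling division gives the size of each chunk.
    chunk_size = -(-len(vector) // n_chunks)
    for i in range(0, len(vector), chunk_size):
        yield vector[i:i + chunk_size]


print(list(split_similar_chunks(list(range(7)), 3)))
# [[0, 1, 2], [3, 4, 5], [6]]
```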
- classmethod batch_step_data(cls, actions, states, dt, batch_size)[source]
Make batches of step data to distribute across workers.
- static unpack_transitions(results, return_states)[source]
Aggregate the results of stepping across different workers.
- Parameters
results (list) –
return_states (bool) –
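The aggregation step can be sketched as follows. This is a simplified stand-in that ignores states and numpy stacking; the real method's behaviour may differ in detail:

```python
def unpack_transitions(results):
    """Flatten per-worker transition lists into per-field batches."""
    observs, rewards, ends, infos = [], [], [], []
    for worker_transitions in results:
        # Each worker contributes a list of (obs, reward, end, info) tuples.
        for obs, reward, end, info in worker_transitions:
            observs.append(obs)
            rewards.append(reward)
            ends.append(end)
            infos.append(info)
    return observs, rewards, ends, infos


results = [[(0, 1.0, False, {})], [(1, 2.0, True, {})]]
print(unpack_transitions(results))
# ([0, 1], [1.0, 2.0], [False, True], [{}, {}])
```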
- create_env_callable(self, **kwargs)[source]
Return a callable that initializes the environment that is being vectorized.
- Return type
Callable[Ellipsis, plangym.core.PlanEnv]
- setup(self)[source]
Initialize the target environment with the parameters provided at __init__.
- Return type
None
- step(self, action, state=None, dt=1, return_state=None)[source]
Step the environment applying a given action from an arbitrary state.
If state is not provided, the signature matches the step method from OpenAI gym.
- Parameters
action (numpy.ndarray) – Array containing the action to be applied.
state (numpy.ndarray) – State to be set before stepping the environment.
dt (int) – Consecutive number of times to apply the given action.
return_state (bool) – Whether to return the state in the returned tuple. If None, step will return the state if state was passed as a parameter.
- Returns
If state is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
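The return_state=None convention described above can be sketched with a small helper. The function name is hypothetical and for illustration only:

```python
def should_return_state(state, return_state):
    """Decide whether step should include the state in its return tuple."""
    # When return_state is None, the state is returned exactly when one was
    # passed in; an explicit True/False overrides that default.
    return (state is not None) if return_state is None else return_state


print(should_return_state(None, None))  # False
print(should_return_state("s", None))   # True
print(should_return_state("s", False))  # False
```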
- reset(self, return_state=True)[source]
Reset the environment and return the first observation, or the first (state, obs) tuple.
- Parameters
return_state (bool) – If True, also return the initial state of the env.
- Returns
Observation of the environment if return_state is False. Otherwise, return (state, obs) after reset.
- get_state(self)[source]
Recover the internal state of the simulation.
A state completely describes the Environment at a given moment.
- Returns
State of the simulation.
- set_state(self, state)[source]
Set the internal state of the simulation.
- Parameters
state – Target state to be set in the environment.
- render(self, mode='human')[source]
Render the environment using OpenGL. This wraps the OpenAI render method.
- get_image(self)[source]
Return a numpy array containing the rendered view of the environment.
Square matrices are interpreted as a greyscale image. Three-dimensional arrays are interpreted as RGB images with channels (Height, Width, RGB).
- Return type
numpy.ndarray
- step_with_dt(self, action, dt=1)[source]
Take dt simulation steps and make the environment evolve in multiples of self.frameskip, for a total of dt * self.frameskip steps.
- Parameters
action (Union[numpy.ndarray, int, float]) – Chosen action applied to the environment.
dt (int) – Consecutive number of times that the action will be applied.
- Returns
If state is None returns (observs, reward, terminal, info) else returns (new_state, observs, reward, terminal, info).
- Return type
tuple
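The relationship between dt and frameskip can be made concrete with a toy loop. This illustrates the assumed semantics (dt high-level steps, each advancing frameskip frames), not plangym's actual code:

```python
def total_simulation_steps(dt, frameskip):
    """Count the low-level frames advanced by one step_with_dt call."""
    steps = 0
    for _ in range(dt):            # dt high-level repetitions of the action
        for _ in range(frameskip): # each repetition advances frameskip frames
            steps += 1
    return steps


print(total_simulation_steps(dt=3, frameskip=4))  # 12
```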
- sample_action(self)[source]
Return a valid action that can be used to step the Environment.
Implementing this method is optional, and it’s only intended to make the testing process of the Environment easier.
- step_batch(self, actions, states=None, dt=1, return_state=None)[source]
Vectorized version of the step method. It allows stepping a vector of states and actions. The signature and behaviour are the same as step, but it takes lists of states, actions, and dts as input.
- Parameters
actions (numpy.ndarray) – Iterable containing the different actions to be applied.
states (numpy.ndarray) – Iterable containing the different states to be set.
dt (Union[numpy.ndarray, int]) – int or array containing the frameskips that will be applied.
return_state (bool) – Whether to return the state in the returned tuple. If None, step will return the state if state was passed as a parameter.
- Returns
If states is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
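Since dt may be either an int or an array, a first step in a batched call is to normalize it to one value per action. This sketch shows the assumed broadcasting behaviour; the helper name is hypothetical:

```python
def normalize_dt(dt, n_actions):
    """Broadcast a scalar dt to one frameskip value per action."""
    if isinstance(dt, int):
        # A single int applies to every action in the batch.
        return [dt] * n_actions
    dts = list(dt)
    assert len(dts) == n_actions, "expected one dt per action"
    return dts


print(normalize_dt(2, 3))          # [2, 2, 2]
print(normalize_dt([1, 2, 3], 3))  # [1, 2, 3]
```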