plangym.vectorization

Module that contains the code implementing vectorization for PlangymEnv.step_batch.

Package Contents

Classes

ParallelEnv

Allow any environment to be stepped in parallel when step_batch is called.

RayEnv

Use ray for taking steps in parallel when calling step_batch.

class plangym.vectorization.ParallelEnv(env_class, name, frameskip=1, autoreset=True, delay_setup=False, n_workers=8, blocking=False, **kwargs)[source]

Bases: plangym.vectorization.env.VectorizedEnv

Allow any environment to be stepped in parallel when step_batch is called.

It creates a local instance of the target environment to call all other methods.

Example:

>>> from plangym.videogames import AtariEnv
>>> env = ParallelEnv(env_class=AtariEnv,
...                   name="MsPacman-v0",
...                   clone_seeds=True,
...                   autoreset=True,
...                   blocking=False)
>>>
>>> state, obs = env.reset()
>>>
>>> states = [state.copy() for _ in range(10)]
>>> actions = [env.sample_action() for _ in range(10)]
>>>
>>> data = env.step_batch(states=states, actions=actions)
>>> new_states, observs, rewards, ends, infos = data
Parameters
  • name (str) –

  • frameskip (int) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • n_workers (int) –

  • blocking (bool) –

property blocking(self)

If True, the steps are performed sequentially.

Return type

bool

setup(self)[source]

Run environment initialization and create the subprocesses for stepping in parallel.

clone(self, **kwargs)[source]

Return a copy of the environment.

Return type

plangym.core.PlanEnv

make_transitions(self, actions, states=None, dt=1, return_state=None)[source]

Vectorized version of the step method.

It allows stepping a vector of states and actions. The signature and behaviour are the same as step, but it takes lists of states, actions, and dts as input.

Parameters
  • actions (numpy.ndarray) – Iterable containing the different actions to be applied.

  • states (numpy.ndarray) – Iterable containing the different states to be set.

  • dt (Union[numpy.ndarray, int]) – int or array containing the frameskips that will be applied.

  • return_state (bool) – Whether to return the state in the returned tuple. If None, step will return the state if state was passed as a parameter.

Returns

If states is None, returns (observs, rewards, ends, infos); otherwise returns (new_states, observs, rewards, ends, infos).
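The return convention above can be sketched with a toy environment (the MiniEnv class and step_batch function below are hypothetical illustrations, not plangym's implementation): the tuple gains a leading new_states element only when states were passed in.

```python
class MiniEnv:
    """Toy environment: the state is an integer incremented by the action."""

    def __init__(self):
        self.state = 0

    def step(self, action, state=None, dt=1):
        if state is not None:  # set the state before stepping, as plangym does
            self.state = state
        for _ in range(dt):  # dt repeats the action, mimicking frameskip
            self.state += action
        obs, reward, end, info = self.state, float(action), False, {}
        return self.state, obs, reward, end, info


def step_batch(env, actions, states=None, dt=1):
    """Sequential sketch of the vectorized step return convention."""
    dts = dt if hasattr(dt, "__len__") else [dt] * len(actions)
    results = []
    for i, action in enumerate(actions):
        state = None if states is None else states[i]
        results.append(env.step(action, state=state, dt=dts[i]))
    new_states, observs, rewards, ends, infos = map(list, zip(*results))
    if states is None:  # no states passed -> no states returned
        return observs, rewards, ends, infos
    return new_states, observs, rewards, ends, infos
```

Calling `step_batch(env, actions, states=states)` returns a 5-tuple, while `step_batch(env, actions)` returns a 4-tuple, matching the behaviour described above.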

sync_states(self, state)[source]

Synchronize all the copies of the wrapped environment.

Set all the states of the different workers of the internal BatchEnv to the same state as the internal Environment used to apply the non-vectorized steps.

Parameters

state (None) –
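The synchronization described above can be sketched as follows (the Worker and SyncedEnv classes are hypothetical stand-ins for the subprocess workers, not plangym's implementation):

```python
class Worker:
    """Stand-in for a subprocess worker holding a copy of the environment."""

    def __init__(self):
        self.state = None

    def set_state(self, state):
        self.state = state


class SyncedEnv:
    """Holds a main environment state and a pool of worker copies."""

    def __init__(self, n_workers=4):
        self.state = 0
        self.workers = [Worker() for _ in range(n_workers)]

    def sync_states(self, state=None):
        # Broadcast one state to every worker so all copies agree.
        target = self.state if state is None else state
        for worker in self.workers:
            worker.set_state(target)
```

After `sync_states` runs, every worker copy holds the same state as the main environment.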

close(self)[source]

Close the environment and the spawned processes.

Return type

None

class plangym.vectorization.RayEnv(env_class, name, frameskip=1, autoreset=True, delay_setup=False, n_workers=8, **kwargs)[source]

Bases: plangym.vectorization.env.VectorizedEnv

Use ray for taking steps in parallel when calling step_batch.

Parameters
  • name (str) –

  • frameskip (int) –

  • autoreset (bool) –

  • delay_setup (bool) –

  • n_workers (int) –
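One way to picture how a batch is shared among n_workers remote actors is a round-robin split (the distribute helper below is a hypothetical illustration; the actual scheduling inside RayEnv may differ):

```python
def distribute(actions, n_workers):
    """Assign each action to a worker in round-robin order."""
    batches = [[] for _ in range(n_workers)]
    for i, action in enumerate(actions):
        batches[i % n_workers].append(action)
    return batches
```

Each sub-list is then the workload of one remote actor; the results are gathered back in order to rebuild the batch.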

property workers(self)

Remote actors exposing copies of the environment.

Return type

List[RemoteEnv]

setup(self)[source]

Run environment initialization and create the subprocesses for stepping in parallel.

make_transitions(self, actions, states=None, dt=1, return_state=None)[source]

Implement the logic for stepping the environment in parallel.

Parameters
  • dt (Union[numpy.ndarray, int]) –

  • return_state (bool) –

reset(self, return_state=True)[source]

Restart the environment.

Parameters

return_state (bool) –

Return type

Union[numpy.ndarray, tuple]
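The return shape of reset mirrors the `state, obs = env.reset()` call in the ParallelEnv example above; a toy sketch (ToyEnv is a hypothetical illustration, not plangym's implementation):

```python
class ToyEnv:
    """Toy environment illustrating the reset return convention."""

    def reset(self, return_state=True):
        state, obs = [0, 0], [0.0]
        # With return_state=True a (state, obs) tuple is returned;
        # otherwise only the observation is returned.
        return (state, obs) if return_state else obs
```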

sync_states(self, state)[source]

Synchronize all the copies of the wrapped environment.

Set all the states of the different workers of the internal BatchEnv to the same state as the internal Environment used to apply the non-vectorized steps.

Parameters

state (None) –

Return type

None