src.plangym.core
================

.. py:module:: src.plangym.core

.. autoapi-nested-parse::

   Plangym API implementation.


Attributes
----------

.. autoapisummary::

   src.plangym.core.wrap_callable


Classes
-------

.. autoapisummary::

   src.plangym.core.PlanEnv
   src.plangym.core.PlangymEnv


Module Contents
---------------

.. py:data:: wrap_callable

.. py:class:: PlanEnv(name, frameskip = 1, autoreset = True, delay_setup = False, return_image = False)

   Bases: :py:obj:`abc.ABC`


   Inherit from this class to adapt environments to different problems.

   Base class that establishes all needed methods and blueprints to work with
   Gym environments.


   .. py:attribute:: STATE_IS_ARRAY
      :value: True


   .. py:attribute:: OBS_IS_ARRAY
      :value: True


   .. py:attribute:: SINGLETON
      :value: False


   .. py:attribute:: _name


   .. py:attribute:: frameskip


   .. py:attribute:: autoreset


   .. py:attribute:: delay_setup


   .. py:attribute:: _return_image


   .. py:attribute:: _n_step
      :value: 0


   .. py:attribute:: _obs_step
      :value: None


   .. py:attribute:: _reward_step
      :value: 0


   .. py:attribute:: _terminal_step
      :value: False


   .. py:attribute:: _truncated_step
      :value: False


   .. py:attribute:: _info_step


   .. py:attribute:: _action_step
      :value: None


   .. py:attribute:: _dt_step
      :value: None


   .. py:attribute:: _state_step
      :value: None


   .. py:attribute:: _return_state_step
      :value: None


   .. py:method:: __del__()

      Tear down the Environment when it is no longer needed.


   .. py:property:: name
      :type: str

      Return the name of the environment.


   .. py:property:: obs_shape
      :type: tuple[int]

      :abstractmethod:

      Tuple containing the shape of the observations returned by the Environment.


   .. py:property:: action_shape
      :type: tuple[int]

      :abstractmethod:

      Tuple containing the shape of the actions applied to the Environment.


   .. py:property:: unwrapped
      :type: PlanEnv

      Completely unwrap this Environment.

      Returns
          plangym.Environment: The base non-wrapped plangym.Environment instance


   .. py:property:: return_image
      :type: bool

      Return `return_image` flag.

      If ``True``, add an "rgb" key to the `info` dictionary returned by `step`
      that contains an RGB representation of the environment state.


   .. py:property:: img_shape
      :type: tuple[int, Ellipsis] | None

      Return the shape of the image returned by the environment.

      If the environment does not return an image, it will return None. This also applies
      to environments that throw an error when trying to get the image
      (like when running in headless machines without a virtual display).


   .. py:method:: get_image()
      :abstractmethod:


      Return a numpy array containing the rendered view of the environment.

      Square matrices are interpreted as a grayscale image. Three-dimensional arrays
      are interpreted as RGB images with channels (Height, Width, RGB).


   .. py:method:: step(action, state = None, dt = 1, return_state = None)

      Step the environment applying the supplied action.

      Optionally set the state to the supplied state before stepping it (the
      method prepares the environment in the given state, dismissing the current
      state, and applies the action afterwards).

      Take ``dt`` simulation steps and make the environment evolve in multiples
      of ``self.frameskip`` for a total of ``dt`` * ``self.frameskip`` steps.

      In addition, the method allows the user to prepare the returned object,
      adding additional information and custom pre-processings via ``self.process_step``
      and ``self.get_step_tuple`` methods.

      :param action: Chosen action applied to the environment.
      :param state: Set the environment to the given state before stepping it.
      :param dt: Consecutive number of times that the action will be applied.
      :param return_state: Whether to return the state in the returned tuple.
                           If None, `step` will return the state if `state` was passed as a parameter.

      :returns: if state is None returns ``(observs, reward, terminal, info)``
                else returns ``(new_state, observs, reward, terminal, info)``
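
      The stepping semantics described above can be illustrated with a minimal,
      self-contained sketch. ``ToyEnv`` is a hypothetical stand-in that does not
      import plangym; it only mirrors the documented behaviour (optional state
      loading, ``dt`` * ``frameskip`` applications of the action, and a state that is
      returned only when requested or when ``state`` was passed). Summing the
      rewards over the inner steps is an assumption of this sketch, and ``dt`` is
      assumed to be at least 1.

      .. code-block:: python

         class ToyEnv:
             """Hypothetical counter environment; not part of plangym."""

             def __init__(self, frameskip=1):
                 self.frameskip = frameskip
                 self.position = 0

             def get_state(self):
                 return self.position

             def set_state(self, state):
                 self.position = state

             def apply_action(self, action):
                 # One simulation step: move the counter by ``action``.
                 self.position += action
                 return self.position, float(action), False, {}

             def step(self, action, state=None, dt=1, return_state=None):
                 if state is not None:
                     # Dismiss the current state and load the supplied one.
                     self.set_state(state)
                 if return_state is None:
                     # Default: return the state only if one was passed in.
                     return_state = state is not None
                 total_reward = 0.0
                 for _ in range(dt * self.frameskip):
                     obs, reward, terminal, info = self.apply_action(action)
                     total_reward += reward  # assumption: rewards are summed
                     if terminal:
                         break
                 data = (obs, total_reward, terminal, info)
                 return (self.get_state(), *data) if return_state else data


         env = ToyEnv(frameskip=2)
         without_state = env.step(action=1, dt=3)           # 3 * 2 = 6 inner steps
         with_state = env.step(action=1, state=10, dt=3)    # starts again from 10

      In this sketch ``without_state`` has four elements and ``with_state`` has
      five, with the new state in first position.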


   .. py:method:: reset(return_state = True)

      Restart the environment.

      :param return_state: If ``True``, it will return the state of the environment.

      :returns: ``(state, obs)`` if ``return_state`` is ``True`` else return ``obs``.


   .. py:method:: step_batch(actions, states = None, dt = 1, return_state = True)

      Allow stepping a vector of states and actions.

      Vectorized version of the `step` method. The signature and behaviour are
      the same as `step`, but taking a list of states, actions and dts as input.

      :param actions: Iterable containing the different actions to be applied.
      :param states: Iterable containing the different states to be set.
      :param dt: int or array containing the number of consecutive times the action will be applied to each state.
                 If array, the different values are distributed among the multiple environments
                 (contrary to ``self.frameskip``, which is a common value for any instance).
      :param return_state: Whether to return the states in the returned tuple.
                           If None, `step_batch` will return the states if `states` was passed as a parameter.

      :returns: If return_state is `True`, the method returns
                `(new_states, observs, rewards, ends, infos)`.
                If return_state is `False`, the method returns `(observs, rewards, ends, infos)`.
                If return_state is `None`, the returned object depends on the states parameter.
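
      A rough sketch of how a vectorized step can be built on top of a
      single-step function is shown here. Both ``step_batch`` (as a free function)
      and ``toy_step`` are hypothetical and do not use plangym; the sketch only
      shows how an int or per-state ``dt`` can be broadcast and how the per-step
      tuples are transposed into per-field lists.

      .. code-block:: python

         def step_batch(step_fn, actions, states=None, dt=1, return_state=True):
             """Apply step_fn to every (action, state, dt) triple."""
             n = len(actions)
             states = [None] * n if states is None else list(states)
             # An int dt is shared by all steps; an iterable gives one dt per step.
             dts = [dt] * n if isinstance(dt, int) else list(dt)
             results = [
                 step_fn(action, state=state, dt=d, return_state=return_state)
                 for action, state, d in zip(actions, states, dts)
             ]
             # Transpose: one list per returned field instead of one tuple per step.
             return tuple(list(column) for column in zip(*results))


         def toy_step(action, state=None, dt=1, return_state=None):
             """Hypothetical stand-in for ``step``: moves a counter by action * dt."""
             start = 0 if state is None else state
             if return_state is None:
                 return_state = state is not None
             new_state = start + action * dt
             data = (new_state, float(action * dt), False, {})
             return (new_state, *data) if return_state else data


         new_states, observs, rewards, ends, infos = step_batch(
             toy_step, actions=[1, 2, 3], states=[0, 10, 20], dt=[1, 2, 3]
         )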


   .. py:method:: clone(**kwargs)

      Return a copy of the environment.


   .. py:method:: sample_action()

      Return a valid action that can be used to step the Environment.

      Implementing this method is optional, and it's only intended to make the
      testing process of the Environment easier.


   .. py:method:: step_with_dt(action, dt = 1)

      Step the environment applying the supplied action dt times.

      Take ``dt`` simulation steps and make the environment evolve in multiples
      of ``self.frameskip`` for a total of ``dt`` * ``self.frameskip`` steps.

      The method performs any post-processing to the data after applying the action
      to the environment via ``self.process_apply_action``.

      This method neither computes nor returns any state.

      :param action: Chosen action applied to the environment.
      :param dt: Consecutive number of times that the action will be applied.

      :returns: Tuple containing ``(observs, reward, terminal, info)``.


   .. py:method:: run_autoreset(step_data)

      Reset the environment automatically if needed.


   .. py:method:: get_step_tuple(obs, reward, terminal, truncated, info)

      Prepare the tuple that step returns.

      This is a post-processing step to have fine-grained control over what data
      the current step is returning.

      By default it determines whether to:

      - Return the state in the tuple (necessary information to save or load the game).
      - Add the "rgb" key to the `info` dictionary, containing an RGB
        representation of the environment.

      :param obs: Observation of the environment.
      :param reward: Reward signal.
      :param terminal: Boolean indicating if the environment is finished.
      :param info: Dictionary containing additional information about the environment.
      :param truncated: Boolean indicating if the environment was truncated.

      :returns: Tuple containing the environment data after calling `step`.
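
      The two decisions listed above can be sketched as a small helper.
      ``build_step_tuple`` and its keyword arguments are hypothetical names chosen
      for illustration; the actual method reads these values from the instance
      rather than receiving them as parameters.

      .. code-block:: python

         def build_step_tuple(obs, reward, terminal, truncated, info, *,
                              state=None, return_state=False,
                              image=None, return_image=False):
             """Assemble the step output (hypothetical helper, not plangym code)."""
             info = dict(info)  # copy so the caller's dictionary is not mutated
             if return_image:
                 info["rgb"] = image
             data = (obs, reward, terminal, truncated, info)
             # Prepend the state only when the caller asked for it.
             return (state, *data) if return_state else data


         plain = build_step_tuple(1, 0.5, False, False, {})
         full = build_step_tuple(
             1, 0.5, False, False, {},
             state="s0", return_state=True,
             image=[[0, 0], [0, 0]], return_image=True,
         )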


   .. py:method:: setup()

      Run environment initialization.

      Placing all the code that makes the environment impossible to serialize
      in this method allows the environment to be dispatched to different workers
      and initialized once it has been copied to the target process.
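
      The idea can be sketched with a hypothetical ``LazyEnv`` class that does
      not use plangym. A ``threading.Lock`` stands in for a simulator handle that
      cannot be serialized, and ``copy.deepcopy`` is used here as a stand-in for
      sending the environment to a worker, since it fails on this kind of object
      in the same way.

      .. code-block:: python

         import copy
         import threading


         class LazyEnv:
             """Hypothetical environment; not part of plangym."""

             def __init__(self, name, delay_setup=False):
                 self.name = name
                 self._sim = None  # created in setup()
                 if not delay_setup:
                     self.setup()

             def setup(self):
                 # Lock objects cannot be pickled or deep-copied.
                 self._sim = threading.Lock()


         env = LazyEnv("toy", delay_setup=True)
         worker_env = copy.deepcopy(env)  # works: nothing uncopyable exists yet
         worker_env.setup()               # initialize on the "worker" side

         try:
             copy.deepcopy(worker_env)    # fails once the lock has been created
             copy_failed = False
         except TypeError:
             copy_failed = True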


   .. py:method:: begin_step(action=None, dt=None, state=None, return_state = None)

      Perform setup of step variables before starting `step_with_dt`.


   .. py:method:: process_apply_action(obs, reward, terminal, truncated, info)

      Perform any post-processing to the data returned by `apply_action`.

      :param obs: Observation of the environment.
      :param reward: Reward signal.
      :param terminal: Boolean indicating if the environment is finished.
      :param info: Dictionary containing additional information about the environment.
      :param truncated: Boolean indicating if the environment was truncated.

      :returns: Tuple containing the processed data.


   .. py:method:: process_step(obs, reward, terminal, truncated, info)

      Prepare the returned info dictionary.

      This is a post-processing step to have fine-grained control over what data
      the info dictionary contains.

      :param obs: Observation of the environment.
      :param reward: Reward signal.
      :param terminal: Boolean indicating if the environment is finished.
      :param info: Dictionary containing additional information about the environment.
      :param truncated: Boolean indicating if the environment was truncated.

      :returns: Tuple containing the environment data after calling `step`.


   .. py:method:: close()

      Tear down the current environment.


   .. py:method:: process_obs(obs, **kwargs)

      Perform optional computation for computing the observation returned by step.


   .. py:method:: process_reward(reward, **kwargs)

      Perform optional computation for computing the reward returned by step.


   .. py:method:: process_terminal(terminal, **kwargs)

      Perform optional computation for computing the terminal flag returned by step.


   .. py:method:: process_info(info, **kwargs)

      Perform optional computation for computing the info dictionary returned by step.


   .. py:method:: apply_action(action)
      :abstractmethod:


      Evolve the environment for one time step applying the provided action.


   .. py:method:: apply_reset(**kwargs)
      :abstractmethod:


      Perform the resetting operation on the environment.


   .. py:method:: get_state()
      :abstractmethod:


      Recover the internal state of the simulation.

      A state must completely describe the Environment at a given moment.


   .. py:method:: set_state(state)
      :abstractmethod:


      Set the internal state of the simulation. Overwrite the current state with the given argument.

      :param state: Target state to be set in the environment.

      :returns: None
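
      A minimal sketch of why ``get_state`` and ``set_state`` matter for planning:
      a saved state lets the caller try several actions from the same starting
      point. ``GridEnv`` is a hypothetical class that does not use plangym, and
      copying the state on both save and load is a design choice of this sketch
      so that later steps cannot alter a stored checkpoint.

      .. code-block:: python

         import copy


         class GridEnv:
             """Hypothetical environment; not part of plangym."""

             def __init__(self):
                 self._sim = {"x": 0, "steps": 0}

             def get_state(self):
                 # The state must completely describe the simulation.
                 return copy.deepcopy(self._sim)

             def set_state(self, state):
                 self._sim = copy.deepcopy(state)

             def apply_action(self, action):
                 self._sim["x"] += action
                 self._sim["steps"] += 1
                 return self._sim["x"]


         env = GridEnv()
         env.apply_action(1)
         checkpoint = env.get_state()

         left = env.apply_action(-1)   # explore one branch
         env.set_state(checkpoint)     # rewind to the checkpoint
         right = env.apply_action(1)   # explore another branch from the same point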


.. py:class:: PlangymEnv(name, frameskip = 1, autoreset = True, wrappers = None, delay_setup = False, remove_time_limit = True, render_mode = 'rgb_array', episodic_life=False, obs_type=None, return_image=False, **kwargs)

   Bases: :py:obj:`PlanEnv`


   Base class for implementing OpenAI ``gym`` environments in ``plangym``.


   .. py:attribute:: AVAILABLE_RENDER_MODES


   .. py:attribute:: AVAILABLE_OBS_TYPES


   .. py:attribute:: DEFAULT_OBS_TYPE
      :value: 'coords'


   .. py:property:: render_mode
      :type: None | str

      Return how the game will be rendered. Values: None | human | rgb_array.


   .. py:attribute:: _render_mode


   .. py:attribute:: _gym_env
      :value: None


   .. py:attribute:: _gym_env_kwargs


   .. py:attribute:: _remove_time_limit


   .. py:attribute:: _wrappers


   .. py:attribute:: _obs_space
      :value: None


   .. py:attribute:: _action_space
      :value: None


   .. py:attribute:: _obs_type


   .. py:method:: __str__()

      Pretty print the environment.


   .. py:method:: __repr__()

      Pretty print the environment.


   .. py:property:: gym_env

      Return the instance of the environment that is being wrapped by plangym.


   .. py:property:: obs_shape
      :type: tuple[int, Ellipsis] | None

      Tuple containing the shape of the *observations* returned by the Environment.


   .. py:property:: obs_type
      :type: str

      Return the *type* of observation returned by the environment.


   .. py:property:: observation_space
      :type: gymnasium.spaces.Space

      Return the *observation_space* of the environment.


   .. py:property:: action_shape
      :type: tuple[int, Ellipsis]

      Tuple containing the shape of the *actions* applied to the Environment.


   .. py:property:: action_space
      :type: gymnasium.spaces.Space

      Return the *action_space* of the environment.


   .. py:property:: reward_range

      Return the *reward_range* of the environment.


   .. py:property:: metadata

      Return the *metadata* of the environment.


   .. py:property:: remove_time_limit
      :type: bool

      Return True if the Environment can only be stepped for a limited number of times.


   .. py:method:: setup()

      Initialize the target :class:`gym.Env` instance.

      The method calls ``self.init_gym_env`` to initialize the :class:`gym.Env` instance.
      It removes time limits if needed and applies wrappers introduced by the user.


   .. py:method:: init_spaces()

      Initialize the action_space and observation_space of the environment.


   .. py:method:: _init_action_space()


   .. py:method:: _init_obs_space_rgb()


   .. py:method:: _init_obs_space_grayscale()


   .. py:method:: _init_obs_space_coords()


   .. py:method:: get_image()

      Return a numpy array containing the rendered view of the environment.

      Square matrices are interpreted as a grayscale image. Three-dimensional arrays
      are interpreted as RGB images with channels (Height, Width, RGB).


   .. py:method:: apply_reset()

      Restart the environment.

      Returns
          ``(obs, info)``. If ``return_image`` is ``True``, the info dictionary contains an
          ``'rgb'`` key with the corresponding image.


   .. py:method:: apply_action(action)

      Evolve the environment for one time step applying the provided action.

      Accumulate rewards and calculate terminal flag after stepping the environment.


   .. py:method:: sample_action()

      Return a randomly chosen valid action that can be used to step the environment.


   .. py:method:: clone(**kwargs)

      Return a copy of the environment.


   .. py:method:: close()

      Close the underlying :class:`gym.Env`.


   .. py:method:: init_gym_env()

      Initialize the :class:`gym.Env` instance that the current class is wrapping.


   .. py:method:: seed(seed=None)

      Seed the underlying :class:`gym.Env`.


   .. py:method:: apply_wrappers(wrappers)

      Wrap the underlying OpenAI gym environment.


   .. py:method:: wrap(wrapper, *args, **kwargs)

      Apply a single OpenAI gym wrapper to the environment.


   .. py:method:: render()

      Render the environment using OpenGL. This wraps the OpenAI render method.


   .. py:method:: process_obs(obs, **kwargs)

      Perform optional computation for computing the observation returned by step.

      This is a post-processing step to have fine-grained control over the returned
      observation.


   .. py:method:: get_coords_obs(obs, **kwargs)

      Calculate the observation returned by `step` when obs_type == "coords".


   .. py:method:: get_rgb_obs(obs, **kwargs)

      Calculate the observation returned by `step` when obs_type == "rgb".


   .. py:method:: get_grayscale_obs(obs, **kwargs)

      Calculate the observation returned by `step` when obs_type == "grayscale".
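
      One way the three ``get_*_obs`` methods could be selected from ``obs_type``
      is sketched here. ``ObsTypeEnv`` is a hypothetical class that does not use
      plangym; the name-based dispatch, the hard-coded frame and the
      channel-average grayscale conversion are all assumptions made for
      illustration.

      .. code-block:: python

         class ObsTypeEnv:
             """Hypothetical environment; not part of plangym."""

             AVAILABLE_OBS_TYPES = {"coords", "rgb", "grayscale"}

             def __init__(self, obs_type="coords"):
                 if obs_type not in self.AVAILABLE_OBS_TYPES:
                     raise ValueError(f"Unknown obs_type: {obs_type!r}")
                 self.obs_type = obs_type
                 # Stand-in for a rendered frame: a 1x2 image of RGB pixels.
                 self._frame = [[(30, 60, 90), (0, 0, 0)]]

             def process_obs(self, obs, **kwargs):
                 # Pick the handler whose name matches the configured obs_type.
                 handler = getattr(self, f"get_{self.obs_type}_obs")
                 return handler(obs, **kwargs)

             def get_coords_obs(self, obs, **kwargs):
                 return obs

             def get_rgb_obs(self, obs, **kwargs):
                 return self._frame

             def get_grayscale_obs(self, obs, **kwargs):
                 # Plain channel average; the real conversion may differ.
                 return [[sum(pixel) // 3 for pixel in row] for row in self._frame]


         raw_obs = [0.5, -1.0]
         coords = ObsTypeEnv("coords").process_obs(raw_obs)
         gray = ObsTypeEnv("grayscale").process_obs(raw_obs)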