Environment creation

To create a new environment with a robotic platform and a task, the user must create a class in which the robot and the task are defined. To simplify the steps required to create an env (environment) following the OpenAI Gym documentation, FRobs_RL provides a class that already implements most of the functions required by Gym. The class is called RobotBasicEnv and can be inherited by any environment regardless of the robot or task. The main functions already implemented in RobotBasicEnv are:

step: The function executed in each step of the RL loop.
reset: The function called when the environment must be reset, whether because the task succeeded or because the time limit has been reached.
close: The function executed when the environment is closed. It mainly makes sure that all ROS nodes and the Gazebo simulator are properly closed.

To create a new environment, the user must inherit from RobotBasicEnv and implement the following functions:

_send_action: The function used to send the commands to the robot.
_get_observation: The function to execute when observations from the environment are needed.
_get_reward: The function that calculates and returns the reward based on the action of the agent.
_check_if_done: The function to check if the robot successfully finished the task or reached the goal.
_check_subs_and_pubs_connection (Optional): The function that checks if the ROS subscribers and publishers are connected and properly receiving or sending messages.
_set_episode_init_params (Optional): The function to set ROS or Gazebo initial parameters for the episode, e.g.: the initial position of the robot, the location of obstacles, etc.
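
The hooks listed above can be sketched as follows. To keep the sketch runnable without ROS or Gazebo, RobotBasicEnv is replaced by a small stand-in class, and the way its step and reset call the hooks is an assumption, not the library's implementation. ReachTargetEnv, its 1-D position, and its reward are hypothetical.

```python
class RobotBasicEnv:
    """Stand-in for the library class so the sketch runs without ROS/Gazebo.
    How `step` and `reset` call the hooks here is an assumption."""

    def __init__(self):
        self.info = {}

    def step(self, action):
        # Assumed order: act, then observe, score, and check termination.
        self._send_action(action)
        observation = self._get_observation()
        reward = self._get_reward()
        done = self._check_if_done()
        return observation, reward, done, self.info

    def reset(self):
        self._set_episode_init_params()
        return self._get_observation()

    def _set_episode_init_params(self):
        pass  # optional hook: nothing to do by default


class ReachTargetEnv(RobotBasicEnv):
    """Hypothetical task: move a 1-D robot until it reaches `target`."""

    def __init__(self, target=3.0, tolerance=0.01):
        super().__init__()
        self.target = target
        self.tolerance = tolerance
        self.position = 0.0

    def _set_episode_init_params(self):
        # Put the robot back at the origin at the start of each episode.
        self.position = 0.0

    def _send_action(self, action):
        # A real environment would publish a ROS command here.
        self.position += float(action)

    def _get_observation(self):
        return [self.position, self.target - self.position]

    def _get_reward(self):
        # Negative distance to the target: closer is better.
        return -abs(self.target - self.position)

    def _check_if_done(self):
        return abs(self.target - self.position) < self.tolerance
```

Only the underscore-prefixed hooks are written by the user in this sketch; step and reset are inherited unchanged.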

To create a new environment, we recommend that the user create two different classes, a CustomRobotEnv and a CustomTaskEnv, so that the main functions are separated as follows:

All processes related to the robot are located in the CustomRobotEnv, such as loading the URDF model, spawning the controllers, spawning the robot in the simulator, etc.
The CustomTaskEnv inherits from the CustomRobotEnv and is where all processes directly related to the task are implemented, such as how actions are sent to the robot, how observations are obtained, the reward function, etc.

This architecture has the advantage that a CustomRobotEnv can be reused across many tasks, reducing the amount of code needed to create a new environment.
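
As a rough illustration of this layering, the sketch below puts robot settings in one class and lets two tasks reuse it. RobotBasicEnv is replaced by a stand-in that only records its constructor arguments (an assumption made so the code runs without ROS), and the package name, URDF file, controller name, task classes, and reward values are all hypothetical.

```python
class RobotBasicEnv:
    """Stand-in (assumption): records constructor arguments instead of
    launching ROS and Gazebo."""

    def __init__(self, **kwargs):
        self.init_kwargs = kwargs


class CustomRobotEnv(RobotBasicEnv):
    """Robot layer: model loading, controllers, and spawning."""

    def __init__(self):
        super().__init__(
            spawn_robot=True,
            pkg_name="my_robot_description",             # hypothetical package
            urdf_file="my_robot.urdf.xacro",             # hypothetical file
            controller_list=["joint_state_controller"],  # hypothetical controller
        )


class ReachTaskEnv(CustomRobotEnv):
    """Task layer: only task logic lives here (hypothetical reach task)."""

    def _get_reward(self):
        return 1.0  # placeholder reward


class PushTaskEnv(CustomRobotEnv):
    """A second hypothetical task reusing the same robot layer."""

    def _get_reward(self):
        return 0.5  # placeholder reward
```

Both task classes inherit the same robot configuration, so adding a task means writing only its task-specific functions.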

The next step is a guide on how to use the templates of the CustomRobotEnv and CustomTaskEnv classes included in the FRobs_RL library.

Note

Although the separation into CustomRobotEnv and CustomTaskEnv is recommended, the user can structure the environment in any way as long as it inherits from RobotBasicEnv.

class robot_BasicEnv.RobotBasicEnv(launch_gazebo=False, gazebo_init_paused=True, gazebo_use_gui=True, gazebo_recording=False, gazebo_freq=100, world_path=None, world_pkg=None, world_filename=None, gazebo_max_freq=None, gazebo_timestep=None, spawn_robot=False, model_name_in_gazebo='robot', namespace='/robot', pkg_name=None, urdf_file=None, urdf_folder='/urdf', controller_file=None, controller_list=None, urdf_xacro_args=None, rob_state_publisher_max_freq=None, rob_st_term=False, model_pos_x=0.0, model_pos_y=0.0, model_pos_z=0.0, model_ori_x=0.0, model_ori_y=0.0, model_ori_z=0.0, model_ori_w=0.0, reset_controllers=False, reset_mode=1, step_mode=1, num_gazebo_steps=1)

Basic environment for all the robot environments in the frobs_rl library. To use a custom world, there are two options: 1) set the path directly to the world file (world_path), or 2) set the package name and world filename (world_pkg and world_filename).
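
The two ways of selecting a custom world can be sketched with a small helper that builds the matching keyword arguments. The helper world_kwargs is hypothetical and not part of the library. Whether RobotBasicEnv itself rejects a mix of both options is not stated here, so the checks below are an assumption. The example path, package, and filename are placeholders.

```python
def world_kwargs(world_path=None, world_pkg=None, world_filename=None):
    """Hypothetical helper: return the world-related keyword arguments,
    accepting either a direct path or a package name plus filename."""
    if world_path is not None:
        if world_pkg is not None or world_filename is not None:
            raise ValueError("Use world_path or world_pkg/world_filename, not both")
        return {"world_path": world_path}
    if world_pkg is not None and world_filename is not None:
        return {"world_pkg": world_pkg, "world_filename": world_filename}
    if world_pkg is None and world_filename is None:
        return {}  # no custom world requested
    raise ValueError("world_pkg and world_filename must be given together")


# Option 1: direct path to the world file (placeholder path).
by_path = world_kwargs(world_path="/tmp/my_world.world")

# Option 2: package name plus filename (placeholder names).
by_pkg = world_kwargs(world_pkg="my_worlds", world_filename="my_world.world")
```

The resulting dictionary could then be unpacked into the constructor call with the ** operator alongside the other parameters.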

Parameters

launch_gazebo (bool) – If True, launch Gazebo at the start of the env.
gazebo_init_paused (bool) – If True, Gazebo is initialized in a paused state.
gazebo_use_gui (bool) – If True, Gazebo is launched with a GUI (through gzclient).
gazebo_recording (bool) – If True, Gazebo is launched with a recording of the GUI (through gzclient).
gazebo_freq (int) – The publish rate of gazebo in Hz.
world_path (str) – If using a custom world then the path to the world.
world_pkg (str) – If using a custom world then the package name of the world.
world_filename (str) – If using a custom world then the filename of the world.
gazebo_max_freq (float) – The maximum update rate of Gazebo as a real-time factor: 1 is real time, 10 is 10 times real time.
gazebo_timestep (float) – The timestep of gazebo in seconds.
spawn_robot (bool) – If True, the robot is spawned in the environment.
model_name_in_gazebo (str) – The name of the model in gazebo.
namespace (str) – The namespace of the robot.
pkg_name (str) – The package name where the robot model is located.
urdf_file (str) – The path to the urdf file of the robot.
urdf_folder (str) – The path to the folder where the urdf files are located. Default is “/urdf”.
urdf_xacro_args (str) – The arguments to be passed to the xacro parser.
controller_file (str) – The path to the controllers YAML file of the robot.
controller_list (list of str) – The list of controllers to be launched.
rob_state_publisher_max_freq (int) – The maximum frequency of the robot state publisher.
rob_st_term (bool) – If True, the robot state publisher is launched in a new terminal.
model_pos_x – The x position of the robot in the world.
model_pos_y – The y position of the robot in the world.
model_pos_z – The z position of the robot in the world.
model_ori_x – The x orientation of the robot in the world.
model_ori_y – The y orientation of the robot in the world.
model_ori_z – The z orientation of the robot in the world.
model_ori_w – The w orientation of the robot in the world.
reset_controllers (bool) – If True, the controllers are reset at the start of each episode.
reset_mode – If 1, reset Gazebo with “reset_world” (does not reset time). If 2, reset Gazebo with “reset_simulation” (resets time).
step_mode – If 1, step Gazebo using the “pause_physics” and “unpause_physics” services. If 2, step Gazebo using the “step_simulation” command.
num_gazebo_steps – If using step_mode 2, the number of steps to be taken.
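
The four model_ori_* values appear to form a quaternion (x, y, z, w). Under that assumption, the sketch below converts a heading angle about the vertical axis into those four parameters and combines them with a spawn position. The helper yaw_to_model_ori and the numeric values are hypothetical.

```python
import math


def yaw_to_model_ori(yaw):
    """Hypothetical helper: quaternion (x, y, z, w) for a rotation of
    `yaw` radians about the z axis, keyed by the constructor parameter names."""
    half = yaw / 2.0
    return {
        "model_ori_x": 0.0,
        "model_ori_y": 0.0,
        "model_ori_z": math.sin(half),
        "model_ori_w": math.cos(half),
    }


# Spawn pose: 1 m along x, slightly above the ground, rotated 90 degrees.
spawn_kwargs = {
    "model_pos_x": 1.0,
    "model_pos_y": 0.0,
    "model_pos_z": 0.1,
    **yaw_to_model_ori(math.pi / 2),
}
```

A yaw of zero gives (0, 0, 0, 1), the identity rotation.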

_check_if_done()

Function to check if the episode is done.

If the episode has a success condition, mark the episode as successful with: self.info['is_success'] = 1.0
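
A minimal sketch of such a check is shown below, assuming the function returns a boolean and that the environment keeps an info dictionary. The class DoneCheckSketch, its attributes, and the explicit 0.0 flag on timeout are hypothetical choices made for illustration.

```python
class DoneCheckSketch:
    """Hypothetical stand-in exposing only what `_check_if_done` needs."""

    def __init__(self, goal_tolerance=0.05, max_steps=100):
        self.info = {}
        self.goal_tolerance = goal_tolerance
        self.max_steps = max_steps
        self.distance_to_goal = 1.0  # would be updated from observations
        self.step_count = 0          # would be incremented on every step

    def _check_if_done(self):
        if self.distance_to_goal <= self.goal_tolerance:
            self.info['is_success'] = 1.0  # success flag described above
            return True
        if self.step_count >= self.max_steps:
            self.info['is_success'] = 0.0  # assumption: explicit failure flag
            return True
        return False
```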

_check_subs_and_pubs_connection(): Function to check if the Gazebo and ROS connections are ready.

_get_observation(): Function to get the observation from the environment.

_get_reward(): Function to get the reward from the environment.

_reset_gazebo(): Function to reset the Gazebo simulation.

_send_action(action): Function to send an action to the robot.

_set_episode_init_params(): Function to set some parameters, like the position of the robot, at the beginning of each episode.

close(): Function to close the environment when training is done.

reset(): Function to reset the environment after an episode is done.

step(action): Function to send an action to the robot and get the observation and reward.
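
The three public functions are typically used together in a rollout loop, sketched below. The sketch assumes step returns the classic Gym tuple (observation, reward, done, info), which is not confirmed here. CountdownEnv is a hypothetical stand-in used so the loop runs without ROS or Gazebo, and run_episode is an illustrative helper, not part of the library.

```python
def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return the accumulated reward."""
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        observation, reward, done, info = env.step(policy(observation))
        total_reward += reward
        if done:
            break
    return total_reward


class CountdownEnv:
    """Hypothetical stand-in with the same three public methods."""

    def __init__(self, start=3):
        self.start = start
        self.remaining = start
        self.closed = False

    def reset(self):
        self.remaining = self.start
        return self.remaining

    def step(self, action):
        self.remaining -= action
        done = self.remaining <= 0
        return self.remaining, -1.0, done, {}

    def close(self):
        self.closed = True


env = CountdownEnv(start=3)
try:
    episode_reward = run_episode(env, policy=lambda obs: 1)
finally:
    env.close()  # always release the simulator, even if the rollout fails
```

Calling close in a finally block reflects the note above that close shuts down the ROS nodes and Gazebo, which should happen even when training stops with an error.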