flatland.envs.rail_env module#

Definition of the RailEnv environment.

class flatland.envs.rail_env.RailEnv(width, height, rail_generator: RailGenerator = None, line_generator: LineGenerator = None, number_of_agents=2, obs_builder_object: ~flatland.core.env_observation_builder.ObservationBuilder = <flatland.envs.observations.GlobalObsForRailEnv object>, malfunction_generator_and_process_data=None, malfunction_generator=None, remove_agents_at_target=True, random_seed=None, record_steps=False, timetable_generator=<function timetable_generator>)[source]#

Bases: Environment

RailEnv environment class.

RailEnv is an environment inspired by a (simplified version of) a rail network, in which agents (trains) have to navigate to their target locations in the shortest time possible, while at the same time cooperating to avoid bottlenecks.

The valid actions in the environment are:

  • 0: do nothing (continue moving or stay still)

  • 1: turn left at switch and move to the next cell; if the agent was not moving, movement is started

  • 2: move to the next cell in front of the agent; if the agent was not moving, movement is started

  • 3: turn right at switch and move to the next cell; if the agent was not moving, movement is started

  • 4: stop moving

Moving forward in a dead-end cell makes the agent turn 180 degrees and step to the cell it came from.
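The action codes above can be mirrored with a small IntEnum for readability. This is an illustrative sketch only; the library exposes its own enum as RailEnvActions (referenced in the step() signature below), whose exact definition should be checked against the source:

```python
from enum import IntEnum

class RailEnvActions(IntEnum):
    """Illustrative mirror of the action codes documented above."""
    DO_NOTHING = 0    # continue moving or stay still
    MOVE_LEFT = 1     # turn left at a switch and move; starts movement if stopped
    MOVE_FORWARD = 2  # move to the next cell in front; starts movement if stopped
    MOVE_RIGHT = 3    # turn right at a switch and move; starts movement if stopped
    STOP_MOVING = 4   # stop moving

# An action dict maps each agent handle to one of these codes:
action_dict = {0: RailEnvActions.MOVE_FORWARD, 1: RailEnvActions.STOP_MOVING}
```

Such a dict is what step() expects as input, keyed by the agent handles returned by get_agent_handles().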

The actions of the agents are executed in order of their handle to prevent deadlocks and to allow them to learn relative priorities.

Stochastic malfunctioning of trains: trains in RailEnv can malfunction if they are halted too often (either by their own choice or because an invalid action or cell was selected).

Every time an agent stops, it has a certain probability of malfunctioning. Train malfunctions follow a Poisson process with a fixed rate. To keep complexity manageable, not all trains are affected by malfunctions during an episode.
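As a rough sketch of the Poisson-process behavior described above (the actual parameters are hard-coded in the environment; the rate used here is an assumed, made-up value):

```python
import math
import random

def malfunction_occurs(rng: random.Random, rate: float, dt: float = 1.0) -> bool:
    """For a Poisson process with the given rate, return True if at least one
    malfunction event falls inside a time step of length dt.
    P(at least one event in dt) = 1 - exp(-rate * dt)."""
    return rng.random() < 1.0 - math.exp(-rate * dt)

rng = random.Random(42)  # seeded for reproducibility
rate = 0.05              # assumed malfunction rate, NOT the library's actual value
hits = sum(malfunction_occurs(rng, rate) for _ in range(10_000))
# With rate 0.05, roughly 1 - exp(-0.05), i.e. about 4.9% of steps, trigger a malfunction.
```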

TODO: currently, the parameters that control the stochasticity of the environment are hard-coded in __init__(). For Round 2, they will be passed to the constructor as arguments, to allow for more flexibility.

action_required(is_cell_entry)[source]#

Check if an agent needs to provide an action

Parameters#

agent : RailEnvAgent

Agent we want to check

Returns#

True: Agent needs to provide an action

False: Agent cannot provide an action

add_agent(agent)[source]#

Add static info for a single agent. Returns the index of the new agent.

clear_rewards_dict()[source]#

Reset the rewards dictionary

close()[source]#

Closes any renderer window.

end_of_episode_update(have_all_agents_ended)[source]#

Updates made when the episode ends.

Parameters#

have_all_agents_ended

Indicates whether all agents have reached the done state

generate_state_transition_signals(agent, preprocessed_action, movement_allowed)[source]#

Generate the state transition signals used in the state machine

get_agent_handles() List[int][source]#

Returns a list of agents’ handles to be used as keys in the step() function.

get_info_dict()[source]#

Returns a dictionary of infos for all agents.

dict_keys:

  • action_required - whether the agent needs to provide an action

  • malfunction - malfunction counter; a value > 0 means the train is in malfunction

  • speed - speed of the train

  • state - state from the train's state machine

get_num_agents() int[source]#
get_valid_directions_on_grid(row: int, col: int) List[int][source]#

Returns directions in which the agent can move

handle_done_state(agent)[source]#

Any updates to the agent to be made in the Done state

initialize_renderer(mode, gl, agent_render_variant, show_debug, clear_debug_text, show, screen_height, screen_width)[source]#
preprocess_action(action, agent)[source]#
Preprocess the provided action
  • Change to DO_NOTHING if illegal action

  • Block all actions when in waiting state

  • Check whether MOVE_LEFT/MOVE_RIGHT is valid at the current position; otherwise fall back to MOVE_FORWARD
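The preprocessing rules above can be sketched in pure Python. This is an illustration of the documented behavior, not the library's implementation; the valid_actions set is a hypothetical stand-in for the environment's transition checks:

```python
from enum import IntEnum

class RailEnvActions(IntEnum):
    DO_NOTHING = 0
    MOVE_LEFT = 1
    MOVE_FORWARD = 2
    MOVE_RIGHT = 3
    STOP_MOVING = 4

def preprocess_action(action, in_waiting_state, valid_actions):
    """Sketch of the documented rules: block everything while waiting,
    map illegal codes to DO_NOTHING, and fall back from MOVE_LEFT/MOVE_RIGHT
    to MOVE_FORWARD when the turn is not available at the current cell."""
    if in_waiting_state:
        return RailEnvActions.DO_NOTHING      # block all actions while waiting
    if action not in list(RailEnvActions):
        return RailEnvActions.DO_NOTHING      # illegal action code
    if action in (RailEnvActions.MOVE_LEFT, RailEnvActions.MOVE_RIGHT) \
            and action not in valid_actions:
        return RailEnvActions.MOVE_FORWARD    # turn unavailable: try forward instead
    return RailEnvActions(action)

# e.g. a straight cell only allows moving forward, so a left turn degrades:
assert preprocess_action(RailEnvActions.MOVE_LEFT, False,
                         {RailEnvActions.MOVE_FORWARD}) == RailEnvActions.MOVE_FORWARD
```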

record_timestep(dActions)[source]#

Record the positions and orientations of all agents in memory, in the cur_episode

render(mode='rgb_array', gl='PGL', agent_render_variant=AgentRenderVariant.ONE_STEP_BEHIND, show_debug=False, clear_debug_text=True, show=False, screen_height=600, screen_width=800, show_observations=False, show_predictions=False, show_rowcols=False, return_image=True)[source]#

Provides the option to render the environment’s behavior as an image or to a window.

Parameters#

mode

'rgb_array' to return an image; any other mode opens a window

Returns#

Image if mode is rgb_array, opens a window otherwise

reset(regenerate_rail, regenerate_schedule, activate_agents, random_seed)[source]#

The method resets the rail environment

Parameters#

regenerate_rail : bool, optional

regenerate the rails

regenerate_schedule : bool, optional

regenerate the schedule and the static agents

random_seed : int, optional

random seed for environment

Returns#

observation_dict: Dict

Dictionary with an observation for each agent

info_dict: Dict

Dictionary with agent-specific information
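The reset()/step() return shapes documented here drive the usual agent loop. The sketch below runs that loop against a trivial stand-in (StubRailEnv is a made-up stub mimicking the documented dictionary shapes, not the real environment):

```python
class StubRailEnv:
    """Made-up stand-in mimicking the documented return shapes of
    RailEnv.reset() and RailEnv.step(); not the real environment."""
    def __init__(self, number_of_agents=2, max_steps=3):
        self.handles = list(range(number_of_agents))
        self.max_steps = max_steps
        self._t = 0

    def get_agent_handles(self):
        return self.handles

    def reset(self):
        self._t = 0
        obs = {h: None for h in self.handles}
        info = {"action_required": {h: True for h in self.handles}}
        return obs, info

    def step(self, action_dict):
        self._t += 1
        done = self._t >= self.max_steps
        obs = {h: None for h in self.handles}
        rewards = {h: -1 for h in self.handles}   # per-step penalty
        dones = {h: done for h in self.handles}
        dones["__all__"] = done                   # episode-level done flag
        info = {"action_required": {h: not done for h in self.handles}}
        return obs, rewards, dones, info

env = StubRailEnv()
obs, info = env.reset()
total = 0
while True:
    # choose MOVE_FORWARD (2) for every agent that must act, DO_NOTHING (0) otherwise
    actions = {h: 2 if info["action_required"][h] else 0
               for h in env.get_agent_handles()}
    obs, rewards, dones, info = env.step(actions)
    total += sum(rewards.values())
    if dones["__all__"]:
        break
```

With the real RailEnv the loop body is the same; only the construction and the observation contents differ.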

reset_agents()[source]#

Reset the agents to their starting positions

save(filename)[source]#
step(action_dict: Dict[int, RailEnvActions])[source]#

Performs one environment step and updates the rewards for the agents.

update_renderer(mode, show, show_observations, show_predictions, show_rowcols, return_image)[source]#

This method updates the renderer.

Parameters#

mode

'rgb_array' to return an image; any other mode updates the window

Returns#

Image if mode is rgb_array, None otherwise

update_step_rewards(i_agent)[source]#

Update the rewards dict for agent id i_agent at each timestep