flatland.envs.rail_env module#

Definition of the RailEnv environment.

class flatland.envs.rail_env.RailEnv(width, height, rail_generator: RailGenerator = None, line_generator: LineGenerator = None, number_of_agents=2, obs_builder_object: ~flatland.core.env_observation_builder.ObservationBuilder = <flatland.envs.observations.GlobalObsForRailEnv object>, malfunction_generator_and_process_data=None, malfunction_generator=None, remove_agents_at_target=True, random_seed=None, record_steps=False, timetable_generator=<function timetable_generator>)[source]#

Bases: Environment

RailEnv environment class.

RailEnv is an environment inspired by a (simplified version of) a rail network, in which agents (trains) have to navigate to their target locations in the shortest time possible, while at the same time cooperating to avoid bottlenecks.

The valid actions in the environment are:

  • 0: do nothing (continue moving or stay still)

  • 1: turn left at switch and move to the next cell; if the agent was not moving, movement is started

  • 2: move to the next cell in front of the agent; if the agent was not moving, movement is started

  • 3: turn right at switch and move to the next cell; if the agent was not moving, movement is started

  • 4: stop moving

Moving forward in a dead-end cell makes the agent turn 180 degrees and step to the cell it came from.
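The action codes above can be mirrored with a small IntEnum for readability. This is an illustrative sketch only; the library exposes its own enum as RailEnvActions (referenced in the step() signature below), whose exact definition should be checked against the source:

```python
from enum import IntEnum

class RailEnvActions(IntEnum):
    """Illustrative mirror of the action codes documented above."""
    DO_NOTHING = 0    # continue moving or stay still
    MOVE_LEFT = 1     # turn left at a switch and move; starts movement if stopped
    MOVE_FORWARD = 2  # move to the next cell in front; starts movement if stopped
    MOVE_RIGHT = 3    # turn right at a switch and move; starts movement if stopped
    STOP_MOVING = 4   # stop moving

# An action dict maps each agent handle to one of these codes:
action_dict = {0: RailEnvActions.MOVE_FORWARD, 1: RailEnvActions.STOP_MOVING}
```

Such a dict is what step() expects as input, keyed by the agent handles returned by get_agent_handles().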

The actions of the agents are executed in order of their handle to prevent deadlocks and to allow them to learn relative priorities.

Stochastic malfunctioning of trains: trains in RailEnv can malfunction if they are halted too often (either by their own choice or because an invalid action or cell was selected).

Every time an agent stops, it has a certain probability of malfunctioning. Train malfunctions follow a Poisson process with a fixed rate. To keep complexity manageable, not all trains are affected by malfunctions during an episode.
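As a rough sketch of the Poisson-process behavior described above (the actual parameters are hard-coded in the environment; the rate used here is an assumed, made-up value):

```python
import math
import random

def malfunction_occurs(rng: random.Random, rate: float, dt: float = 1.0) -> bool:
    """For a Poisson process with the given rate, return True if at least one
    malfunction event falls inside a time step of length dt.
    P(at least one event in dt) = 1 - exp(-rate * dt)."""
    return rng.random() < 1.0 - math.exp(-rate * dt)

rng = random.Random(42)  # seeded for reproducibility
rate = 0.05              # assumed malfunction rate, NOT the library's actual value
hits = sum(malfunction_occurs(rng, rate) for _ in range(10_000))
# With rate 0.05, roughly 1 - exp(-0.05), i.e. about 4.9% of steps, trigger a malfunction.
```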

TODO: currently, the parameters that control the stochasticity of the environment are hard-coded in __init__(). For Round 2, they will be passed to the constructor as arguments, to allow for more flexibility.

action_required(is_cell_entry)[source]#

Check if an agent needs to provide an action

Parameters#

agent : RailEnvAgent

Agent we want to check

Returns#

True: Agent needs to provide an action

False: Agent cannot provide an action

add_agent(agent)[source]#

Add static info for a single agent. Returns the index of the new agent.

clear_rewards_dict()[source]#

Reset the rewards dictionary

close()[source]#

Closes any renderer window.

end_of_episode_update(have_all_agents_ended)[source]#

Updates made when the episode ends.

Parameters#

have_all_agents_ended

Indicates whether all agents have reached the done state

generate_state_transition_signals(agent, preprocessed_action, movement_allowed)[source]#

Generate the state transition signals used in the state machine

get_agent_handles() List[int][source]#

Returns a list of agents’ handles to be used as keys in the step() function.

get_info_dict()[source]#

Returns a dictionary of infos for all agents.

dict_keys:

  • action_required - whether the agent needs to provide an action

  • malfunction - malfunction counter; a value > 0 means the train is in malfunction

  • speed - speed of the train

  • state - state from the train's state machine

get_num_agents() int[source]#
get_valid_directions_on_grid(row: int, col: int) List[int][source]#

Returns directions in which the agent can move

handle_done_state(agent)[source]#

Any updates to the agent to be made in the Done state

initialize_renderer(mode, gl, agent_render_variant, show_debug, clear_debug_text, show, screen_height, screen_width)[source]#
preprocess_action(action, agent)[source]#
Preprocess the provided action
  • Change to DO_NOTHING if illegal action

  • Block all actions when in waiting state

  • Check whether MOVE_LEFT/MOVE_RIGHT is valid at the current position; otherwise fall back to MOVE_FORWARD
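The preprocessing rules above can be sketched in pure Python. This is an illustration of the documented behavior, not the library's implementation; the valid_actions set is a hypothetical stand-in for the environment's transition checks:

```python
from enum import IntEnum

class RailEnvActions(IntEnum):
    DO_NOTHING = 0
    MOVE_LEFT = 1
    MOVE_FORWARD = 2
    MOVE_RIGHT = 3
    STOP_MOVING = 4

def preprocess_action(action, in_waiting_state, valid_actions):
    """Sketch of the documented rules: block everything while waiting,
    map illegal codes to DO_NOTHING, and fall back from MOVE_LEFT/MOVE_RIGHT
    to MOVE_FORWARD when the turn is not available at the current cell."""
    if in_waiting_state:
        return RailEnvActions.DO_NOTHING      # block all actions while waiting
    if action not in list(RailEnvActions):
        return RailEnvActions.DO_NOTHING      # illegal action code
    if action in (RailEnvActions.MOVE_LEFT, RailEnvActions.MOVE_RIGHT) \
            and action not in valid_actions:
        return RailEnvActions.MOVE_FORWARD    # turn unavailable: try forward instead
    return RailEnvActions(action)

# e.g. a straight cell only allows moving forward, so a left turn degrades:
assert preprocess_action(RailEnvActions.MOVE_LEFT, False,
                         {RailEnvActions.MOVE_FORWARD}) == RailEnvActions.MOVE_FORWARD
```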

record_timestep(dActions)[source]#

Record the positions and orientations of all agents in memory, in the cur_episode

render(mode='rgb_array', gl='PGL', agent_render_variant=AgentRenderVariant.ONE_STEP_BEHIND, show_debug=False, clear_debug_text=True, show=False, screen_height=600, screen_width=800, show_observations=False, show_predictions=False, show_rowcols=False, return_image=True)[source]#

Provides the option to render the environment’s behavior as an image or to a window.

Parameters#

mode

'rgb_array' to return an image; any other mode opens a window

Returns#

Image if mode is rgb_array, opens a window otherwise

reset(regenerate_rail, regenerate_schedule, activate_agents, random_seed)[source]#

The method resets the rail environment

Parameters#

regenerate_rail : bool, optional

regenerate the rails

regenerate_schedule : bool, optional

regenerate the schedule and the static agents

random_seed : int, optional

random seed for environment

Returns#

observation_dict: Dict

Dictionary with an observation for each agent

info_dict: Dict

Dictionary with agent-specific information
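The reset()/step() return shapes documented here drive the usual agent loop. The sketch below runs that loop against a trivial stand-in (StubRailEnv is a made-up stub mimicking the documented dictionary shapes, not the real environment):

```python
class StubRailEnv:
    """Made-up stand-in mimicking the documented return shapes of
    RailEnv.reset() and RailEnv.step(); not the real environment."""
    def __init__(self, number_of_agents=2, max_steps=3):
        self.handles = list(range(number_of_agents))
        self.max_steps = max_steps
        self._t = 0

    def get_agent_handles(self):
        return self.handles

    def reset(self):
        self._t = 0
        obs = {h: None for h in self.handles}
        info = {"action_required": {h: True for h in self.handles}}
        return obs, info

    def step(self, action_dict):
        self._t += 1
        done = self._t >= self.max_steps
        obs = {h: None for h in self.handles}
        rewards = {h: -1 for h in self.handles}   # per-step penalty
        dones = {h: done for h in self.handles}
        dones["__all__"] = done                   # episode-level done flag
        info = {"action_required": {h: not done for h in self.handles}}
        return obs, rewards, dones, info

env = StubRailEnv()
obs, info = env.reset()
total = 0
while True:
    # choose MOVE_FORWARD (2) for every agent that must act, DO_NOTHING (0) otherwise
    actions = {h: 2 if info["action_required"][h] else 0
               for h in env.get_agent_handles()}
    obs, rewards, dones, info = env.step(actions)
    total += sum(rewards.values())
    if dones["__all__"]:
        break
```

With the real RailEnv the loop body is the same; only the construction and the observation contents differ.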

reset_agents()[source]#

Reset the agents to their starting positions

save(filename)[source]#
step(action_dict: Dict[int, RailEnvActions])[source]#

Performs one environment step and updates the rewards for the agents.

update_renderer(mode, show, show_observations, show_predictions, show_rowcols, return_image)[source]#

This method updates the renderer.

Parameters#

mode

'rgb_array' to return an image; any other mode updates the window

Returns#

Image if mode is rgb_array, None otherwise

update_step_rewards(i_agent)[source]#

Update the rewards dict for agent id i_agent at each timestep