flatland.envs.rail_env module
Definition of the RailEnv environment.
- class flatland.envs.rail_env.RailEnv(width, height, rail_generator: RailGenerator = None, line_generator: LineGenerator = None, number_of_agents=2, obs_builder_object: flatland.core.env_observation_builder.ObservationBuilder = <flatland.envs.observations.GlobalObsForRailEnv object>, malfunction_generator_and_process_data=None, malfunction_generator=None, remove_agents_at_target=True, random_seed=None, record_steps=False, timetable_generator=<function timetable_generator>)
Bases: Environment
RailEnv environment class.
RailEnv is an environment inspired by (a simplified version of) a rail network, in which agents (trains) have to navigate to their target locations in the shortest time possible while cooperating to avoid bottlenecks.
The valid actions in the environment are:
0: do nothing (continue moving or stay still)
1: turn left at switch and move to the next cell; if the agent was not moving, movement is started
2: move to the next cell in front of the agent; if the agent was not moving, movement is started
3: turn right at switch and move to the next cell; if the agent was not moving, movement is started
4: stop moving
Moving forward in a dead-end cell makes the agent turn 180 degrees and step to the cell it came from.
The actions of the agents are executed in order of their handle to prevent deadlocks and to allow them to learn relative priorities.
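The action codes and the handle-ordered execution described above can be sketched as follows. This is an illustrative stand-in, not the library's code: the `Action` enum is defined locally to mirror the five codes (the name `STOP_MOVING` is an assumption), and `ordered_actions` is a hypothetical helper.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Action(IntEnum):
    """Local stand-in mirroring the five action codes listed above."""
    DO_NOTHING = 0
    MOVE_LEFT = 1
    MOVE_FORWARD = 2
    MOVE_RIGHT = 3
    STOP_MOVING = 4  # assumed name for action 4 ("stop moving")


def ordered_actions(action_dict: Dict[int, int]) -> List[Tuple[int, Action]]:
    """Return (handle, action) pairs sorted by agent handle.

    Action(...) raises ValueError for codes outside 0..4.
    """
    return [(handle, Action(action_dict[handle])) for handle in sorted(action_dict)]


# Actions keyed by agent handle, given in arbitrary order.
actions = {2: 4, 0: 2, 1: 1}
for handle, action in ordered_actions(actions):
    print(handle, action.name)
```

Sorting by handle gives every step the same deterministic processing order, which is the property the paragraph above relies on.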
Stochastic malfunctioning of trains: Trains in RailEnv can malfunction if they are halted too often (either by their own choice or because an invalid action or cell is selected).
Every time an agent stops, it has a certain probability of malfunctioning. Malfunctions of trains follow a Poisson process with a certain rate. To keep complexity manageable, not all trains will be affected by malfunctions during an episode.
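To make the Poisson-process statement concrete, here is a minimal sketch of how a rate can be turned into a per-step malfunction probability. It assumes one Bernoulli trial per time step with probability 1 - exp(-rate), which is the chance of at least one Poisson event in a unit interval; the function names are hypothetical, and this is not presented as the library's implementation.

```python
import math
import random
from typing import List


def malfunction_probability(rate: float) -> float:
    """Probability of at least one Poisson event within one time step."""
    if rate <= 0:
        return 0.0
    return 1.0 - math.exp(-rate)


def simulate_malfunction_steps(rate: float, n_steps: int, seed: int = 0) -> List[int]:
    """Return the step indices at which a malfunction starts (assumed model)."""
    rng = random.Random(seed)  # seeded for reproducibility
    p = malfunction_probability(rate)
    return [t for t in range(n_steps) if rng.random() < p]


# With a small rate, malfunctions are rare events over an episode.
print(simulate_malfunction_steps(rate=0.02, n_steps=200, seed=42))
```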
TODO: currently, the parameters that control the stochasticity of the environment are hard-coded in init(). For Round 2, they will be passed to the constructor as arguments, to allow for more flexibility.
- action_required(is_cell_entry)
Check if an agent needs to provide an action.
Parameters
- agent: RailEnvAgent
Agent we want to check
Returns
True: Agent needs to provide an action
False: Agent cannot provide an action
- end_of_episode_update(have_all_agents_ended)
Updates made when the episode ends.
Parameters
- have_all_agents_ended
Indicates whether all agents have reached the done state
- generate_state_transition_signals(agent, preprocessed_action, movement_allowed)
Generate the state transition signals used in the state machine.
- get_agent_handles() -> List[int]
Returns a list of agents’ handles to be used as keys in the step() function.
- get_info_dict()
Returns a dictionary of infos for all agents.
dict_keys:
action_required -
malfunction - Counter value for malfunction; a value > 0 means the train is in malfunction
speed - Speed of the train
state - State from the train’s state machine
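As an illustration of how such an info dictionary might be consumed, the sketch below selects the agents that need an action and are not in malfunction. It assumes each key maps agent handles to values; the sample data is hand-written, the state strings are placeholders, and `agents_needing_action` is a hypothetical helper.

```python
from typing import Dict, List


def agents_needing_action(info: Dict[str, Dict[int, object]]) -> List[int]:
    """Handles that must supply an action and are not in malfunction."""
    return sorted(
        handle
        for handle, required in info["action_required"].items()
        if required and info["malfunction"].get(handle, 0) == 0
    )


# Hand-written sample, not output captured from a real environment.
sample_info = {
    "action_required": {0: True, 1: False, 2: True},
    "malfunction": {0: 0, 1: 0, 2: 3},  # agent 2 is in malfunction
    "speed": {0: 1.0, 1: 0.5, 2: 1.0},
    "state": {0: "MOVING", 1: "MOVING", 2: "MALFUNCTION"},  # placeholder labels
}
print(agents_needing_action(sample_info))  # prints [0]
```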
- get_valid_directions_on_grid(row: int, col: int) -> List[int]
Returns directions in which the agent can move
- initialize_renderer(mode, gl, agent_render_variant, show_debug, clear_debug_text, show, screen_height, screen_width)
- preprocess_action(action, agent)
Preprocess the provided action:
- Change to DO_NOTHING if the action is illegal
- Block all actions when in the waiting state
- Check MOVE_LEFT/MOVE_RIGHT actions on the current position, else try MOVE_FORWARD
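The three preprocessing rules above could look roughly like this. It is a sketch under stated assumptions, not the method's actual body: the `Action` enum is a local stand-in, and `is_waiting` and `possible_moves` are hypothetical inputs standing in for whatever the environment derives from the agent and its current cell.

```python
from enum import IntEnum
from typing import Set


class Action(IntEnum):
    """Local stand-in for the environment's action codes."""
    DO_NOTHING = 0
    MOVE_LEFT = 1
    MOVE_FORWARD = 2
    MOVE_RIGHT = 3
    STOP_MOVING = 4  # assumed name


def preprocess(action: object, is_waiting: bool, possible_moves: Set[Action]) -> Action:
    """Apply the three rules in order and return the effective action."""
    # Rule 1: anything that is not a known action code becomes DO_NOTHING.
    try:
        act = Action(action)
    except ValueError:
        act = Action.DO_NOTHING
    # Rule 2: a waiting agent cannot act at all.
    if is_waiting:
        return Action.DO_NOTHING
    # Rule 3: a turn that is not available at the current position
    # falls back to MOVE_FORWARD.
    if act in (Action.MOVE_LEFT, Action.MOVE_RIGHT) and act not in possible_moves:
        return Action.MOVE_FORWARD
    return act


print(preprocess(1, False, {Action.MOVE_FORWARD}).name)  # prints MOVE_FORWARD
```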
- record_timestep(dActions)
Record the positions and orientations of all agents in memory, in the cur_episode
- render(mode='rgb_array', gl='PGL', agent_render_variant=AgentRenderVariant.ONE_STEP_BEHIND, show_debug=False, clear_debug_text=True, show=False, screen_height=600, screen_width=800, show_observations=False, show_predictions=False, show_rowcols=False, return_image=True)
Provides the option to render the environment’s behavior as an image or to a window.
Parameters
- mode
Returns
Image if mode is rgb_array, opens a window otherwise
- reset(regenerate_rail, regenerate_schedule, activate_agents, random_seed)
Resets the rail environment.
Parameters
- regenerate_rail: bool, optional
regenerate the rails
- regenerate_schedule: bool, optional
regenerate the schedule and the static agents
- random_seed: int, optional
random seed for environment
Returns
- observation_dict: Dict
Dictionary with an observation for each agent
- info_dict: Dict
Dictionary with agent-specific information
- step(action_dict: Dict[int, RailEnvActions])
Updates rewards for the agents at a step.
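A typical way to drive reset(), get_agent_handles(), and step() together is an episode loop like the sketch below. Two points are assumptions, since this page does not document them: that step() returns an (observations, rewards, dones, info) tuple, and that the dones dictionary carries an "__all__" key. `run_episode` and `policy` are hypothetical names.

```python
from typing import Callable, Dict


def run_episode(env, policy: Callable[[int, object], int],
                max_steps: int, seed: int = 0) -> Dict[int, float]:
    """Run one episode and return the summed reward per agent handle."""
    # reset() returns the observation dict and the info dict (see above).
    obs, info = env.reset(regenerate_rail=True, regenerate_schedule=True,
                          random_seed=seed)
    totals = {handle: 0.0 for handle in env.get_agent_handles()}
    for _ in range(max_steps):
        # One action per agent, keyed by handle.
        action_dict = {h: policy(h, obs.get(h)) for h in env.get_agent_handles()}
        # Assumed return shape: (observations, rewards, dones, info).
        obs, rewards, dones, info = env.step(action_dict)
        for h in totals:
            totals[h] += rewards.get(h, 0.0)
        # Assumed convention: "__all__" is True once every agent is done.
        if dones.get("__all__", False):
            break
    return totals
```

With a constructed RailEnv instance `env`, a call such as `run_episode(env, lambda handle, obs: 2, max_steps=100)` would send action 2 (move forward) for every agent on every step.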