flatland.envs.rail_env module#
Definition of the RailEnv environment.
- class flatland.envs.rail_env.AbstractRailEnv(rail_generator: RailGenerator = None, line_generator: LineGenerator = None, number_of_agents=2, obs_builder_object: ~flatland.core.env_observation_builder.ObservationBuilder = <flatland.envs.observations.GlobalObsForRailEnv object>, malfunction_generator_and_process_data=None, malfunction_generator: MalfunctionGenerator = None, remove_agents_at_target=True, random_seed=None, timetable_generator=<function timetable_generator>, acceleration_delta: ~fractions.Fraction = Fraction(1, 1), braking_delta: ~fractions.Fraction = Fraction(-1, 1), rewards: ~flatland.envs.rewards.Rewards = None, effects_generator: ~flatland.core.effects_generator.EffectsGenerator[RailEnv] = None, distance_map: ~flatland.core.distance_map.AbstractDistanceMap = None)[source]#
Bases: Environment, Generic[UnderlyingTransitionMapType, UnderlyingResourceMapType, ConfigurationType]
AbstractRailEnv environment class.
RailEnv is an environment inspired by (a simplified version of) a rail network, in which agents (trains) have to navigate to their target locations in the shortest time possible, while at the same time cooperating to avoid bottlenecks.
The valid actions in the environment are:
0: do nothing (continue moving or stay still)
1: turn left at switch and move to the next cell; if the agent was not moving, movement is started
2: move to the next cell in front of the agent; if the agent was not moving, movement is started
3: turn right at switch and move to the next cell; if the agent was not moving, movement is started
4: stop moving
Moving forward in a dead-end cell makes the agent turn 180 degrees and step to the cell it came from.
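The action codes and the dead-end rule above can be sketched in a few lines. This is a stand-alone illustration, not Flatland code: the `Action` enum is a hypothetical mirror of the listed codes (Flatland's own type is the `RailEnvActions` named in the `step()` signature), and the direction encoding 0=North, 1=East, 2=South, 3=West is an assumption made for this example.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical mirror of the action codes listed above (not imported from Flatland)."""
    DO_NOTHING = 0
    MOVE_LEFT = 1
    MOVE_FORWARD = 2
    MOVE_RIGHT = 3
    STOP_MOVING = 4


def reverse_direction(direction: int) -> int:
    """Heading after a 180-degree turn, as in a dead-end cell.

    Assumes headings are encoded as 0=North, 1=East, 2=South, 3=West.
    """
    return (direction + 2) % 4


def starts_movement(action: Action) -> bool:
    """Actions 1, 2 and 3 start movement if the agent was standing still."""
    return action in (Action.MOVE_LEFT, Action.MOVE_FORWARD, Action.MOVE_RIGHT)
```

With this encoding, an agent heading North (0) that moves forward in a dead-end cell ends up heading South (2).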
In order for agents to be able to “understand” the simulation behaviour from the observations, the execution order of actions should not matter (i.e. not depend on the agent handle). However, the agent ordering is still used to resolve conflicts between two agents trying to move into the same cell, for example, head-on collisions, or agents “merging” at junctions. See MotionCheck for more details.
Stochastic malfunctioning of trains: Trains in RailEnv can malfunction if they are halted too often (either by their own choice or because an invalid action or cell is selected).
Every time an agent stops, it has a certain probability of malfunctioning. Malfunctions of trains follow a Poisson process with a certain rate. To keep complexity manageable, not all trains are affected by malfunctions during an episode.
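As a rough illustration of what "following a Poisson process with a certain rate" means in a step-based simulation, the sketch below converts a rate into a per-step probability and samples the steps at which a malfunction would start. It is not Flatland's malfunction generator: the function names, the per-step discretisation and the `rate` parameter are assumptions made for this example.

```python
import math
import random
from typing import List


def malfunction_probability(rate: float) -> float:
    """Probability of at least one event of a Poisson process with `rate` events per step."""
    return 1.0 - math.exp(-rate)


def sample_malfunction_steps(rate: float, n_steps: int, rng: random.Random) -> List[int]:
    """Return the step indices at which a (hypothetical) malfunction starts."""
    p = malfunction_probability(rate)
    return [t for t in range(n_steps) if rng.random() < p]
```

Passing a seeded `random.Random` instance keeps the sampled steps reproducible, which mirrors the purpose of the `random_seed` constructor argument.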
TODO: currently, the parameters that control the stochasticity of the environment are hard-coded in init(). For Round 2, they will be passed to the constructor as arguments, to allow for more flexibility.
Parameters#
- rail_generator: function
A function that takes the width, height and agent handles of a rail environment, along with the number of times the env has been reset, and returns a GridTransitionMap object and a list of starting positions, targets, and initial orientations for agent handles. The rail_generator can pass a distance map in the hints or information for specific line_generators. Implementations can be found in flatland/envs/rail_generators.py
- line_generator: function
A function that takes the grid, the number of agents and optional hints, and returns a list of starting positions, targets, initial orientations and maximum speeds for all agent handles. Implementations can be found in flatland/envs/line_generators.py
- number_of_agents: int
Number of agents to spawn on the map. In the future, this may become a range from which the number of agents is sampled.
- obs_builder_object: ObservationBuilder
ObservationBuilder-derived object that builds observation vectors for each agent.
- malfunction_generator_and_process_data: Tuple["MalfunctionGenerator", "MalfunctionProcessData"]
Deprecated. Use malfunction_generator option instead.
- malfunction_generator: "MalfunctionGenerator"
Convenience option to inject effects generator. Defaults to NoMalfunctionGen.
- remove_agents_at_target: bool
If set to true, agents are removed from the grid, by placing them at RailEnv.DEPOT_POSITION, once they have reached their target position.
- random_seed: int or None
If None, it is ignored; otherwise the random generators are seeded with this number to ensure that stochastic operations are replicable across multiple operations.
- timetable_generator
Timetable generator to be used in reset(). Defaults to “ttg.timetable_generator”.
- acceleration_delta: float
Determines how much the speed is increased by a MOVE_FORWARD action, up to the max_speed set by the train's Line (sampled from speed_ratios by the LineGenerator). As speed is between 0.0 and 1.0, acceleration_delta=1.0 restores the previous constant-speed behaviour (i.e. MOVE_FORWARD always sets the speed to the maximum allowed for the train).
- braking_delta: float
Determines how much the speed is decreased by a STOP_MOVING action. As speed is between 0.0 and 1.0, braking_delta=-1.0 restores the previous full-stop behaviour.
- rewards: DefaultRewards
The rewards function to use. Defaults to standard settings of Flatland 3 behaviour.
- effects_generator: Optional[EffectsGenerator["RailEnv"]]
The effects generator that can modify the env at the end of env reset, at the beginning of the env step and at the end of the env step.
- distance_map: AbstractDistanceMap
Use pre-computed distance map. Defaults to new distance map.
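The effect of acceleration_delta and braking_delta described above can be illustrated with a small stand-alone sketch. The helper names `accelerate` and `brake` are hypothetical, and clamping the result to the range from 0 to max_speed is an assumption inferred from the parameter descriptions, not a copy of Flatland's speed handling. Fractions are used because the AbstractRailEnv signature declares both deltas as Fraction.

```python
from fractions import Fraction


def accelerate(speed: Fraction, max_speed: Fraction,
               acceleration_delta: Fraction = Fraction(1, 1)) -> Fraction:
    """MOVE_FORWARD: raise the speed by acceleration_delta, capped at max_speed."""
    return min(max_speed, speed + acceleration_delta)


def brake(speed: Fraction, braking_delta: Fraction = Fraction(-1, 1)) -> Fraction:
    """STOP_MOVING: lower the speed by braking_delta (a negative value), floored at 0."""
    return max(Fraction(0), speed + braking_delta)
```

With the default deltas, one MOVE_FORWARD jumps straight to max_speed and one STOP_MOVING brings the train to a full stop; smaller deltas such as Fraction(1, 4) and Fraction(-1, 4) would change the speed gradually over several steps.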
- action_required(is_cell_entry)[source]#
Check if an agent needs to provide an action
Parameters#
- agent: RailEnvAgent
Agent we want to check
Returns#
True: Agent needs to provide an action
False: Agent cannot provide an action
- end_of_episode_update(have_all_agents_ended)[source]#
Updates made when the episode ends.
Parameters: have_all_agents_ended - Indicates if all agents have reached the done state
- get_agent_handles() List[int][source]#
Returns a list of agents’ handles to be used as keys in the step() function.
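Because the handles are the keys expected by step(), an action dictionary can be assembled as in the sketch below. It is self-contained and does not import Flatland: the `handles` list stands in for the return value of get_agent_handles(), `build_action_dict` and `policy` are hypothetical names, and plain integers are used in place of RailEnvActions members.

```python
from typing import Callable, Dict, List

MOVE_FORWARD = 2  # action code taken from the list of valid actions


def build_action_dict(handles: List[int], policy: Callable[[int], int]) -> Dict[int, int]:
    """Map every agent handle to the action chosen by `policy` (a hypothetical callable)."""
    return {handle: policy(handle) for handle in handles}


# Stand-in for env.get_agent_handles() on an environment with three agents.
handles = [0, 1, 2]
action_dict = build_action_dict(handles, lambda handle: MOVE_FORWARD)
```

The resulting dictionary has one entry per handle and has the shape that step(action_dict) takes as input.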
- get_info_dict()[source]#
Returns a dictionary of infos for all agents. Keys:
- action_required
- malfunction: counter value for malfunction; a value > 0 means the train is in malfunction
- speed: speed of the train
- state: state from the train's state machine
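A minimal sketch of how such an info dictionary might be consumed is shown below. The nested layout (each key mapping agent handles to values) is an assumption not stated on this page, the sample data is invented for illustration, and the state strings are placeholders rather than Flatland's actual state names.

```python
from typing import Any, Dict, List

# Hypothetical sample data; assumed layout: info[key][agent_handle] -> value.
sample_info: Dict[str, Dict[int, Any]] = {
    "action_required": {0: True, 1: False, 2: True},
    "malfunction": {0: 0, 1: 3, 2: 0},
    "speed": {0: 1.0, 1: 0.5, 2: 1.0},
    "state": {0: "placeholder_a", 1: "placeholder_b", 2: "placeholder_c"},
}


def agents_needing_action(info: Dict[str, Dict[int, Any]]) -> List[int]:
    """Handles of agents whose action_required flag is set."""
    return sorted(h for h, required in info["action_required"].items() if required)


def agents_in_malfunction(info: Dict[str, Dict[int, Any]]) -> List[int]:
    """Handles of agents whose malfunction counter is greater than zero."""
    return sorted(h for h, counter in info["malfunction"].items() if counter > 0)
```

A controller could use the first helper to skip agents that cannot act in the current step and the second to plan around blocked trains.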
- reset(regenerate_rail, regenerate_schedule, activate_agents, random_seed)[source]#
The method resets the rail environment
Parameters#
- regenerate_rail: bool, optional
regenerate the rails
- regenerate_schedule: bool, optional
regenerate the schedule and the static agents
- random_seed: int, optional
random seed for environment
Returns#
- observation_dict: Dict
Dictionary with an observation for each agent
- info_dict: Dict
Dictionary with agent-specific information
- step(action_dict: Dict[int, RailEnvActions])[source]#
Updates rewards for the agents at a step.
- class flatland.envs.rail_env.RailEnv(width, height, rail_generator: RailGenerator = None, line_generator: LineGenerator = None, number_of_agents=2, obs_builder_object: ~flatland.core.env_observation_builder.ObservationBuilder = <flatland.envs.observations.GlobalObsForRailEnv object>, malfunction_generator_and_process_data=None, malfunction_generator: MalfunctionGenerator = None, remove_agents_at_target=True, random_seed=None, record_steps=False, timetable_generator=<function timetable_generator>, acceleration_delta=1.0, braking_delta=-1.0, rewards: ~flatland.envs.rewards.Rewards = None, effects_generator: ~flatland.core.effects_generator.EffectsGenerator[RailEnv] = None)[source]#
Bases: AbstractRailEnv[GridTransitionMap, GridResourceMap, Tuple[Tuple[int, int], int]]
- clone_from(env: RailEnv, obs_builder: ObservationBuilder[RailEnv, Any] | None = None)[source]#
Clone an environment by resetting the simulation to its current state. See enlite-ai/maze
Parameters#
- env: Environment
The env to clone into self.
- record_timestep(dActions)[source]#
Record the positions and orientations of all agents in memory, in the cur_episode
- step(action_dict: Dict[int, RailEnvActions])[source]#
Updates rewards for the agents at a step.