flatland.trajectories.policy_runner module


class flatland.trajectories.policy_runner.PolicyRunner

Bases: object

static create_from_policy(policy: Policy, data_dir: Path, env: RailEnv | None = None, n_agents=7, x_dim=30, y_dim=30, n_cities=2, max_rail_pairs_in_city=4, grid_mode=False, max_rails_between_cities=2, malfunction_duration_min=20, malfunction_duration_max=50, malfunction_interval=540, speed_ratios=None, seed=42, obs_builder: ObservationBuilder | None = None, snapshot_interval: int = 1, ep_id: str | None = None, callbacks: FlatlandCallbacks | None = None, tqdm_kwargs: dict | None = None, start_step: int = 0, end_step: int | None = None, fork_from_trajectory: Trajectory | None = None) → Trajectory

Creates a trajectory by running a submission (policy and observation builder).

Always backs up the actions and positions of the executed steps in the TSV files. Can start from an existing trajectory.

Parameters

policy: Policy: the submission’s policy
data_dir: Path: the path to write the trajectory to
env: RailEnv: directly inject env, skip env generation
n_agents: int: number of agents
x_dim: int: number of columns
y_dim: int: number of rows
n_cities: int: Max number of cities to build. The generator tries to achieve this number given all the parameters. Goes into sparse_rail_generator.
max_rail_pairs_in_city: int: Number of parallel tracks in the city. This represents the number of tracks in the train stations. Goes into sparse_rail_generator.
grid_mode: bool: How to distribute the cities in the path, either equally in a grid or random. Goes into sparse_rail_generator.
max_rails_between_cities: int: Max number of rails connecting to a city. This is only the number of connection points at the city border.
malfunction_duration_min: int: Minimal duration of malfunction. Goes into ParamMalfunctionGen.
malfunction_duration_max: int: Max duration of malfunction. Goes into ParamMalfunctionGen.
malfunction_interval: int: Inverse of rate of malfunction occurrence. Goes into ParamMalfunctionGen.
speed_ratios: Dict[float, float]: Speed ratios of all agents. They are probabilities of all different speeds and have to add up to 1. Goes into sparse_line_generator. Defaults to {1.0: 0.25, 0.5: 0.25, 0.33: 0.25, 0.25: 0.25}.
seed: int: Seed for the random number generators. Goes into reset.
obs_builder: Optional[ObservationBuilder]: Defaults to TreeObsForRailEnv(max_depth=3, predictor=ShortestPathPredictorForRailEnv(max_depth=50))
snapshot_interval: int: interval to write pkl snapshots
ep_id: str: episode ID to store data under. If not provided, generate one.
callbacks: FlatlandCallbacks: callbacks to run during trajectory creation
tqdm_kwargs: dict: additional kwargs for tqdm
start_step: int: step from which to start evaluation, inclusive (requires a snapshot to be present). Actions are taken from start_step, so the first step executed is start_step + 1. Defaults to 0, with first elapsed step 1.
end_step: int: step at which to stop evaluation, exclusive. Capped by the env’s max_episode_steps.
fork_from_trajectory: Trajectory: copy data from this trajectory up to start_step and run the policy from there on
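
Two of the parameters above carry constraints that are easy to get wrong: speed_ratios must be probabilities that add up to 1, and start_step / end_step describe a half-open window capped by the env’s max_episode_steps. The following is a minimal sketch of one reading of those descriptions, not flatland’s implementation. The helper names check_speed_ratios and action_steps, and the max_episode_steps argument, are hypothetical and exist only for illustration.

```python
import math
from typing import Dict, Optional

# Default taken from the speed_ratios parameter description above.
DEFAULT_SPEED_RATIOS: Dict[float, float] = {1.0: 0.25, 0.5: 0.25, 0.33: 0.25, 0.25: 0.25}


def check_speed_ratios(speed_ratios: Optional[Dict[float, float]] = None) -> Dict[float, float]:
    """Hypothetical helper (not part of flatland): validate a speed -> probability map."""
    ratios = DEFAULT_SPEED_RATIOS if speed_ratios is None else speed_ratios
    if any(p < 0 for p in ratios.values()):
        raise ValueError("probabilities must be non-negative")
    # Tolerate floating-point rounding when summing the probabilities.
    if not math.isclose(sum(ratios.values()), 1.0):
        raise ValueError("probabilities must add up to 1")
    return dict(ratios)


def action_steps(start_step: int = 0,
                 end_step: Optional[int] = None,
                 max_episode_steps: int = 100) -> range:
    """Hypothetical helper: steps at which actions are taken.

    Assumes start_step is inclusive, end_step is exclusive, and end_step is
    capped by max_episode_steps. An action taken at step s leads to elapsed
    step s + 1.
    """
    stop = max_episode_steps if end_step is None else min(end_step, max_episode_steps)
    return range(start_step, stop)


if __name__ == "__main__":
    print(check_speed_ratios())
    print(list(action_steps(start_step=0, end_step=3, max_episode_steps=10)))
```

Under this reading, the defaults (start_step=0, end_step=None) cover every step up to the episode limit, and a run resumed with start_step=5 takes its first action at step 5 and executes step 6 first.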

Returns

Trajectory
