flatland.evaluators.service module#

class flatland.evaluators.service.FlatlandRemoteEvaluationService(test_env_folder=None, flatland_rl_service_id='T12345', remote_host='127.0.0.1', remote_port=6379, remote_db=0, remote_password=None, visualize=False, video_generation_envs=[], report=None, verbose=False, action_dir=None, episode_dir=None, analysis_data_dir=None, merge_dir=None, use_pickle=False, shuffle=False, missing_only=False, result_output_path=None, disable_timeouts=False)[source]#

Bases: object

A remote evaluation service which exposes the following interfaces of a RailEnv:

- env_create
- env_step

and an additional env_submit to cater to score computation and on-episode-complete post-processing.

This service is designed to be used in conjunction with FlatlandRemoteClient. Both the service and the client maintain their own local instance of the RailEnv, and in case of any unexpected divergence between the two states, the local RailEnv instance of the FlatlandRemoteEvaluationService acts as the single source of truth.

Both the client and the remote service communicate with each other via Redis as a message broker. The individual messages are packed and unpacked with msgpack (patched so that it also supports numpy arrays).
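
For reference, here is a minimal sketch of the client-side loop this service is designed to serve. It assumes the standard FlatlandRemoteClient interface (env_create / env_step / submit) and a Redis broker reachable at the address this service was started with; the random policy is a hypothetical stand-in:

    import random

    from flatland.evaluators.client import FlatlandRemoteClient
    from flatland.envs.observations import TreeObsForRailEnv

    remote_client = FlatlandRemoteClient()  # connects to the same Redis broker as the service
    obs_builder = TreeObsForRailEnv(max_depth=2)

    def my_controller(observation):
        # Hypothetical stand-in policy: a random action for every agent.
        return {agent: random.randint(0, 4) for agent in observation}

    while True:
        # Ask the service for the next evaluation env; a falsy observation
        # signals that all test envs have been served.
        observation, info = remote_client.env_create(obs_builder_object=obs_builder)
        if not observation:
            break
        while True:
            observation, all_rewards, done, info = remote_client.env_step(my_controller(observation))
            if done['__all__']:
                break

    remote_client.submit()  # triggers score computation and episode post-processing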

collect_analysis_data()[source]#

Collects data at the end of an episode; the data is saved to a JSON file corresponding to that episode.

compute_mean_scores()[source]#
delete_key_in_running_stats(key)[source]#

Deletes a particular key from the running stats dictionary, if it exists.

get_env_filepaths()[source]#

Gathers a list of all available rail env files to be used for evaluation. The folder structure expected at the test_env_folder is similar to:

    .
    ├── Test_0
    │   ├── Level_1.pkl
    │   ├── …….
    │   ├── …….
    │   └── Level_99.pkl
    └── Test_1
        ├── Level_1.pkl
        ├── …….
        ├── …….
        └── Level_99.pkl
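
The lookup itself is straightforward; a rough, hypothetical equivalent (not necessarily the actual implementation) would be:

    import glob
    import os

    def get_env_filepaths(test_env_folder):
        # Collect every Level_*.pkl under every Test_* directory and
        # return them in a deterministic order.
        paths = glob.glob(os.path.join(test_env_folder, "Test_*", "Level_*.pkl"))
        return sorted(paths)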

get_env_test_and_level(filename)[source]#
get_next_command()[source]#

A helper function to obtain the next command. It transparently handles unpacking the command from the packed message and respects the configured timeouts when trying to fetch a new command.
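
At the protocol level this roughly amounts to a blocking pop on a Redis list followed by msgpack decoding. The sketch below assumes redis-py and msgpack-numpy, and a hypothetical command_channel key:

    import msgpack
    import msgpack_numpy
    import redis

    msgpack_numpy.patch()  # let msgpack transparently (de)serialize numpy arrays

    def fetch_next_command(r: redis.Redis, command_channel: str, timeout: int = 10):
        # BRPOP blocks until a message arrives or the timeout expires.
        message = r.brpop(command_channel, timeout=timeout)
        if message is None:
            raise TimeoutError("No command received within {} seconds".format(timeout))
        _key, payload = message
        return msgpack.unpackb(payload, raw=False)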

get_redis_connection()[source]#

Obtains a new Redis connection from the previously instantiated Redis connection pool.

handle_aicrowd_error_event(payload)[source]#
handle_aicrowd_info_event(payload)[source]#
handle_aicrowd_success_event(payload)[source]#
handle_env_create(command)[source]#

Handles an ENV_CREATE command from the client.

handle_env_step(command)[source]#

Handles an ENV_STEP command from the client. TODO: add a high-level summary of everything that is happening here.

handle_env_submit(command)[source]#

Handles an ENV_SUBMIT command from the client. TODO: add a high-level summary of everything that is happening here.

handle_ping(command)[source]#

Handles a PING command from the client.

instantiate_evaluation_metadata()[source]#

Instantiates a pandas DataFrame to record information specific to each individual env evaluation.

It loads the template CSV with pre-filled information from the provided metadata.csv file and fills it in with evaluation runtime information.
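
As a hedged sketch of this pattern (the actual template and its column names come from the metadata.csv file shipped with the test envs; the columns below are illustrative assumptions):

    import numpy as np
    import pandas as pd

    # Load the pre-filled template that accompanies the test envs...
    metadata_df = pd.read_csv("metadata.csv")

    # ...then add empty columns which the service fills in as each env is evaluated.
    for column in ["reward", "normalized_reward", "percentage_complete", "steps"]:
        metadata_df[column] = np.nan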

instantiate_redis_connection_pool()[source]#

Instantiates a Redis connection pool which can be used to communicate with the message broker.
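
With redis-py this corresponds to something like the following, where the pool parameters mirror the constructor arguments of the service:

    import redis

    redis_pool = redis.ConnectionPool(
        host="127.0.0.1",
        port=6379,
        db=0,
        password=None,
    )

    # get_redis_connection() then hands out connections backed by this pool:
    redis_conn = redis.Redis(connection_pool=redis_pool)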

report_error(error_message, command_response_channel)[source]#

A helper function used to report an error back to the client.

run()[source]#

Main runner function which waits for commands from the client and acts accordingly.

save_actions()[source]#
save_analysis_data()[source]#
save_episode()[source]#
save_merged_env()[source]#
send_error(error_dict, suppress_logs=False)[source]#

Sends an out-of-band error, such as a timeout, for which there is no originating command and hence no response channel.

send_response(_command_response, command, suppress_logs=False)[source]#
update_evaluation_metadata()[source]#

Called when moving from one simulation to the next; it updates the simulation-specific information for the previous episode in metadata_df, if it exists.

update_running_stats(key, scalar)[source]#

Computes the running min/mean/max for the given parameter.
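
A minimal sketch of such an incremental update (assuming the aggregates are kept in a plain dictionary, which is an assumption about the internal layout):

    def update_running_stats(stats, key, scalar):
        # stats maps e.g. "reward" -> {"min": ..., "max": ..., "mean": ..., "n": ...}
        entry = stats.setdefault(key, {"min": scalar, "max": scalar, "mean": 0.0, "n": 0})
        entry["min"] = min(entry["min"], scalar)
        entry["max"] = max(entry["max"], scalar)
        entry["n"] += 1
        # Incremental mean: m_n = m_{n-1} + (x_n - m_{n-1}) / n
        entry["mean"] += (scalar - entry["mean"]) / entry["n"]

Only the current aggregates are stored, so no history of past scalars needs to be kept.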