autojob.advance package

Execute semi-automatic workflow progression.

Warning

This subpackage is actively under development and subject to breaking API changes.

Submodules

autojob.advance.advance module

Semi-automatically advance workflows.

Examples

Programmatically,

from pathlib import Path

from autojob.advance.advance import advance

# archive_mode has no default in the signature, so it is passed explicitly
advance(dir_name=Path.cwd(), archive_mode="json")

From the command-line,

autojob advance

autojob.advance.advance.add_item_to_parent(item_id: str, metadata_file: Path, key: Literal['Jobs', 'Calculations']) -> None

Add the given ID to the details.json of its parent.

Parameters:
  • item_id – The ID to add.

  • metadata_file – The path to the metadata file of the parent to which to add the item ID.

  • key – The key to which to add. Either "Jobs" or "Calculations".
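
The update described above can be sketched with the standard library alone. This is a minimal illustration, not autojob's implementation: the function name `add_id_to_metadata` is hypothetical, and it assumes the metadata file is a JSON object whose "Jobs" and "Calculations" entries are lists of ID strings.

```python
import json
from pathlib import Path
from typing import Literal


def add_id_to_metadata(
    item_id: str,
    metadata_file: Path,
    key: Literal["Jobs", "Calculations"],
) -> None:
    """Append item_id to the list stored under key (hypothetical layout)."""
    # Assumption: the file holds a JSON object mapping "Jobs" and
    # "Calculations" to lists of ID strings.
    data = json.loads(metadata_file.read_text(encoding="utf-8"))
    ids = data.setdefault(key, [])
    if item_id not in ids:  # avoid recording the same ID twice
        ids.append(item_id)
    metadata_file.write_text(json.dumps(data, indent=4), encoding="utf-8")
```

Checking for an existing entry before appending keeps the sketch idempotent; whether autojob itself deduplicates is not stated in this reference.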

autojob.advance.advance.advance(*, dir_name: Path, file_size_limit: float = 100000000.0, submit: bool = True, archive_mode: Literal['json', 'None'], legacy_mode: bool = False) -> list[tuple[Task, Path]]

Advance to the next task in the workflow.

Parameters:
  • dir_name – The directory of the completed calculation.

  • file_size_limit – The file size, in bytes, above which files will be deleted. Defaults to FILE_SIZE_LIMIT.

  • submit – Whether or not to submit the new job after creation. Defaults to True.

  • archive_mode – How to store the results. Either "json" or "None".

  • legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"

Returns:

A list of tuples (task_i, path_i) where task_i is the ith created Task and path_i is the Path representing the directory containing the ith created Task.

autojob.advance.advance.archive_task(dst: Path, task: Task, archive_mode: Literal['json'], study_dir: Path | None) -> Path
autojob.advance.advance.archive_task(dst: Path, task: Task, archive_mode: Literal['None'], study_dir: Path | None) -> None

Archive a completed Task and note its completion in the study record.

Parameters:
  • dst – A Path object indicating the directory in which to archive the Task.

  • task – The Task to archive.

  • archive_mode – The mode in which to archive the Task. “json” archives the Task as a .json file. “None” does not archive the Task.

  • study_dir – The root directory of the study to which task belongs. If None, then the task won’t be recorded in the study record.

Returns:

A Path representing the filename in which the completed task is dumped, if archive_mode = "json". Otherwise, None.
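
The two signatures above are typing overloads: the return type depends on the literal value of archive_mode. A minimal sketch of that pattern follows. The function name `archive`, the use of a plain dictionary in place of a Task, and the filename `task.json` are all assumptions made for illustration, and the study record update is omitted.

```python
import json
from pathlib import Path
from typing import Any, Literal, Optional, overload


@overload
def archive(dst: Path, task: dict[str, Any], archive_mode: Literal["json"]) -> Path: ...
@overload
def archive(dst: Path, task: dict[str, Any], archive_mode: Literal["None"]) -> None: ...


def archive(
    dst: Path,
    task: dict[str, Any],
    archive_mode: Literal["json", "None"],
) -> Optional[Path]:
    """Dump task to a .json file in dst, or do nothing for mode "None"."""
    if archive_mode == "None":
        return None
    archive_file = dst / "task.json"  # hypothetical filename
    archive_file.write_text(json.dumps(task, indent=4), encoding="utf-8")
    return archive_file
```

With the overloads in place, a type checker can infer that `archive(dst, task, "json")` yields a Path without the caller having to handle None.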

autojob.advance.advance.delete_large_files(old_job: Path, *, file_size_limit: float = 100000000.0, files_to_delete: list[str] | None = None) -> None

Delete large files from a copied job.

Parameters:
  • old_job – A pathlib.Path object representing the directory holding the large files to be deleted.

  • file_size_limit – A float specifying the file size in bytes over which files will be deleted. Defaults to FILE_SIZE_LIMIT.

  • files_to_delete – A list of strings specifying files to delete. Defaults to None, which is treated as an empty list.
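
The size-based cleanup described by these parameters can be sketched as follows. This is an illustration under stated assumptions rather than autojob's code: the name `remove_large_files` is hypothetical, the search is non-recursive, and files named in files_to_delete are removed regardless of size.

```python
from pathlib import Path
from typing import Optional

FILE_SIZE_LIMIT = 1e8  # bytes; mirrors the default shown in the signature


def remove_large_files(
    directory: Path,
    *,
    file_size_limit: float = FILE_SIZE_LIMIT,
    files_to_delete: Optional[list[str]] = None,
) -> list[Path]:
    """Delete named files and files larger than file_size_limit bytes."""
    names = set(files_to_delete or [])  # None is treated as "no names"
    removed = []
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue  # assumption: subdirectories are left untouched
        if path.name in names or path.stat().st_size > file_size_limit:
            path.unlink()
            removed.append(path)
    return removed
```

Unlike the documented function, which returns None, the sketch returns the deleted paths so that the effect is easy to inspect.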

autojob.advance.advance.get_next_steps(task: Task, study_dir: Path, *, restart: bool = False) -> list[str]

Get the UUIDs of the next steps in the workflow.

Parameters:
  • task – The previous task.

  • study_dir – The root directory of the study containing the completed task.

  • restart – Whether the task must be restarted. Defaults to False.

Returns:

A list of strings representing the steps that should be started since task has completed. If the task is to be restarted, the list will only contain a single string: the workflow step ID of the previous task.
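
The restart rule in the return description can be illustrated with a small sketch. The function name `next_steps` and the `successors` mapping (step ID to the IDs of the steps that follow it) are hypothetical; how autojob actually stores the workflow under study_dir is not described in this reference.

```python
def next_steps(
    current_step: str,
    successors: dict[str, list[str]],
    *,
    restart: bool = False,
) -> list[str]:
    """Return the step IDs to start once current_step has finished."""
    if restart:
        # A restarted task repeats its own workflow step.
        return [current_step]
    # Assumption: a step without recorded successors ends the workflow.
    return list(successors.get(current_step, []))


workflow = {"relax": ["vibrations", "dos"], "vibrations": []}
print(next_steps("relax", workflow))                # ['vibrations', 'dos']
print(next_steps("relax", workflow, restart=True))  # ['relax']
```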

autojob.advance.advance.populate_new_task_tree(*, previous_task_src: Path, new_task_dest: Path, new_task: Task, files_to_carry_over: list[str], legacy_mode: bool = False, is_restart: bool = False) -> None

Populate the directory tree of a new task.

This function will copy over files to carry over, write task metadata files (e.g., job.json and calculation.json) as well as copy the directories that are staged in a temporary directory.

Parameters:
  • previous_task_src – A Path object representing the directory of the completed Task.

  • new_task_dest – A Path object representing the destination directory of the new Task.

  • new_task – The new Task.

  • files_to_carry_over – A list of strings indicating the files to carry over from the previous Task.

  • legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"

  • is_restart – Whether or not the new task is a restart.
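
The carry-over and metadata-writing steps can be sketched with the standard library. This is a simplified illustration: the name `populate_tree` is hypothetical, a plain dictionary stands in for the new Task, only job.json is written, missing carry-over files are silently skipped, and the staged temporary directory is not modelled.

```python
import json
import shutil
from pathlib import Path
from typing import Any


def populate_tree(
    previous_task_src: Path,
    new_task_dest: Path,
    metadata: dict[str, Any],
    files_to_carry_over: list[str],
) -> None:
    """Copy selected files into the new task directory and write job.json."""
    new_task_dest.mkdir(parents=True, exist_ok=True)
    for name in files_to_carry_over:
        source = previous_task_src / name
        if source.is_file():  # assumption: absent files are skipped
            shutil.copy2(source, new_task_dest / name)
    # Assumption: the metadata file is a plain JSON dump of the dictionary.
    job_file = new_task_dest / "job.json"
    job_file.write_text(json.dumps(metadata, indent=4), encoding="utf-8")
```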

autojob.advance.advance.setup_task(*, task_type_spec: Annotated[type[Task], ImportString], parametrization: list[VariableReference[Any]], previous_task: Task, legacy_mode: bool = False, is_restart: bool = True) -> Task

Set up a new Task according to a parametrization.

Parameters:
  • src – The source directory for the new Task.

  • task_type_spec – A string representing the fully qualified class name of the type of Task to be created.

  • parametrization – The Parametrization for the new Task. Note that the metadata of the new Task will also be newly set regardless of the parametrization.

  • previous_task – The previous Task.

  • legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"

  • is_restart – Whether the task must be restarted. Defaults to True.

Returns:

The new Task instance.

autojob.advance.advance.submit_new_task(new_task: Path) -> None

Submit the newly created job to the Slurm scheduler.

Parameters:
  • new_task – A Path to the new task’s directory.

autojob.advance.advance.update_metadata_file(new_task: Path, study_dir: Path, *, legacy_mode: bool = False, restart: bool = False)

Update the metadata files for a newly created Task.

Parameters:
  • new_task – A Path representing the directory of the newly created Task in its final destination.

  • study_dir – The root directory of the study to which task belongs.

  • legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"

  • restart – Whether the metadata is for a completed task. Defaults to False.

autojob.advance.advance.update_task_metadata(task_shell: dict[str, Any], task_type: str, *, context: dict[str, Any], legacy_mode: bool = False) -> None

Update the task metadata for a Task shell.

This function modifies task_shell in place. Specifically, it sets the keys “study_group_id” and “study_id” to be the same as in context and creates a new “task_id”. The “tags” and “calculation_id” keys may also be set.

Parameters:
  • task_shell – A Task shell containing the key, task_metadata, which maps to a dictionary equivalent to what would be obtained with Task.create_shell().model_dump(exclude_none=True).

  • task_type – The class name of the type of Task to be created.

  • context – A dictionary containing a dumped model of the completed Task. The dictionary must have the key, “task_metadata”, which maps to a dictionary containing the keys “study_group_id” and “study_id”.

  • legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"
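
The in-place update described above can be sketched as follows. The names `new_task_id` and `update_metadata` are hypothetical, and the sketch covers only the three keys that are always set. The legacy ID follows the documented pattern r"j[A-Za-z0-9]{9}"; the use of a UUID outside legacy mode is an assumption, since this reference does not state the non-legacy ID format.

```python
import re
import secrets
import string
import uuid
from typing import Any

LEGACY_TASK_ID = re.compile(r"j[A-Za-z0-9]{9}")


def new_task_id(legacy_mode: bool = False) -> str:
    """Create a fresh task ID."""
    if legacy_mode:
        alphabet = string.ascii_letters + string.digits
        return "j" + "".join(secrets.choice(alphabet) for _ in range(9))
    return str(uuid.uuid4())  # assumption: non-legacy IDs are UUIDs


def update_metadata(
    task_shell: dict[str, Any],
    context: dict[str, Any],
    *,
    legacy_mode: bool = False,
) -> None:
    """Copy study identifiers from context and assign a new task_id."""
    metadata = task_shell["task_metadata"]
    previous = context["task_metadata"]
    metadata["study_group_id"] = previous["study_group_id"]
    metadata["study_id"] = previous["study_id"]
    metadata["task_id"] = new_task_id(legacy_mode)
```

Because the shell is mutated rather than copied, callers that need the original dictionary should pass in a deep copy.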