autojob.advance package
Execute semi-automatic workflow progression.
Warning
This subpackage is actively under development and subject to breaking API changes.
Submodules
autojob.advance.advance module
Semi-automatically advance workflows.
Examples
Programmatically,
from pathlib import Path
from autojob.advance.advance import advance
advance(dir_name=Path.cwd(), archive_mode="json")
From the command-line,
autojob advance
- autojob.advance.advance.add_item_to_parent(item_id: str, metadata_file: Path, key: Literal['Jobs', 'Calculations']) -> None
Add the given ID to the details.json of its parent.
- Parameters:
item_id – The ID to add.
metadata_file – The path to the metadata file of the parent to which to add the item ID.
key – The key under which to add the ID. Either "Jobs" or "Calculations".
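The behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the autojob implementation: the helper name add_id_to_metadata is hypothetical, and it assumes the metadata file is a JSON object whose "Jobs" and "Calculations" values are lists of IDs.

```python
import json
from pathlib import Path
from typing import Literal


def add_id_to_metadata(
    item_id: str,
    metadata_file: Path,
    key: Literal["Jobs", "Calculations"],
) -> None:
    """Append item_id to the list stored under key in a JSON metadata file."""
    # Assumption: the file contains a JSON object mapping keys to lists of IDs.
    data = json.loads(metadata_file.read_text(encoding="utf-8"))
    ids = data.setdefault(key, [])
    # Skipping duplicates is a choice made for this sketch only.
    if item_id not in ids:
        ids.append(item_id)
    metadata_file.write_text(json.dumps(data, indent=4), encoding="utf-8")
```

Calling the helper twice with the same ID leaves the file unchanged after the first call, which keeps repeated runs from growing the list.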
- autojob.advance.advance.advance(*, dir_name: Path, file_size_limit: float = 100000000.0, submit: bool = True, archive_mode: Literal['json', 'None'], legacy_mode: bool = False) -> list[tuple[Task, Path]]
Advance to the next task in the workflow.
- Parameters:
dir_name – The directory of the completed calculation.
file_size_limit – A float specifying the size threshold, in bytes, above which files will be deleted. Defaults to FILE_SIZE_LIMIT.
submit – Whether or not to submit the new job after creation. Defaults to True.
archive_mode – How to store the results.
legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"
- Returns:
A list of tuples (task_i, path_i) where task_i is the ith created Task and path_i is the Path representing the directory containing the ith created Task.
- autojob.advance.advance.archive_task(dst: Path, task: Task, archive_mode: Literal['json'], study_dir: Path | None) -> Path
- autojob.advance.advance.archive_task(dst: Path, task: Task, archive_mode: Literal['None'], study_dir: Path | None) -> None
Archive a completed Task and note its completion in the study record.
- Parameters:
dst – A Path object indicating the directory in which to archive the Task.
task – The Task to archive.
archive_mode – The mode in which to archive the Task. "json" archives the Task as a .json file. "None" does not archive the Task.
study_dir – The root directory of the study to which task belongs. If None, then the task won’t be recorded in the study record.
- Returns:
A Path representing the file in which the completed task is dumped, if archive_mode = "json". Otherwise, None.
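The two signatures above are typing overloads: the return type depends on the literal value of archive_mode. The sketch below shows that pattern on a simplified, hypothetical function named archive, which takes a plain dictionary in place of a Task and writes to a made-up file name, task.json. It does not touch the study record and is not the autojob implementation.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any, Literal, overload


@overload
def archive(dst: Path, record: dict[str, Any], archive_mode: Literal["json"]) -> Path: ...
@overload
def archive(dst: Path, record: dict[str, Any], archive_mode: Literal["None"]) -> None: ...


def archive(
    dst: Path,
    record: dict[str, Any],
    archive_mode: Literal["json", "None"],
) -> Path | None:
    """Dump record as JSON inside dst, or do nothing when archive_mode is "None"."""
    if archive_mode == "None":
        return None
    # Hypothetical file name, chosen for this sketch only.
    out = dst / "task.json"
    out.write_text(json.dumps(record, indent=4), encoding="utf-8")
    return out
```

With overloads like these, a type checker can infer that a call with archive_mode="json" yields a Path, so callers need no None check in that branch.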
- autojob.advance.advance.delete_large_files(old_job: Path, *, file_size_limit: float = 100000000.0, files_to_delete: list[str] | None = None) -> None
Delete large files from a copied job.
- Parameters:
old_job – A pathlib.Path object representing the directory holding the large files to be deleted.
file_size_limit – A float specifying the file size in bytes over which files will be deleted. Defaults to FILE_SIZE_LIMIT.
files_to_delete – A list of strings specifying files to delete. Defaults to None, which is treated as an empty list.
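A size-based cleanup of this kind could look roughly like the following. The helper name remove_large_files is hypothetical, and two behaviours are assumptions rather than documented facts: only files directly inside the directory are inspected, and files named in files_to_delete are removed regardless of their size.

```python
from __future__ import annotations

from pathlib import Path

# Same value as the default shown in the signature above (bytes).
FILE_SIZE_LIMIT = 1e8


def remove_large_files(
    job_dir: Path,
    *,
    file_size_limit: float = FILE_SIZE_LIMIT,
    files_to_delete: list[str] | None = None,
) -> list[Path]:
    """Delete oversized or explicitly named files directly inside job_dir."""
    names = set(files_to_delete or [])
    removed = []
    for path in sorted(job_dir.iterdir()):
        if not path.is_file():
            continue  # assumption: subdirectories are left untouched
        if path.name in names or path.stat().st_size > file_size_limit:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the removed paths is a convenience of this sketch; the documented function returns None.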
- autojob.advance.advance.get_next_steps(task: Task, study_dir: Path, *, restart: bool = False) -> list[str]
Get the UUIDs of the next steps in the workflow.
- Parameters:
task – The previous task.
study_dir – The root directory of the study containing the completed task.
restart – Whether the task must be restarted. Defaults to False.
- Returns:
A list of strings representing the steps that should be started since task has completed. If the task is to be restarted, the list will only contain a single string: the workflow step ID of the previous task.
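The restart rule in the return description can be illustrated with a toy workflow. The sketch assumes, purely for illustration, that the workflow is available as a dictionary mapping each step ID to the IDs of the steps that follow it; the function name next_steps and that data layout are hypothetical and say nothing about how autojob stores workflows.

```python
from __future__ import annotations


def next_steps(
    current_step: str,
    workflow: dict[str, list[str]],
    *,
    restart: bool = False,
) -> list[str]:
    """Return the step IDs to start once current_step has finished."""
    if restart:
        # A restarted task repeats its own workflow step.
        return [current_step]
    # Steps without successors end the workflow: nothing to start.
    return list(workflow.get(current_step, []))
```

For a workflow such as {"relax": ["vib", "dos"]}, finishing "relax" gives ["vib", "dos"], while restart=True gives ["relax"].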
- autojob.advance.advance.populate_new_task_tree(*, previous_task_src: Path, new_task_dest: Path, new_task: Task, files_to_carry_over: list[str], legacy_mode: bool = False, is_restart: bool = False) -> None
Populate the directory tree of a new task.
This function copies the files to carry over, writes the task metadata files (e.g., job.json and calculation.json), and copies the directories that are staged in a temporary directory.
- Parameters:
previous_task_src – A Path object representing the directory of the completed Task.
new_task_dest – A Path object representing the destination directory of the new Task.
new_task – The new Task.
files_to_carry_over – A list of strings indicating the files to carry over from the previous Task.
legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"
is_restart – Whether or not the new task is a restart.
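The carry-over step of the description can be sketched on its own as below. The helper carry_over_files is hypothetical, handles only the file copying (not the metadata files or staged directories), and silently skips names that do not exist in the source directory, which is an assumption of this sketch.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def carry_over_files(
    src: Path,
    dest: Path,
    files_to_carry_over: list[str],
) -> list[Path]:
    """Copy the named files from src into dest and return the new paths."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in files_to_carry_over:
        source = src / name
        # Assumption: missing files are skipped instead of raising an error.
        if source.is_file():
            # copy2 also preserves timestamps and permission bits.
            copied.append(Path(shutil.copy2(source, dest / name)))
    return copied
```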
- autojob.advance.advance.setup_task(*, task_type_spec: Annotated[type[Task], ImportString], parametrization: list[VariableReference[Any]], previous_task: Task, legacy_mode: bool = False, is_restart: bool = True) -> Task
Set up a new Task according to a parametrization.
- Parameters:
src – The source directory for the new Task.
task_type_spec – A string representing the fully qualified class name of the type of Task to be created.
parametrization – The Parametrization for the new Task. Note that the metadata of the new Task will also be newly set regardless of the parametrization.
previous_task – The previous Task.
legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"
is_restart – Whether the task must be restarted. Defaults to True.
- Returns:
The new Task instance.
- autojob.advance.advance.submit_new_task(new_task: Path) -> None
Submit the newly created job to the Slurm scheduler.
- Parameters:
new_task – A Path to the new task’s directory.
- autojob.advance.advance.update_metadata_file(new_task: Path, study_dir: Path, *, legacy_mode: bool = False, restart: bool = False)
Update the metadata files for a newly created Task.
- Parameters:
new_task – A Path representing the directory of the newly created Task in its final destination.
study_dir – The root directory of the study to which task belongs.
legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"
restart – Whether the metadata is for a completed task. Defaults to False.
- autojob.advance.advance.update_task_metadata(task_shell: dict[str, Any], task_type: str, *, context: dict[str, Any], legacy_mode: bool = False) -> None
Update the task metadata for a Task shell.
This function modifies task_shell in place. Specifically, it sets the keys "study_group_id" and "study_id" to the same values as in context and creates a new "task_id". The "tags" and "calculation_id" keys may also be set.
- Parameters:
task_shell – A Task shell containing the key, task_metadata, which maps to a dictionary equivalent to what would be obtained with Task.create_shell().model_dump(exclude_none=True).
task_type – The class name of the type of Task to be created.
context – A dictionary containing a dumped model of the completed Task. The dictionary must have the key, "task_metadata", which maps to a dictionary containing the keys "study_group_id" and "study_id".
legacy_mode – Whether or not to use the legacy directory structure. Additional features of legacy mode include: 1) tasks have a non-None calculation ID, 2) task_id has the form r"j[A-Za-z0-9]{9}"
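The in-place update described above can be sketched as follows. The names copy_study_ids and new_task_id are hypothetical. The legacy ID follows the documented pattern r"j[A-Za-z0-9]{9}", whereas using a UUID4 string for non-legacy IDs is an assumption of this sketch. The "tags" and "calculation_id" keys are left out.

```python
from __future__ import annotations

import re
import secrets
import string
import uuid
from typing import Any

# Pattern documented for legacy task IDs.
LEGACY_ID = re.compile(r"j[A-Za-z0-9]{9}")
_ALPHANUMERIC = string.ascii_letters + string.digits


def new_task_id(*, legacy_mode: bool = False) -> str:
    """Create a task ID; the non-legacy UUID4 format is an assumption."""
    if legacy_mode:
        return "j" + "".join(secrets.choice(_ALPHANUMERIC) for _ in range(9))
    return str(uuid.uuid4())


def copy_study_ids(
    task_shell: dict[str, Any],
    context: dict[str, Any],
    *,
    legacy_mode: bool = False,
) -> None:
    """Modify task_shell in place using the metadata of the completed task."""
    metadata = task_shell["task_metadata"]
    previous = context["task_metadata"]
    metadata["study_group_id"] = previous["study_group_id"]
    metadata["study_id"] = previous["study_id"]
    metadata["task_id"] = new_task_id(legacy_mode=legacy_mode)
```

Because the function returns None and mutates its argument, callers keep working with the same task_shell dictionary after the call.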