autojob.next package

Utilities for creating tasks from existing task directories.

autojob.next.add_item_to_parent(item_id: str, metadata_file: Path, key: Literal['tasks', 'task_groups']) None[source]

Add the given ID to the details.json of its parent.

Parameters:
  • item_id – The ID to add.

  • metadata_file – The path to the metadata file of the parent to which to add the item ID.

  • key – The key to which to add. Either "tasks" or "task_groups".
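A minimal sketch of the kind of update described above, assuming the metadata file is a JSON object whose "tasks" and "task_groups" entries are lists of IDs. The helper name add_id_sketch and the duplicate check are illustrative assumptions, not autojob's implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import Literal


def add_id_sketch(
    item_id: str,
    metadata_file: Path,
    key: Literal["tasks", "task_groups"],
) -> None:
    """Append item_id to the list stored under key in a JSON metadata file."""
    with metadata_file.open(encoding="utf-8") as file:
        metadata = json.load(file)

    # Create the list if the key is missing; skip IDs already recorded
    # (the duplicate check is an assumption made for this sketch).
    ids = metadata.setdefault(key, [])
    if item_id not in ids:
        ids.append(item_id)

    with metadata_file.open("w", encoding="utf-8") as file:
        json.dump(metadata, file, indent=4)


# Usage: register a hypothetical task ID in a throwaway details.json.
with tempfile.TemporaryDirectory() as tmp:
    parent_metadata = Path(tmp) / "details.json"
    parent_metadata.write_text(json.dumps({"tasks": []}), encoding="utf-8")
    add_id_sketch("task-0001", parent_metadata, "tasks")
    updated = json.loads(parent_metadata.read_text(encoding="utf-8"))
```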

autojob.next.carry_over_files(*, previous_task_src: Path, new_task_dest: Path, new_task: TaskBase, files_to_carry_over: list[str] | None = None) None[source]

Copy files from a previous task directory to a new task directory.

Parameters:
  • previous_task_src – A Path object representing the directory of the completed task.

  • new_task_dest – A Path object representing the destination directory of the new task.

  • new_task – The new task.

  • files_to_carry_over – A list of strings indicating the files to carry over from the previous task. Defaults to an empty list.
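The copy step can be sketched with the standard library alone. The helper carry_over_sketch below is hypothetical: it omits the new_task argument, and silently skipping names that do not exist in the source directory is an assumption of this sketch rather than documented autojob behaviour.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def carry_over_sketch(
    previous_task_src: Path,
    new_task_dest: Path,
    files_to_carry_over: list[str] | None = None,
) -> list[Path]:
    """Copy the named files that exist in the source into the destination."""
    copied = []
    for name in files_to_carry_over or []:
        source = previous_task_src / name
        if source.is_file():
            # copy2 also preserves metadata such as modification times.
            copied.append(Path(shutil.copy2(source, new_task_dest / name)))
    return copied


# Usage: only the file that exists in the old directory is carried over.
with tempfile.TemporaryDirectory() as tmp:
    old_dir = Path(tmp) / "old"
    new_dir = Path(tmp) / "new"
    old_dir.mkdir()
    new_dir.mkdir()
    (old_dir / "restart.dat").write_text("data", encoding="utf-8")
    copied = carry_over_sketch(old_dir, new_dir, ["restart.dat", "missing.txt"])
    copied_names = [path.name for path in copied]
```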

autojob.next.clean_up_task(old_job: Path, *, file_size_limit: float = 100000000.0, files_to_delete: list[str] | None = None) None[source]

Delete large files from a copied job.

Parameters:
  • old_job – A Path object representing the directory holding the large files to be deleted.

  • file_size_limit – A float specifying the file size in bytes over which files will be deleted. Defaults to FILE_SIZE_LIMIT.

  • files_to_delete – A list of strings specifying files to delete. Defaults to an empty list.
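A rough sketch of size-based clean-up, assuming that files_to_delete holds plain file names and that only the top level of the directory is scanned. Both points are assumptions, and clean_up_sketch and SIZE_LIMIT_SKETCH are illustrative names rather than autojob's own.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Mirrors the default shown in the signature above (bytes).
SIZE_LIMIT_SKETCH = 100_000_000.0


def clean_up_sketch(
    old_job: Path,
    *,
    file_size_limit: float = SIZE_LIMIT_SKETCH,
    files_to_delete: list[str] | None = None,
) -> list[str]:
    """Delete named files and files larger than the limit; return their names."""
    names = set(files_to_delete or [])
    deleted = []
    for path in sorted(old_job.iterdir()):
        if not path.is_file():
            continue
        if path.name in names or path.stat().st_size > file_size_limit:
            path.unlink()
            deleted.append(path.name)
    return deleted


# Usage: a 1000-byte limit removes big.bin; scratch.tmp is removed by name.
with tempfile.TemporaryDirectory() as tmp:
    job = Path(tmp)
    (job / "small.txt").write_bytes(b"x" * 10)
    (job / "big.bin").write_bytes(b"x" * 2000)
    (job / "scratch.tmp").write_bytes(b"x" * 5)
    deleted = clean_up_sketch(
        job, file_size_limit=1000, files_to_delete=["scratch.tmp"]
    )
    remaining = sorted(path.name for path in job.iterdir())
```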

autojob.next.create_next_step(*, src: Path, step: Step, previous_task: TaskBase, file_size_limit: float = 100000000.0, submit: bool = True, restart: bool = False, name_template: str | None = None) list[tuple[TaskBase, Path]][source]

Initiate a step by creating all tasks that are ready to start.

Parameters:
  • src – The source directory for the new tasks. That is, the directory containing the recently completed task.

  • step – The Step to initiate.

  • previous_task – The previous task.

  • file_size_limit – A float specifying the file size in bytes above which files will be deleted from the source directory. Defaults to FILE_SIZE_LIMIT.

  • submit – Whether or not to submit the new TaskBases after creation. Defaults to True.

  • restart – Whether or not the task to be created is a restart of a previous task. Defaults to False.

  • name_template – A template to use for the directory name. Defaults to None in which case the task ID will be used.

Returns:

A list of 2-tuples (task, path) where task is the new TaskBase instance and path is the path in which it was dumped. For new task groups, path will point to the new task group directory.

autojob.next.create_task_group_tree(task: TaskBase, dest: Path, *, src: Path | None = None, name_template: str | None = None) Path[source]

Create a new task group directory.

In addition to directory creation, this method will create a task group metadata file and copy directory permissions and ownership to the new directory.

Parameters:
  • task – The new task for which the task group directory will be made.

  • dest – The directory in which the new task group directory will be created.

  • src – The directory of the completed task. Defaults to None in which case permissions and ownership are not set.

  • name_template – A template to use for the directory name. Defaults to None in which case the task group ID will be used.

Raises:

ValueError – Cannot create new task group without task group ID.

Returns:

The path to the newly created task group directory.
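Copying permissions and ownership from the completed task's directory might look like the following sketch. It uses only standard-library calls. The helper make_dir_like is hypothetical, it does not write the task group metadata file, and ignoring a failed ownership change is an assumption of the sketch.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_dir_like(src: Path, new_dir: Path) -> Path:
    """Create new_dir and give it the permission bits and owner of src."""
    new_dir.mkdir()
    shutil.copymode(src, new_dir)  # copy permission bits only

    info = src.stat()
    if hasattr(os, "chown"):  # os.chown is not available on Windows
        try:
            os.chown(new_dir, info.st_uid, info.st_gid)
        except PermissionError:
            # Changing ownership may need elevated privileges; this sketch
            # leaves the default ownership in that case.
            pass
    return new_dir


# Usage: the new directory ends up with the same mode as the finished one.
with tempfile.TemporaryDirectory() as tmp:
    finished = Path(tmp) / "finished_task"
    finished.mkdir()
    finished.chmod(0o750)
    group_dir = make_dir_like(finished, Path(tmp) / "new_group")
    same_mode = stat.S_IMODE(group_dir.stat().st_mode) == stat.S_IMODE(
        finished.stat().st_mode
    )
```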

autojob.next.create_task_tree(task: TaskBase, dest: Path, *, src: Path | None = None, files_to_carry_over: list[str] | None = None, name_template: str | None = None) Path[source]

Create a new task directory.

Parameters:
  • task – The new task for which the directory will be made.

  • dest – A Path representing the directory in which to create the directory of the new task.

  • src – The source directory for the new task. Defaults to None in which case no files will be carried over.

  • files_to_carry_over – A list of strings indicating the files to carry over from the previous task. Defaults to None.

  • name_template – A template to use for the directory name. Defaults to None in which case the task ID will be used.

Returns:

The path to the newly created task directory.

autojob.next.finalize_task(src: Path, task: TaskBase, record_task: bool = False) None[source]

Archive a completed task and note its completion in the study record.

Parameters:
  • src – A Path object indicating in which directory to archive the task.

  • task – The task to finalize.

  • record_task – Whether or not to record the completion of the task in the study record. Defaults to False.

autojob.next.initialize_task(*, task_class: str, parametrization: list[VariableReference[Any]], previous_task: TaskBase, restart: bool = False) TaskBase[source]

Set up a new task according to a parametrization.

Parameters:
  • task_class – A string representing the fully qualified class name of the type of task to be created.

  • parametrization – The parametrization for the new task.

  • previous_task – The previous task.

  • restart – Whether or not the task to be created is a restart of a previous task. Defaults to False.

Note

parametrization must specify all parameters which are to be inherited from previous_task. Any parameters that are not set by parametrization will assume their default values.

Returns:

The new TaskBase instance.

autojob.next.submit_new_task(new_task: Path) None[source]

Submit the newly created job to the Slurm scheduler.

Parameters:

new_task – A Path to the new task’s directory.

autojob.next.substitute_context(mods: dict[str, Any], context: dict[str, Any]) dict[str, Any][source]

Substitute context values into formatted strings.

Parameters:
  • mods – A dictionary mapping parameter names to values. String values will be substituted according to context values.

  • context – A dictionary mapping variable names to their values. Variables with names corresponding to template names will be substituted.

Returns:

A copy of mods with templated values substituted for their variable values.
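The template syntax is not specified here, so the following sketch assumes str.format-style placeholders such as "{structure}". The helper substitute_sketch and the example keys are hypothetical. Non-string values are passed through unchanged and the input dictionary is not modified.

```python
from typing import Any


def substitute_sketch(
    mods: dict[str, Any], context: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of mods with string values formatted using context."""
    substituted = dict(mods)
    for key, value in mods.items():
        if isinstance(value, str):
            # Assumes "{name}" placeholders; a placeholder that is missing
            # from context raises KeyError in this sketch.
            substituted[key] = value.format(**context)
    return substituted


# Usage with hypothetical parameter and variable names.
mods = {"label": "{structure}-relax", "kpts": 4}
context = {"structure": "Cu111"}
result = substitute_sketch(mods, context)
```

Here result maps "label" to "Cu111-relax" and leaves "kpts" at 4.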

Submodules

autojob.next.advance module

Semi-automatically advance workflows.

Examples

Programmatically,

from pathlib import Path

from autojob.next.advance import advance

advance(src=Path.cwd())

From the command-line,

autojob advance

autojob.next.advance.advance(*, src: Path, file_size_limit: float = 100000000.0, submit: bool = True) list[tuple[TaskBase, Path]][source]

Advance to the next task in the workflow.

Parameters:
  • src – The directory of the completed calculation.

  • file_size_limit – A float specifying the file size in bytes above which files will be deleted. Defaults to FILE_SIZE_LIMIT.

  • submit – Whether or not to submit the new job after creation. Defaults to True.

Raises:

RuntimeError – Task failed! Cannot advance to next step!

Returns:

A list of tuples (task_i, path_i) where task_i is the ith created task and path_i is the Path representing the directory containing the ith created task.

autojob.next.advance.get_next_steps(task: TaskBase, study_dir: Path) list[str][source]

Get the UUIDs of the next steps in the workflow.

Parameters:
  • task – The previous task.

  • study_dir – The root directory of the study containing the completed task.

Returns:

A list of strings representing the steps that should be started since task has completed. If the task is to be restarted, the list will only contain a single string: the workflow step ID of the previous task.

autojob.next.restart module

Restart a completed task.

Examples

Programmatically,

from pathlib import Path

from autojob.next.restart import restart

restart(src=Path.cwd())

From the command-line,

autojob restart

autojob.next.restart.restart(src: str | Path | None = None, *, calc_mods: dict[str, Any] | None = None, sched_mods: dict[str, Any] | None = None, file_size_limit: float = 100000000.0, submit: bool = True, auto_restart: bool = False, files_to_carry_over: Iterable[str] | None = None) tuple[TaskBase, Path][source]

Restart a completed task.

Parameters:
  • src – The directory of the completed task. Defaults to the current working directory.

  • calc_mods – A dictionary mapping calculator parameters to values that should be used to overwrite the existing parameters.

  • sched_mods – A dictionary mapping Slurm options to values that should be used to overwrite the existing parameters.

  • file_size_limit – A float specifying the file size in bytes above which files will be deleted. Defaults to FILE_SIZE_LIMIT.

  • submit – Whether or not to submit the new job after creation. Defaults to True.

  • auto_restart – Whether or not to add logic to automatically restart the calculation after the calculation has converged.

  • files_to_carry_over – A list of strings indicating which files to carry over from the old job directory to the new job directory. Defaults to None, in which case, the files to copy are determined from the previous task.

Returns:

A 2-tuple (task, path) where task is the newly created task and path is the Path representing the directory containing it.

Warning

When specifying sched_mods, be wary of setting mutually exclusive scheduler parameters (e.g., mem and mem_per_cpu, or cores and cores_per_node). For example, if the mem parameter is already set and you want to set mem_per_cpu instead, set the mem key to Unset in sched_mods in addition to setting the mem_per_cpu key.
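The pattern in this warning can be illustrated with a stand-in sentinel. In the sketch below, UNSET is a hypothetical placeholder for autojob's Unset marker, and apply_mods_sketch is an illustrative merge routine. How autojob actually applies sched_mods is not shown in this reference, so treat the merge logic as an assumption.

```python
from typing import Any

# Hypothetical stand-in for autojob's Unset marker.
UNSET = object()


def apply_mods_sketch(
    params: dict[str, Any], mods: dict[str, Any]
) -> dict[str, Any]:
    """Overlay mods on params, dropping any key whose new value is UNSET."""
    merged = dict(params)
    for key, value in mods.items():
        if value is UNSET:
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


# Switch from a total-memory request to a per-CPU request without
# leaving both mutually exclusive options set.
existing = {"mem": "8GB", "cores": 4}
sched_mods = {"mem": UNSET, "mem_per_cpu": "2GB"}
merged = apply_mods_sketch(existing, sched_mods)
```

After the merge, mem is gone and only mem_per_cpu remains alongside cores.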