How atomic compute jobs are represented and organized
The Task class that autojob defines represents an
atomic compute job. A task has inputs, outputs, and metadata, all of which are
stored as Task attributes. In addition, autojob
defines two important subclasses, Action (WIP) and
Calculation. An Action is a
procedure that does not require heavy computational resources. Examples of
procedures that may be represented as actions are defected structure generation,
adsorbate placement, or slab creation. By contrast, a Calculation does
require heavier computational resources and may include submission to a job
scheduler. As the name suggests, examples of Calculation instances include
various calculations (e.g., DFT, CCSD, QM/MM).
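The relationship between these classes can be pictured with a small stand-in hierarchy. The sketch below is illustrative only: `SketchTask`, `SketchAction`, `SketchCalculation`, and their fields are hypothetical names, not autojob's actual implementation. They only mirror the description above, in which a task carries inputs, outputs, and metadata, and a calculation adds calculation-specific inputs and outputs.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchTask:
    """Hypothetical stand-in for an atomic compute job (not autojob's Task)."""

    task_inputs: dict[str, Any] = field(default_factory=dict)
    task_outputs: dict[str, Any] = field(default_factory=dict)
    task_metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class SketchAction(SketchTask):
    """A lightweight step that runs locally, e.g. slab creation."""


@dataclass
class SketchCalculation(SketchTask):
    """A heavier step that also carries calculation inputs and outputs."""

    calculation_inputs: dict[str, Any] = field(default_factory=dict)
    calculation_outputs: dict[str, Any] = field(default_factory=dict)


action = SketchAction(task_metadata={"label": "slab creation"})
calc = SketchCalculation(
    task_metadata={"label": "relaxation"},
    calculation_inputs={"parameters": {"xc": "PBE"}},
)

# Both kinds of step can be handled uniformly as tasks.
steps = [action, calc]
labels = [step.task_metadata["label"] for step in steps]
```

Because both stand-ins inherit from the same base, code that only needs task-level attributes can treat actions and calculations alike.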
More facts about using tasks:
- In addition to task inputs/outputs and metadata, Calculation instances also have calculation inputs and outputs defined.
- Task inputs can be written to a directory with the Task.to_directory() method.
- Task results can be retrieved with the Task.from_directory() method.
- Studies can be constructed from several Task instances.
- Study groups can be constructed from several Study instances.
- You can write input directories for all tasks of a study group using the StudyGroup.to_directory() method.
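The write/read round trip and the nesting of tasks, studies, and study groups can be sketched with plain dataclasses. Everything here is an assumption made for illustration: the `Sketch*` classes, the `task.json` file name, and the `group/study/task` directory layout are hypothetical and may not match what autojob actually writes to disk.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class SketchTask:
    """Hypothetical stand-in; autojob's real file layout may differ."""

    task_id: str
    inputs: dict = field(default_factory=dict)

    def to_directory(self, dst: Path) -> Path:
        """Write this task's inputs into ``dst/<task_id>/task.json``."""
        task_dir = dst / self.task_id
        task_dir.mkdir(parents=True, exist_ok=True)
        (task_dir / "task.json").write_text(json.dumps(asdict(self)))
        return task_dir

    @classmethod
    def from_directory(cls, src: Path) -> "SketchTask":
        """Rebuild a task from a directory written by ``to_directory``."""
        return cls(**json.loads((src / "task.json").read_text()))


@dataclass
class SketchStudy:
    study_id: str
    tasks: list[SketchTask] = field(default_factory=list)

    def to_directory(self, dst: Path) -> Path:
        """Write every task of the study below ``dst/<study_id>``."""
        study_dir = dst / self.study_id
        for task in self.tasks:
            task.to_directory(study_dir)
        return study_dir


@dataclass
class SketchStudyGroup:
    study_group_id: str
    studies: list[SketchStudy] = field(default_factory=list)

    def to_directory(self, dst: Path) -> Path:
        """Write every study of the group below ``dst/<study_group_id>``."""
        group_dir = dst / self.study_group_id
        for study in self.studies:
            study.to_directory(group_dir)
        return group_dir


with tempfile.TemporaryDirectory() as tmp:
    task = SketchTask(task_id="j1", inputs={"xc": "PBE"})
    group = SketchStudyGroup("g1", [SketchStudy("s1", [task])])
    group_dir = group.to_directory(Path(tmp))
    # Read the task back from the directory that was just written.
    restored = SketchTask.from_directory(group_dir / "s1" / "j1")
```

A single call on the group writes every task directory, and any one task can be read back independently from its own directory.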
For example, you could retrieve the results of a calculation from its directory,
from pathlib import Path
from autojob.calculation.calculation import Calculation
# Path() is the current working directory
task = Calculation.from_directory(Path())
and then use the output structure from the task to generate a new set of tasks to submit.
from pathlib import Path
from shortuuid import uuid
from autojob.study import Study
from autojob.study_group import StudyGroup
functionals = ["PBEPBE", "PBE1PBE", "wB97xD", "B3LYP"]
calculations = []
study_group_id = "g" + uuid()[:9]
study_id = "s" + uuid()[:9]
# modify parameters and metadata
for functional in functionals:
    new_calc = task.copy()
    # copies output atoms to input atoms
    new_calc.prepare_inputs_atoms()
    new_calc.calculation_inputs.parameters["xc"] = functional
    new_calc.task_metadata.study_group_id = study_group_id
    new_calc.task_metadata.study_id = study_id
    new_calc.task_metadata.calculation_id = "c" + uuid()[:9]
    new_calc.task_metadata.task_id = "j" + uuid()[:9]
    # name the job after the structure and the new calculation ID
    atoms = new_calc.task_inputs.atoms
    assert atoms is not None  # noqa: S101
    structure = atoms.info["structure"]
    calculation_id = new_calc.task_metadata.calculation_id
    new_calc.scheduler_inputs.job_name = (
        f"{structure}-{calculation_id}"
    )
    calculations.append(new_calc)
# create study
study = Study(
    name="Study",
    tasks=calculations,
    study_id=study_id,
)
# create study group
study_group = StudyGroup(
    name="Study Group",
    studies=[study],
    study_group_id=study_group_id,
)
# Write input directories
study_group_dir = Path(study_group_id)
study_group_dir.mkdir()
study_group.to_directory(Path())
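The identifiers in the example above follow a simple pattern: a one-letter prefix ("g", "s", "c", or "j") followed by nine random characters. A minimal sketch of that pattern is shown below. The `make_id` helper is hypothetical, and it swaps in the standard library's `uuid4` (hexadecimal characters) for the shortuuid package used above, so the character set differs from the real identifiers.

```python
import uuid


def make_id(prefix: str, length: int = 9) -> str:
    """Return ``prefix`` followed by ``length`` random hex characters.

    Hypothetical helper: uses the standard library's uuid4 instead of the
    shortuuid package used in the example above.
    """
    return prefix + uuid.uuid4().hex[:length]


study_group_id = make_id("g")
study_id = make_id("s")
# One fresh calculation ID per functional in a four-functional sweep.
calculation_ids = [make_id("c") for _ in range(4)]
```

Generating a fresh identifier for each copied calculation keeps the members of a parameter sweep distinguishable even though they share a study ID and a study group ID.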
Useful Definitions
- Study
A study is a collection of workflows.
- Workflow
A workflow is a directed acyclic graph of actions and tasks.
- Action
An action is a locally run step in a workflow such as determining all non-equivalent adsorption sites on a metal surface or permuting a defect within a structure.
- Calculation
A calculation is an atomic compute job that may be submitted to a scheduler. Calculations often require parallelization and submission to a workload manager such as Slurm. Examples of calculations include single-point calculations, relaxation calculations, and ab initio molecular dynamics calculations.
- Task
A task is essentially the intersection of an action and a calculation: it captures what the two have in common (inputs, outputs, and metadata). A task can be thought of as a general step in a workflow.
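To make the workflow definition concrete, the sketch below orders the steps of a small directed acyclic graph so that every step runs after its prerequisites. The step names and the dependency mapping are hypothetical, and the ordering uses the standard library's `graphlib` rather than anything from autojob.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a step, each value is the set of steps
# that must finish before it can start.
dependencies = {
    "create_slab": set(),
    "place_adsorbate": {"create_slab"},
    "relax": {"place_adsorbate"},
    "single_point": {"relax"},
    "vibrations": {"relax"},
}

# static_order() yields every step after all of its prerequisites and raises
# graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(dependencies).static_order())
position = {step: index for index, step in enumerate(order)}
```

In this example the first two steps resemble actions (cheap, local), while the remaining three resemble calculations; the last two both depend only on the relaxation, so their relative order is not fixed.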