Serialization¶
This section discusses the two forms of “serialization” that apply to autojob
TaskBases. By serialization, we mean both the act of translating a TaskBase into
a format that can be easily stored in files or memory buffers, or transmitted
across a network (this is what is traditionally meant by serialization), and the
act of storing a task in files itself.
This page describes how both functions are achieved within autojob.
Converting a TaskBase to JSON¶
Because all TaskBases are Pydantic
BaseModels, the traditional function of serialization is achieved simply via the
BaseModel.model_dump() method.
from autojob.tasks.task import Task
task = Task()
# Returns a plain Python dict of the task's fields
dumped_task = task.model_dump()
To obtain JSON-serializable output, pass mode="json". The result can then
be dumped directly to a file as usual.
import json
from pathlib import Path
from autojob.tasks.task import Task
task = Task()
# mode="json" converts every value to a JSON-compatible type
dumped_task = task.model_dump(mode="json")
with Path("my_task.json").open(mode="w", encoding="utf-8") as file:
    json.dump(dumped_task, file, indent=4)
To exclude all None values, pass exclude_none=True.
from autojob.tasks.task import Task
task = Task()
# Fields whose value is None are omitted from the result
dumped_task = task.model_dump(exclude_none=True)
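The three calls above can be compared side by side on a small stand-in model. This is only a sketch: ToyTask and its fields are hypothetical and are not part of autojob; the model_dump() arguments are the Pydantic ones described above.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional

from pydantic import BaseModel


class ToyTask(BaseModel):
    """Hypothetical stand-in for a TaskBase; not an autojob class."""

    label: str
    workdir: Path
    created: date
    notes: Optional[str] = None


toy = ToyTask(label="relax", workdir=Path("calc_1"), created=date(2024, 1, 2))

# Default mode keeps rich Python objects (Path, date) in the dict.
python_dump = toy.model_dump()

# mode="json" converts values to JSON-compatible types (str, int, None, ...).
json_dump = toy.model_dump(mode="json")

# exclude_none=True drops the unset "notes" field entirely.
compact_dump = toy.model_dump(mode="json", exclude_none=True)

# Only the JSON-mode dumps are guaranteed to pass through json.dumps().
json_text = json.dumps(json_dump, indent=4)
```

Passing python_dump to json.dumps() would fail here because Path and date objects are not JSON-serializable, which is why mode="json" is needed before writing to a file.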
Writing the Inputs of Tasks to a Directory¶
An instance of a concrete implementation of TaskBase can also be “serialized
to a directory”. This process serializes a task by writing its inputs and metadata to files.
The behaviour is defined in the InputWriter interface, which implementations of
TaskBase must adhere to by defining a write_inputs() method. A good implementation
of this method writes files from which identical data can be loaded back.
For example, Task.write_inputs() writes a JSON file (named by SETTINGS.INPUTS_FILE)
containing the task inputs, a trajectory containing the input Atoms object(s),
a metadata JSON, and a templated task script that is meant to be used to execute the task.
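The round-trip requirement can be checked with a small test. The sketch below uses only the standard library and a hypothetical ToyTask dataclass; the INPUTS_FILE constant stands in for SETTINGS.INPUTS_FILE, and from_directory() is an assumed name for the loading counterpart, not a documented autojob method.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

# Hypothetical stand-in for SETTINGS.INPUTS_FILE
INPUTS_FILE = "inputs.json"


@dataclass
class ToyTask:
    """Hypothetical task with a write_inputs() method; not an autojob class."""

    label: str
    n_cores: int

    def write_inputs(self, dest: Path) -> None:
        # Write the task inputs as JSON into the destination directory.
        with (dest / INPUTS_FILE).open(mode="w", encoding="utf-8") as file:
            json.dump(asdict(self), file, indent=4)

    @classmethod
    def from_directory(cls, src: Path) -> "ToyTask":
        # Assumed loader: rebuild the task from the file written above.
        with (src / INPUTS_FILE).open(encoding="utf-8") as file:
            return cls(**json.load(file))


original = ToyTask(label="relax", n_cores=4)
with tempfile.TemporaryDirectory() as tmp:
    original.write_inputs(Path(tmp))
    restored = ToyTask.from_directory(Path(tmp))

# A well-behaved write_inputs() lets the task be restored unchanged.
round_trip_ok = restored == original
```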
To dump a task into a directory that you create yourself, pass that directory to write_inputs():
from pathlib import Path
from ase.build import molecule
from autojob.tasks.task import Task, TaskInputs, TaskMetadata
atoms = molecule("CO")
metadata = TaskMetadata()
tinputs = TaskInputs(atoms=atoms, atoms_filename="in.traj")
task = Task(task_metadata=metadata, task_inputs=tinputs)
task_dir = Path("my_task_dir")
task_dir.mkdir()
task.write_inputs(task_dir)
Otherwise, it is recommended to create new task directories with
autojob.next.create_task_tree(). This function can be used to create templated
directories and copy files from the directory of a previous task.
from pathlib import Path
from ase.build import molecule
from autojob.next import create_task_tree
from autojob.tasks.task import Task, TaskInputs, TaskMetadata
atoms = molecule("CO")
metadata = TaskMetadata()
tinputs = TaskInputs(atoms=atoms, atoms_filename="in.traj")
task = Task(task_metadata=metadata, task_inputs=tinputs)
task_dir = Path()
src = Path("../calc_1")
to_copy = ["CHGCAR", "WAVECAR"]
template = "calc_{i}"
create_task_tree(
    task,
    task_dir,
    src=src,
    files_to_carryover=to_copy,
    name_template=template,
)
Given a previously completed task in the directory ../calc_1, this snippet
creates a new task directory alongside the directory of the completed task
and copies the CHGCAR and WAVECAR files (if present) into the new task
directory.
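The effect described above (a new sibling directory plus a copy of whichever carry-over files exist) can be illustrated with the standard library alone. This is not autojob's implementation: the helper name make_sibling_with_carryover() is hypothetical, and picking the first unused index for the name template is an assumption made for the sketch.

```python
import shutil
import tempfile
from pathlib import Path


def make_sibling_with_carryover(src, name_template, files_to_carryover):
    """Hypothetical helper: create a sibling of src and copy existing files."""
    parent = src.parent
    # Assumption: use the first index whose directory name is still free.
    i = 1
    while (parent / name_template.format(i=i)).exists():
        i += 1
    new_dir = parent / name_template.format(i=i)
    new_dir.mkdir()

    copied = []
    for name in files_to_carryover:
        candidate = src / name
        # Missing files are skipped silently ("if present").
        if candidate.is_file():
            shutil.copy2(candidate, new_dir / name)
            copied.append(name)
    return new_dir, copied


with tempfile.TemporaryDirectory() as tmp:
    previous = Path(tmp) / "calc_1"
    previous.mkdir()
    # Only CHGCAR exists in the previous task; WAVECAR is absent.
    (previous / "CHGCAR").write_text("placeholder", encoding="utf-8")

    new_dir, copied = make_sibling_with_carryover(
        previous, "calc_{i}", ["CHGCAR", "WAVECAR"]
    )
    new_dir_name = new_dir.name
    chgcar_carried = (new_dir / "CHGCAR").is_file()
    wavecar_carried = (new_dir / "WAVECAR").exists()
```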
A Note on Templates¶
In addition to a templated task script, Calculation.write_inputs() also writes a
templated calculation script. Task and calculation script templates are defined by the
TaskInputs.task_script_template and
CalculationInputs.calculation_script_template attributes. When Task.write_inputs()
and Calculation.write_inputs() are called, the strings stored in these attributes
are used to retrieve Jinja templates from the directory defined by
SETTINGS.TEMPLATE_DIR, so these attributes must point to Jinja templates.
The templates are converted into scripts by calling the
Template.render() method with all Task (or Calculation)
attributes as keyword arguments. The values of all autojob settings at the time of
templating are also passed to Template.render() under the keyword argument
settings.
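The rendering step can be sketched with Jinja directly. In the sketch below, the template text, the label attribute, and the SCRATCH_DIR setting are all hypothetical, and a DictLoader stands in for loading templates from SETTINGS.TEMPLATE_DIR; only the pattern of passing attributes as keyword arguments plus a settings keyword follows the description above.

```python
from jinja2 import DictLoader, Environment

# Hypothetical template; in autojob it would live in SETTINGS.TEMPLATE_DIR.
templates = {
    "my_template.sh.j2": (
        "#!/bin/bash\n"
        "# task: {{ label }}\n"
        "cd {{ settings.SCRATCH_DIR }}\n"
    )
}
env = Environment(loader=DictLoader(templates))

# Stand-ins for the task attributes and the autojob settings (both assumed).
task_attributes = {"label": "relax"}
settings = {"SCRATCH_DIR": "/scratch"}

# The template name is looked up by string, then rendered with the
# attributes as keyword arguments and the settings under "settings".
template = env.get_template("my_template.sh.j2")
script = template.render(**task_attributes, settings=settings)
```

Inside the template, task attributes are referenced by name ({{ label }}), while settings are reached through the settings variable ({{ settings.SCRATCH_DIR }}).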
The following snippet can be used to preview the result of a task script template:
from pathlib import Path
from autojob import SETTINGS
from autojob.tasks.task import Task, TaskInputs
SETTINGS.TEMPLATE_DIR = Path("/path/to/templates")
tinputs = TaskInputs(task_script_template="my_template.sh.j2")
task = Task(task_inputs=tinputs)
dest = Path()
task.write_task_script(dest)
See also
For creating tasks within the workflow infrastructure, opt for the
autojob.next.initialize_task() function.