Serialization

This section discusses the two forms of “serialization” which apply to autojob TaskBases. By serialization, we mean

  1. the act of translating a TaskBase into a format that can be easily stored in files or memory buffers, or transmitted across a network (this is what is traditionally meant by serialization), as well as

  2. the act of storing a task in files itself.

This page describes how both functions are achieved within autojob.

Converting a TaskBase to JSON

Because all TaskBases are Pydantic BaseModels, the traditional function of serialization is achieved simply via the BaseModel.model_dump() method.

from autojob.tasks.task import Task

# Dump the task to a plain Python dictionary
task = Task()
dumped_task = task.model_dump()

To obtain JSON-serializable output, pass mode="json". The result can then be written directly to a file as usual.

import json
from pathlib import Path

from autojob.tasks.task import Task

task = Task()
# mode="json" converts all values to JSON-compatible types
dumped_task = task.model_dump(mode="json")

with Path("my_task.json").open(mode="w", encoding="utf-8") as file:
    json.dump(dumped_task, file, indent=4)

To exclude all None values, pass exclude_none=True.

from autojob.tasks.task import Task

task = Task()
# Fields whose value is None are omitted from the result
dumped_task = task.model_dump(exclude_none=True)
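
The difference between the three calls above can be seen on a small Pydantic model. The following is a minimal sketch, assuming Pydantic v2 is installed; DemoTask and its fields are hypothetical stand-ins and are not part of autojob.

```python
from pathlib import Path
from typing import Optional

from pydantic import BaseModel


class DemoTask(BaseModel):
    """Hypothetical stand-in for a TaskBase (not an autojob class)."""

    label: str = "demo"
    workdir: Path = Path("calc_1")
    notes: Optional[str] = None


demo = DemoTask()

# Default ("python") mode keeps rich Python objects such as Path
python_dump = demo.model_dump()

# mode="json" converts values to JSON-compatible types (Path -> str)
json_dump = demo.model_dump(mode="json")

# exclude_none=True drops every field whose value is None
compact_dump = demo.model_dump(exclude_none=True)
```

Here python_dump["workdir"] is still a Path, so passing python_dump to json.dump() would fail, whereas json_dump contains only strings and None and can be written to a file directly.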

Writing the Inputs of Tasks to a Directory

An instance of a concrete implementation of TaskBase can also be “serialized to a directory”. This process serializes a task by writing its inputs and metadata to files. The behaviour is defined in the InputWriter interface, which implementations of TaskBase must adhere to by defining a write_inputs() method. Good implementations of this method must enable identical data to be loaded from the files written with write_inputs(). For example, Task.write_inputs() writes a JSON file (named by SETTINGS.INPUTS_FILE) containing the task inputs, a trajectory containing the input Atoms object(s), a metadata JSON, and a templated task script that is meant to be used to execute the task.
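
The round-trip requirement (identical data must be loadable from the written files) can be illustrated with a small sketch. It assumes Pydantic v2; DemoInputs, write_demo_inputs(), load_demo_inputs(), and the file name demo_inputs.json are hypothetical and only mimic the idea behind write_inputs(), not its actual implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

from pydantic import BaseModel


class DemoInputs(BaseModel):
    """Hypothetical stand-in for a task's inputs (not an autojob class)."""

    atoms_filename: str = "in.traj"
    task_script_template: Optional[str] = None


def write_demo_inputs(inputs: DemoInputs, dest: Path) -> Path:
    """Write the inputs to a JSON file inside dest and return its path."""
    path = dest / "demo_inputs.json"
    with path.open(mode="w", encoding="utf-8") as file:
        json.dump(inputs.model_dump(mode="json"), file, indent=4)
    return path


def load_demo_inputs(src: Path) -> DemoInputs:
    """Rebuild the inputs from the JSON file written by write_demo_inputs."""
    with (src / "demo_inputs.json").open(encoding="utf-8") as file:
        return DemoInputs.model_validate(json.load(file))


original = DemoInputs(task_script_template="my_template.sh.j2")
with tempfile.TemporaryDirectory() as tmp:
    write_demo_inputs(original, Path(tmp))
    reloaded = load_demo_inputs(Path(tmp))

# A well-behaved writer lets the loader reproduce the original data
round_trip_ok = reloaded == original
```

A check like round_trip_ok is a convenient unit test for any custom write_inputs() implementation.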

To dump a task into a directory of your choosing, create the directory (if it does not already exist) and pass it to write_inputs():

from pathlib import Path

from ase.build import molecule

from autojob.tasks.task import Task, TaskInputs, TaskMetadata

atoms = molecule("CO")
metadata = TaskMetadata()
tinputs = TaskInputs(atoms=atoms, atoms_filename="in.traj")
task = Task(task_metadata=metadata, task_inputs=tinputs)
task_dir = Path("my_task_dir")
# exist_ok=True avoids an error if the directory was created beforehand
task_dir.mkdir(parents=True, exist_ok=True)
task.write_inputs(task_dir)

Otherwise, it is recommended to create new task directories with autojob.next.create_task_tree(). This function can be used to create templated directories and copy files from the directory of a previous task.

from pathlib import Path

from ase.build import molecule

from autojob.next import create_task_tree
from autojob.tasks.task import Task, TaskInputs, TaskMetadata

atoms = molecule("CO")
metadata = TaskMetadata()
tinputs = TaskInputs(atoms=atoms, atoms_filename="in.traj")
task = Task(task_metadata=metadata, task_inputs=tinputs)
task_dir = Path()
src = Path("../calc_1")
to_copy = ["CHGCAR", "WAVECAR"]
template = "calc_{i}"
create_task_tree(
    task,
    task_dir,
    src=src,
    files_to_carryover=to_copy,
    name_template=template
)

Given a previously completed task in the directory ../calc_1, this snippet will create a new task directory as a sibling directory of the directory of the completed task and copy the CHGCAR and WAVECAR files (if present) into the new task directory.

A Note on Templates

In addition to a templated task script, Calculation.write_inputs() also writes a templated calculation script. Task and calculation script templates are defined by the TaskInputs.task_script_template and CalculationInputs.calculation_script_template attributes. When Task.write_inputs() and Calculation.write_inputs() are called, the strings stored in these attributes are used to retrieve Jinja templates from the directory defined by SETTINGS.TEMPLATE_DIR.

These attributes must name Jinja templates. The templates are converted into scripts by calling the Template.render() method with all Task (or Calculation) attributes as keyword arguments. The values of all autojob settings at the time of templating are also passed to Template.render() under the keyword argument settings.
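
The rendering step described above can be sketched with Jinja alone. This is an illustration under stated assumptions, not autojob's implementation: the template name demo.sh.j2, the label attribute, and the SCRATCH_DIR setting are all hypothetical, and a temporary directory stands in for SETTINGS.TEMPLATE_DIR.

```python
import tempfile
from pathlib import Path

from jinja2 import Environment, FileSystemLoader

# Hypothetical task attributes and settings used to fill the template
task_attributes = {"label": "relax_CO"}
demo_settings = {"SCRATCH_DIR": "/scratch/demo"}

with tempfile.TemporaryDirectory() as tmp:
    # The temporary directory plays the role of the template directory
    template_dir = Path(tmp)
    (template_dir / "demo.sh.j2").write_text(
        "#!/bin/bash\n# label: {{ label }}\ncd {{ settings.SCRATCH_DIR }}\n",
        encoding="utf-8",
    )

    # Look the template up by name, as with a task_script_template string
    env = Environment(loader=FileSystemLoader(str(template_dir)))
    template = env.get_template("demo.sh.j2")

    # Attributes go in as keyword arguments; settings get their own keyword
    script = template.render(**task_attributes, settings=demo_settings)
```

Inside a template, task attributes are therefore available as plain variables (here, label), while settings are reached through the settings variable (here, settings.SCRATCH_DIR).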

The following snippet can be used to preview the result of a task script template:

from pathlib import Path

from autojob import SETTINGS
from autojob.tasks.task import Task, TaskInputs

# Point autojob at the directory containing the Jinja templates
SETTINGS.TEMPLATE_DIR = Path("/path/to/templates")
tinputs = TaskInputs(task_script_template="my_template.sh.j2")
task = Task(task_inputs=tinputs)
# Write the rendered task script into the current directory
dest = Path()
task.write_task_script(dest)

See also

For creating tasks within the workflow infrastructure, opt for the autojob.next.initialize_task() function.

autojob.tasks.calculation.Calculation.write_inputs()