Task Creation

The abstract base class, TaskBase, defines these methods as abstract methods that must be implemented by its concrete implementations. TaskBase objects cannot be directly instantiated. The class merely specifies an interface for concrete implementations. Each concrete implementation of TaskBase represents a type of task, and each instance represents an atomic compute job.

autojob.tasks is a namespace package that contains several concrete implementations of TaskBase that can be used as is or subclassed and modified further. In the following example, Calculation is a concrete implementation of TaskBase made for storing the results of a calculation and calc represents an individual calculation.

from autojob.tasks.calculation import Calculation, CalculationInputs

calc = Calculation()

Note that Calculation is a subclass of the Task class. It is recommended that users implementing custom tasks subclass one of these two classes (or their subclasses).

Instances of either class can be customized at creation or afterwards. The simplest task (e.g., an instance of Task) has attributes representing task inputs, outputs, and metadata. This information is defined in the TaskMetadataBase, the TaskInputsBase, and the TaskOutputsBase classes, respectively.

The state of a task is represented by Pydantic model fields. This means that upon creation, all fields will be validated and converted to an expected type; missing fields will be instantiated with default values. For example, the task_metadata field of tasks expects an instance of TaskMetadataBase (which is actually an abstract base class with corresponding concrete implementation TaskMetadata). If you pass an incompatible type, instantiation of the task will fail.

from autojob.tasks.task import Task

task = Task(task_metadata=1)

However, if you pass a dictionary that can be converted into a valid TaskMetadataBase object, then validation will pass.

from autojob.tasks.task import Task

task = Task(
    task_metadata={
        "label": "my task",
        "study_group_id": "g123456789",
        "study_id": "s123456789",
        "task_group_id": "c123456789",
        "task_id": "j123456789",
    },
)

Of course, you can always pass an instance of the expected type.

from autojob.tasks.task import Task, TaskMetadata

task = Task(
    task_metadata=TaskMetadata(
    label="my task",
    study_group_id="g123456789",
    study_id="s123456789",
    task_group_id="c123456789",
    task_id="j123456789",
    ),
)

Note that validation is only done upon instantiation. Task fields are not validated during assignment operations. Therefore, the recommended way to modify tasks and their fields is to instantiate the desired object prior to assignment.

from autojob.tasks.task import Task, TaskMetadata

task = Task()
task.task_metadata = TaskMetadata(
    label="my task",
    study_group_id="g123456789",
    study_id="s123456789",
    task_group_id="c123456789",
    task_id="j123456789",
)

Users are encouraged to consult the TaskBase documentation for a description of the remaining fields of all tasks and the Calculation source code for a description of the additional fields of Calculation objects.

Note that the latter five implementations inherit from subclasses of Calculation.

Further, users can define their own implementations of TaskBase by inheriting from one of the abstract base classes or from one of the concrete implementations defined herein. Although passing custom implementations of tasks will work for all methods which accept TaskBase instances, in order for some of the magic of harvest to function correctly (automatic instantiating of task subclasses) without explicitly passing the custom class, users should register their tasks using the plug-in system (TODO: write how-to).