How autojob structures directories
Because of autojob's commitment to file-based persistence, it imposes certain
requirements on the directory structures that it works with. autojob can support
structured and unstructured directories. The type of directory that autojob
finds changes how tasks are harvested and whether workflows are run.
Structured Directories
The first and more complete directory structure is outlined below:
study_group/
├─ study_group.json
└─ study/
   ├─ study.json
   ├─ parameterizations.json
   ├─ record.txt
   ├─ workflow.json
   └─ task_group/
      ├─ task_group.json
      └─ task/
         ├─ task.json
         └─ inputs.json
Study groups are confined to a directory. As you can see, this directory
structure mirrors the data hierarchy intrinsic
to autojob. Structured directories are created by
StudyGroup.to_directory() and Study.to_directory() and populated
with functions such as create_task_group_tree(), create_task_tree(),
and Task.write_inputs(). The purpose of each of the files above is outlined
below:
study_group.json
    A JSON dictionary containing study group metadata such as the study group ID, the study IDs of its constituent studies, and the name of the study group. The file is a JSON-serialized version of an instance of StudyGroup, whose creation can be replicated with:

        import json
        from pathlib import Path

        from autojob import SETTINGS
        from autojob.study_group import StudyGroup

        sg = StudyGroup()
        metadata_file = SETTINGS.STUDY_GROUP_METADATA_FILE
        # mode="json" converts values that are not JSON-native (e.g., UUIDs) before dumping
        with Path(metadata_file).open(mode="w", encoding="utf-8") as f:
            json.dump(sg.model_dump(mode="json"), f)
study.json
    A JSON dictionary containing study metadata such as the study group ID, the study ID, the task group IDs of its constituent task groups, and the name of the study. The file is a JSON-serialized version of an instance of Study, whose creation can be replicated with:

        import json
        from pathlib import Path

        from autojob import SETTINGS
        from autojob.study import Study

        study = Study()
        metadata_file = SETTINGS.STUDY_METADATA_FILE
        # mode="json" converts values that are not JSON-native (e.g., UUIDs) before dumping
        with Path(metadata_file).open(mode="w", encoding="utf-8") as f:
            json.dump(study.model_dump(mode="json"), f)
parameterizations.json
    A JSON dictionary mapping a workflow step ID to a Step (not implemented).
record.txt
    A text file in which each line lists the task ID of a completed task (not implemented).
workflow.json
    A JSON dictionary mapping a workflow step ID to a list of workflow step IDs; a directed acyclic graph representing the study's workflow (not implemented).
task_group.json
    A JSON dictionary containing task group metadata such as the study ID, the task group ID, and the name of the task group. The file is a JSON-serialized version of an instance of TaskMetadataBase with only those fields in task_base.TASK_GROUP_FIELDS present. In addition, the dictionary contains a "tasks" key that lists the tasks that are part of the task group. Creation of this file can be replicated with:

        import json
        from pathlib import Path

        from autojob import SETTINGS
        from autojob.bases.task_base import TASK_GROUP_FIELDS
        from autojob.tasks.task import TaskMetadata

        metadata = TaskMetadata()
        metadata_file = SETTINGS.TASK_GROUP_METADATA_FILE
        # include expects a set (or dict) of field names, so coerce the constant
        fields = set(TASK_GROUP_FIELDS)
        with Path(metadata_file).open(mode="w", encoding="utf-8") as f:
            json.dump(metadata.model_dump(mode="json", include=fields), f)
task.json
    A JSON dictionary containing task metadata such as the study group ID, the study ID, the task group ID, the task ID, and the name of the task. The file is a JSON-serialized version of an instance of TaskMetadataBase that is written when Task.write_inputs() is called. This file can be created with:

        import json
        from pathlib import Path

        from autojob import SETTINGS
        from autojob.tasks.task import TaskMetadata

        metadata = TaskMetadata()
        # TASK_METADATA_FILE is assumed to be the setting that names task.json
        metadata_file = SETTINGS.TASK_METADATA_FILE
        with Path(metadata_file).open(mode="w", encoding="utf-8") as f:
            json.dump(metadata.model_dump(mode="json"), f)

    This file is read when loading tasks from a directory. It is used to determine the type of task to load and to construct the TaskMetadataBase for the task.
inputs.json
    A JSON dictionary containing the task inputs. Exactly which keys appear in this file may differ depending on the type of task the inputs were created for. This file is written when Task.write_inputs() is called and can be created with:

        import json
        from pathlib import Path

        from autojob import SETTINGS
        from autojob.tasks.task import TaskInputs

        inputs = TaskInputs()
        inputs_file = SETTINGS.INPUTS_FILE
        # mode="json" converts values that are not JSON-native before dumping
        with Path(inputs_file).open(mode="w", encoding="utf-8") as f:
            json.dump(inputs.model_dump(mode="json"), f)

    This file is read when loading tasks from a directory. It is used to determine the inputs of a task and to construct the TaskInputsBase for the task.
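Because workflow.json and record.txt are marked as not implemented, the following is only a sketch of how files with the shapes described above could be consumed using the standard library. It assumes that each key in workflow.json maps to the step IDs that follow it (the edge direction is not stated here), and the helper names load_workflow, execution_order, and completed_ids are hypothetical rather than part of autojob.

```python
import json
from graphlib import TopologicalSorter
from pathlib import Path


def load_workflow(path: Path) -> dict[str, list[str]]:
    """Read a step ID -> list of step IDs mapping from a JSON file."""
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def execution_order(successors: dict[str, list[str]]) -> list[str]:
    """Order step IDs so that every step precedes the steps it points to.

    Assumption: each key maps to the steps that FOLLOW it.
    """
    sorter = TopologicalSorter()
    for step, next_steps in successors.items():
        sorter.add(step)  # register steps even if they have no edges
        for next_step in next_steps:
            sorter.add(next_step, step)  # next_step depends on step
    return list(sorter.static_order())


def completed_ids(record: Path) -> set[str]:
    """Collect the non-empty lines of a record file as completed task IDs."""
    if not record.exists():
        return set()
    lines = record.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}
```

If the mapping contains a cycle, static_order() raises graphlib.CycleError, which doubles as a cheap check that the file really describes a directed acyclic graph.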
Unstructured Directories
Alternatively, if the required files are not found, autojob can function
in an unstructured mode. In this mode, metadata is maintained only for tasks. If
task metadata files are missing, they are created. Tasks can still be harvested
and restarted, but running workflows is not supported in unstructured mode.
This mode can be useful for quick scratch work that still leverages metadata tracking,
data harvesting, or task infrastructure.
To support this mode of use, autojob provides the CLI tool autojob new that
can be used to quickly clone tasks from directories and create unstructured task
directories.
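The distinction between the two modes can be illustrated with a small standard-library sketch. It does not reproduce autojob's actual detection logic: the rule used here (a task directory counts as structured when it and its three ancestors each hold the metadata file for their level) is an assumption, as are the helper names is_structured and ensure_task_metadata and the placeholder keys written to task.json.

```python
import json
import uuid
from pathlib import Path

# Metadata filenames for each level, from the task directory upwards.
LEVEL_FILES = ("task.json", "task_group.json", "study.json", "study_group.json")


def is_structured(task_dir: Path) -> bool:
    """Return True if task_dir and its three ancestors hold their metadata files."""
    directory = task_dir.resolve()
    for filename in LEVEL_FILES:
        if not (directory / filename).is_file():
            return False
        directory = directory.parent
    return True


def ensure_task_metadata(task_dir: Path) -> Path:
    """Create a placeholder task.json if it is missing and return its path.

    The keys below are illustrative only, not autojob's metadata schema.
    """
    metadata_file = task_dir / "task.json"
    if not metadata_file.exists():
        placeholder = {"task_id": uuid.uuid4().hex, "name": task_dir.name}
        metadata_file.write_text(json.dumps(placeholder, indent=2), encoding="utf-8")
    return metadata_file
```

Calling ensure_task_metadata() a second time leaves an existing file untouched, which mirrors the idea that metadata is created only when it is missing.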
Other Common Files
Other common files that autojob creates are:
archive.json
    The default filename for saving autojob archives.
run.py
    The default filename for calculation scripts that are used by Calculation.
run.sh
    The default filename for task scripts that are used by Task.