autojob.utils package

Miscellaneous autojob utility functions.

autojob.utils.alphanum_key(val: str) tuple[str, bool | float | int | str | None][source]

Provides key to alphanumerically sort primitive types.

Parameters:

val – String representation of object for which to provide key.

Raises:

TypeError – The type of ‘val’ is invalid.

Returns:

A 2-tuple where the first element is a string indicating the type of the value (e.g., n = number, b = boolean, N = None, s = string) and the second element is the value.

Note

Numbers are converted to floats.

autojob.utils.alphanum_sort(vals: Iterable[str]) list[str][source]

Alphanumerically sorts an iterable of strings.

Parameters:

vals – an iterable to be sorted.

Returns:

Alphanumerically sorted copy of vals.
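
The tagging scheme described for alphanum_key can be illustrated with a small stand-alone sketch. The names sketch_alphanum_key and sketch_alphanum_sort are hypothetical, and the rules for recognising booleans and None are assumptions rather than autojob's actual implementation.

```python
from __future__ import annotations


def sketch_alphanum_key(val: str) -> tuple[str, bool | float | str | None]:
    """Hypothetical stand-in for alphanum_key; the parsing rules are assumed."""
    if not isinstance(val, str):
        raise TypeError(f"Expected a string, got {type(val).__name__}")
    if val == "None":
        return ("N", None)
    if val in ("True", "False"):
        return ("b", val == "True")
    try:
        # Numbers are converted to floats so that "2" sorts before "10".
        return ("n", float(val))
    except ValueError:
        return ("s", val)


def sketch_alphanum_sort(vals):
    """Return a sorted copy of vals, grouped by type code, then by value."""
    return sorted(vals, key=sketch_alphanum_key)


print(sketch_alphanum_sort(["10", "2", "b", "a", "None", "True"]))
# ['None', 'True', '2', '10', 'a', 'b']
```

In this sketch the groups are ordered by the type code itself (N, b, n, s), which is simply a consequence of comparing tuples; the real function may order the groups differently.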

autojob.utils.get_slurm_job_id(job_dir: Path) int[source]

Returns the SLURM job id for the job run in the directory “job_dir”.

Parameters:

job_dir – The directory containing the slurm output file.

Raises:

FileNotFoundError – SLURM output file not found.

Returns:

The SLURM job id.

autojob.utils.get_uri(dir_name: str | Path) str[source]

Return the URI path for a directory.

This allows files hosted on different file servers to have distinct locations.

Adapted from Atomate2.

Parameters:

dir_name – A directory name.

Returns:

Full URI path, e.g., “fileserver.host.com:/full/path/of/dir_name”.
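
A minimal sketch of how such a URI could be assembled with the standard library is shown here. The function name sketch_get_uri is hypothetical, and joining the host name and the absolute path with a colon is an assumption based on the Atomate2 convention.

```python
import socket
from pathlib import Path


def sketch_get_uri(dir_name):
    """Hypothetical stand-in for get_uri: '<hostname>:<absolute path>'."""
    full_path = Path(dir_name).absolute()
    # Assumption: the local host name is sufficient to identify the server.
    host_name = socket.gethostname()
    return f"{host_name}:{full_path}"
```

Because the host name is part of the result, two directories with the same path on different file servers produce different URIs.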

autojob.utils.iter_to_native(vals: Iterable[float | int | str | None]) Iterable[float | int | str | None][source]

Converts values within an Iterable to their native types.

Parameters:

vals – an iterable of values to convert.

Returns:

Iterable – A shallow copy of the converted iterable.

Example

>>> from autojob.utils import iter_to_native
>>> iter_to_native(["0.1", "None", "-1", "dog"])
[0.1, None, -1, 'dog']

autojob.utils.parse_job_error(slurm_file: Path) JobError | None[source]

Parse the reason for job termination from the slurm script.

Parameters:

slurm_file – A Path pointing to the slurm script.

Returns:

A JobError corresponding to the reason for job termination, otherwise None.

autojob.utils.parse_job_stats_file(stats_file: Path) dict[str, float | int | str][source]

Parse information from a job stats file into a dictionary.

Parameters:

stats_file – Path to jobstats.txt file.

Raises:

ValueError – Missing headers in job stats file or extra headers found.

Returns:

The parsed job stats dictionary.

Note that no validation/conversion is done to the field values. Conversion to valid (more useful) Python values can be performed using SchedulerOutputs.model_validate.

autojob.utils.reduce_sparse_vector(vector: Iterable[_T]) _T[source]

Returns the first value in the sparse vector.

Parameters:

vector – An iterable.

Raises:

ValueError – The vector is empty.

autojob.utils.val_to_native(val: float | int | str | None) bool | float | int | str | None[source]

Converts string representations to their native types.

Only floats, ints, or strings are supported.

Parameters:

val – a value to be converted.

Returns:

The value converted into a PRIMITIVE_TYPE.
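
The conversion shown in the iter_to_native example can be approximated as follows. This is a sketch under stated assumptions: the name sketch_val_to_native is hypothetical, and the order in which casts are attempted (None, booleans, int, float, then string) is a guess at reasonable behaviour, not the library's code.

```python
def sketch_val_to_native(val):
    """Hypothetical stand-in for val_to_native; cast order is assumed."""
    if not isinstance(val, str):
        return val  # already a native value
    if val == "None":
        return None
    if val in ("True", "False"):
        return val == "True"
    # Try the narrowest numeric type first so "-1" stays an int.
    for cast in (int, float):
        try:
            return cast(val)
        except ValueError:
            continue
    return val  # anything else is left as a string


print([sketch_val_to_native(v) for v in ["0.1", "None", "-1", "dog"]])
# [0.1, None, -1, 'dog']
```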

autojob.utils.vectorize_underscored_data(rows: list[str]) tuple[list[str], list[str]][source]

Turns rows of underscored data into columns.

An example of supported data is that which is returned by the SLURM command sacct:

Partition     MaxRSS   NNodes               Start
--------- ---------- -------- -------------------
     razi                   1 2022-07-29T09:48:15
           18049744K        1 2022-07-29T09:48:15
                   0        1 2022-07-29T09:48:15

Parameters:

rows – A list of strings read from a file containing the output from a Slurm job stats file or sacct.

Returns:

Vectorized job stats are returned as a tuple (headers, columns) where headers is a list of strings representing the headers used in the job stats file and columns is a list of lists of strings representing the remaining entries in the column. The header delimiters are excluded.
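
One way to turn such fixed-width rows into columns is to use the dashed delimiter row to locate each column. The sketch below assumes the first row holds the headers and the second row holds the dashes; the name sketch_vectorize is hypothetical and the approach is illustrative rather than a copy of the library's implementation.

```python
import re


def sketch_vectorize(rows):
    """Hypothetical stand-in: split sacct-style rows into (headers, columns)."""
    header_row, delimiter_row, *data_rows = rows
    # Each run of dashes marks the character span of one column.
    spans = [match.span() for match in re.finditer(r"-+", delimiter_row)]
    headers = [header_row[start:end].strip() for start, end in spans]
    # One list per column; blank cells become empty strings.
    columns = [
        [row[start:end].strip() for row in data_rows]
        for start, end in spans
    ]
    return headers, columns
```

Applied to the sacct output above, this would yield the headers Partition, MaxRSS, NNodes and Start, with the MaxRSS column containing an empty string, 18049744K and 0.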

Submodules

autojob.utils.cli module

Utilities for CLI functions.

class autojob.utils.cli.MemoryFloat[source]

Bases: ParamType

A float representing an amount of memory.

convert(value: str | float, param, ctx) float[source]

Convert a memory specification into bytes.

name: ClassVar[str] = 'memory float'

The descriptive name of this type.

autojob.utils.cli.configure_settings(config: dict[str, Any]) None[source]

Redefine autojob settings.

Parameters:

config – A dictionary mapping autojob settings names to their desired values.

autojob.utils.cli.construct_cli_call(allowed: list[str] | None = None) str[source]

Construct the original CLI call.

Parameters:

allowed – A list of strings indicating which parameters are to be considered to reconstruct the CLI call.

Returns:

A string representing the command-line call that would produce the present behaviour.

autojob.utils.cli.mods_to_dict(_: Any, param: str, value: Iterable[str]) dict[str, Any][source]

Convert an iterable of key-value pairs to a dictionary.

Parameters:
  • _ – The first argument is ignored but retained for click compatibility.

  • param – The name of the parameter being set (e.g., calc_mods or slurm_mods).

  • value – An iterable of key-value pairs, each given as a string of the form “key=value”. Note that only those values supported by val_to_native can be correctly parsed.

Returns:

A dictionary mapping calculator parameter names to their Python values.
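
The callback shape can be sketched without click itself. In the example below, sketch_mods_to_dict and the helper _to_native are hypothetical names, the conversion rules are assumptions, and raising ValueError for a malformed pair is an illustrative choice rather than documented behaviour.

```python
from __future__ import annotations

from collections.abc import Iterable
from typing import Any


def _to_native(text: str) -> Any:
    """Hypothetical helper: convert a string to None, int, float, or str."""
    if text == "None":
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def sketch_mods_to_dict(_: Any, param: str, value: Iterable[str]) -> dict[str, Any]:
    """Hypothetical stand-in for mods_to_dict with a click-style signature."""
    mods: dict[str, Any] = {}
    for pair in value:
        # Split on the first "=" only, so values may themselves contain "=".
        key, separator, raw = pair.partition("=")
        if not separator:
            raise ValueError(f"{param}: expected 'key=value', got {pair!r}")
        mods[key.strip()] = _to_native(raw.strip())
    return mods


print(sketch_mods_to_dict(None, "calc_mods", ["encut=450", "sigma=0.1", "algo=Fast"]))
# {'encut': 450, 'sigma': 0.1, 'algo': 'Fast'}
```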

autojob.utils.files module

Utilities for handling files and directories.

autojob.utils.files.check_job_status(job_id: int) str[source]

Determine the status of a SLURM job.

Parameters:

job_id – The Slurm job ID.

Returns:

A string indicating the job status.

autojob.utils.files.create_job_stats_file(slurm_job_id: int, job_dir: str | Path) Path[source]

Creates file containing statistics from completed Slurm job.

Parameters:
  • slurm_job_id – The Slurm job ID for the job.

  • job_dir – The job directory.

Raises:

RuntimeError – Unable to create job stats file.

Returns:

A pathlib.Path to the file containing the job statistics.

autojob.utils.files.extract_structure_name(python_script: TextIO) str[source]

Determine the structure filename from a Python script file.

The structure must appear in a call to ase.io.read as either:
  1. ase.io.read(structure_name)

  2. io.read(structure_name)

  3. read(structure_name)

The structure present in the first such occurrence will be returned.

Parameters:

python_script – A stream containing the contents of the Python script used to run the calculation.

Raises:

RuntimeError – No structure name found.

Returns:

A string representing the filename of the structure read in the Python script.
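
The lookup described above can be approximated with a regular expression. This sketch assumes the structure name is passed to read as a quoted string literal; the name sketch_extract_structure_name and the pattern are hypothetical and may not match how autojob actually inspects the script.

```python
import re
from typing import TextIO

# Matches ase.io.read("name"), io.read("name"), or read("name").
_READ_CALL = re.compile(r"""\b(?:ase\.io\.|io\.)?read\(\s*(["'])(.+?)\1""")


def sketch_extract_structure_name(python_script: TextIO) -> str:
    """Hypothetical stand-in: return the first literal passed to read()."""
    match = _READ_CALL.search(python_script.read())
    if match is None:
        raise RuntimeError("No structure name found.")
    return match.group(2)
```

Because the search runs over the whole script text from the top, the first matching call in source order determines the result.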

autojob.utils.files.find_calculation_dirs(path: Path | None = None) list[Path][source]

Find all calculation directories in the directory tree below “path”.

Note that if a path matches the specified pattern, its subdirectories are not searched.

Parameters:

path – Top level directory to be searched. Defaults to current working directory.

Returns:

A list of Paths to all calculation directories below path.

autojob.utils.files.find_finished_jobs(path: Path | None = None) list[Path][source]

Find the directories and subdirectories containing finished jobs.

These jobs may have terminated due to errors, but they are no longer running.

Parameters:

path – The directory in which to search. Defaults to None (in which case the current working directory is searched).

Returns:

A list of Paths pointing to directories containing jobs that have finished.

autojob.utils.files.find_job_dirs(path: Path | None = None) list[Path][source]

Find all job directories in the directory tree below “path”.

Note that if a path matches the specified pattern, its subdirectories are not searched.

Parameters:

path – Top level directory to be searched. Defaults to current working directory.

Returns:

A list of all job directories below path.

autojob.utils.files.find_last_submitted_jobs(path: Path | None = None, ignore_unrun_jobs: bool = False) list[Path][source]

Returns the directories of the most recently submitted jobs.

Only the directories in each calculation specified in “path” or subdirectories of “path” are returned.

Parameters:
  • path – The directory specifying or containing calculations. Defaults to current working directory.

  • ignore_unrun_jobs – If true, no job will be reported for calculation directories containing jobs that have not yet been run. Otherwise, the most recently submitted job will be reported. Defaults to False.

Returns:

A list of Paths to directories containing newest jobs for each calculation in path or subdirectories of path.

autojob.utils.files.find_slurm_file(dir_name: Path) Path[source]

Retrieves the path to the first slurm output file found.

Parameters:

dir_name – The directory in which to search.

Returns:

The path to the slurm output file. If multiple slurm output files exist, the one corresponding to the job with the highest slurm job ID will be returned.

Raises:

FileNotFoundError – No valid slurm file found.
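
Selecting the output file with the highest job ID might look like the sketch below. It assumes output files use Slurm's default naming, slurm-<job id>.out; the function name sketch_find_slurm_file is hypothetical, and autojob may recognise other file names.

```python
import re
from pathlib import Path

# Assumption: Slurm's default output file name, "slurm-<job id>.out".
_SLURM_OUTPUT = re.compile(r"slurm-(\d+)\.out")


def sketch_find_slurm_file(dir_name: Path) -> Path:
    """Hypothetical stand-in: pick the output file with the highest job ID."""
    candidates = {}
    for path in Path(dir_name).iterdir():
        match = _SLURM_OUTPUT.fullmatch(path.name)
        if match:
            candidates[int(match.group(1))] = path
    if not candidates:
        raise FileNotFoundError(f"No valid slurm file found in {dir_name}")
    # Compare IDs numerically so that 123 ranks above 9.
    return candidates[max(candidates)]
```

Under the same naming assumption, the job ID itself can be read back from the selected file name.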

autojob.utils.files.find_study_dirs(path: Path | None = None) list[Path][source]

Find all study directories in the directory tree below “path”.

Note that if a path matches the specified pattern, its subdirectories are not searched.

Parameters:

path – Top level directory to be searched. Defaults to current working directory.

Returns:

A list of Paths to all study directories below path.

autojob.utils.files.find_study_group_dirs(path: Path | None = None) list[Path][source]

Find all study group directories in the directory tree below “path”.

Note that if a path matches the specified pattern, its subdirectories are not searched.

Parameters:

path – Top level directory to be searched. Defaults to current working directory.

Returns:

List[pathlib.Path] – All study group directories below “path”.

autojob.utils.files.get_loader() BaseLoader[source]

Return the Jinja template loader.

autojob.utils.files.get_slurm_job_id(job_dir: Path) int[source]

Returns the SLURM job id for the job run in the directory “job_dir”.

Parameters:

job_dir – The directory containing the slurm output file.

Raises:

FileNotFoundError – SLURM output file not found.

Returns:

The SLURM job id.

autojob.utils.files.get_uri(dir_name: str | Path) str[source]

Return the URI path for a directory.

This allows files hosted on different file servers to have distinct locations.

Adapted from Atomate2.

Parameters:

dir_name – A directory name.

Returns:

Full URI path, e.g., “fileserver.host.com:/full/path/of/dir_name”.

autojob.utils.parsing module

Utilities for parsing data.

class autojob.utils.parsing.TimedeltaTuple(days: int = 0, hours: int = 0, minutes: int = 0, seconds: int = 0)[source]

Bases: NamedTuple

Convenience wrapper around a timedelta object.

Create new instance of TimedeltaTuple(days, hours, minutes, seconds)

days: int

Alias for field number 0

static format_time(time_denomination: int) str[source]

Format time into a 0-padded integer.

classmethod from_slurm_time(time: str) TimedeltaTuple[source]

Parses a valid slurm time value into a TimedeltaTuple.

The six formats accepted by Slurm are:

1: minutes

2: minutes:seconds

3: hours:minutes:seconds

4: days-hours

5: days-hours:minutes

6: days-hours:minutes:seconds

Parameters:

time – The string containing the value of the --time slurm option.

Raises:

ValueError – The string is not a valid value of the slurm --time option. See https://slurm.schedmd.com/sbatch.html for details.

Returns:

A TimedeltaTuple.
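
The six formats can be distinguished by whether a dash is present and by how many colon-separated fields follow. The sketch below is illustrative only: SketchTimedeltaTuple and parse_slurm_time are hypothetical names, and no normalisation (for example, turning 90 minutes into 1 hour 30 minutes) is attempted.

```python
from typing import NamedTuple


class SketchTimedeltaTuple(NamedTuple):
    """Hypothetical stand-in mirroring the fields of TimedeltaTuple."""

    days: int = 0
    hours: int = 0
    minutes: int = 0
    seconds: int = 0


def parse_slurm_time(time: str) -> SketchTimedeltaTuple:
    """Parse one of the six Slurm time formats (a sketch, not autojob's code)."""
    days_text, dash, clock = time.rpartition("-")
    try:
        days = int(days_text) if dash else 0
        fields = [int(field) for field in clock.split(":")]
    except ValueError:
        raise ValueError(f"Invalid Slurm time: {time!r}") from None
    if dash and 1 <= len(fields) <= 3:
        # days-hours, days-hours:minutes, days-hours:minutes:seconds
        hours, minutes, seconds = (fields + [0, 0])[:3]
    elif not dash and len(fields) == 3:
        # hours:minutes:seconds
        hours, minutes, seconds = fields
    elif not dash and len(fields) in (1, 2):
        # minutes, minutes:seconds
        hours = 0
        minutes, seconds = (fields + [0])[:2]
    else:
        raise ValueError(f"Invalid Slurm time: {time!r}")
    return SketchTimedeltaTuple(days, hours, minutes, seconds)


print(parse_slurm_time("2-12:30"))
# SketchTimedeltaTuple(days=2, hours=12, minutes=30, seconds=0)
```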

classmethod from_string(string: str, time_format: Literal['iso', 'slurm'] = 'slurm') TimedeltaTuple[source]

Return a TimedeltaTuple from a string.

Parameters:
  • string – the time string to parse.

  • time_format – One of “iso” or “slurm”. Determines how the time string is parsed.

Returns:

A TimedeltaTuple.

classmethod from_timedelta(delta: timedelta) TimedeltaTuple[source]

Break a timedelta instance into days, hours, minutes, and seconds.

Parameters:

delta – a timedelta instance.

Returns:

A 4-tuple of ints – days, hours, minutes, seconds

hours: int

Alias for field number 1

minutes: int

Alias for field number 2

seconds: int

Alias for field number 3

to_slurm_time() str[source]

Convert TimedeltaTuple into a SLURM-compatible time format.

to_timedelta() timedelta[source]

Convert a TimedeltaTuple to a timedelta instance.
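
Breaking a timedelta into its components and formatting the result for Slurm can be sketched with divmod. The function names are hypothetical, the days-hours:minutes:seconds output with two-digit zero padding is an assumption about what to_slurm_time produces, and the sketch assumes a non-negative duration and ignores microseconds.

```python
from datetime import timedelta


def split_timedelta(delta: timedelta) -> tuple[int, int, int, int]:
    """Break a non-negative timedelta into (days, hours, minutes, seconds)."""
    # timedelta stores whole days separately from the seconds within a day.
    hours, remainder = divmod(delta.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return delta.days, hours, minutes, seconds


def format_slurm_time(days: int, hours: int, minutes: int, seconds: int) -> str:
    """Format components as days-hours:minutes:seconds (assumed layout)."""
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


parts = split_timedelta(timedelta(hours=36, minutes=5))
print(parts)                      # (1, 12, 5, 0)
print(format_slurm_time(*parts))  # 1-12:05:00
```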

autojob.utils.parsing.extract_keyword_arguments(keywords: list[keyword], code: Module) dict[str, Any][source]

Extract the runtime values of the keyword arguments to a function.

Parameters:
  • keywords – A list of keyword objects.

  • code – The module object of which the keywords are descendants.

Returns:

A dictionary mapping the keywords to the runtime values.

autojob.utils.parsing.import_class(class_string: Annotated[_T, ImportString]) _T[source]

Import a class using its fully qualified name.

Parameters:

class_string – The fully qualified name of the class. For example, autojob.hpc.SchedulerInputs.

Returns:

The class.

autojob.utils.parsing.parse_job_error(slurm_file: Path) JobError | None[source]

Parse the reason for job termination from the slurm script.

Parameters:

slurm_file – A Path pointing to the slurm script.

Returns:

A JobError corresponding to the reason for job termination, otherwise None.

autojob.utils.parsing.parse_job_stats_file(stats_file: Path) dict[str, float | int | str][source]

Parse information from a job stats file into a dictionary.

Parameters:

stats_file – Path to jobstats.txt file.

Raises:

ValueError – Missing headers in job stats file or extra headers found.

Returns:

The parsed job stats dictionary.

Note that no validation/conversion is done to the field values. Conversion to valid (more useful) Python values can be performed using SchedulerOutputs.model_validate.

autojob.utils.parsing.reduce_sparse_vector(vector: Iterable[_T]) _T[source]

Returns the first value in the sparse vector.

Parameters:

vector – An iterable.

Raises:

ValueError – The vector is empty.

autojob.utils.parsing.vectorize_underscored_data(rows: list[str]) tuple[list[str], list[str]][source]

Turns rows of underscored data into columns.

An example of supported data is that which is returned by the SLURM command sacct:

Partition     MaxRSS   NNodes               Start
--------- ---------- -------- -------------------
     razi                   1 2022-07-29T09:48:15
           18049744K        1 2022-07-29T09:48:15
                   0        1 2022-07-29T09:48:15

Parameters:

rows – A list of strings read from a file containing the output from a Slurm job stats file or sacct.

Returns:

Vectorized job stats are returned as a tuple (headers, columns) where headers is a list of strings representing the headers used in the job stats file and columns is a list of lists of strings representing the remaining entries in the column. The header delimiters are excluded.

autojob.utils.schemas module

Pydantic schema utilities.

class autojob.utils.schemas.AtomsAnnotation[source]

Bases: BaseModel

The Pydantic-compatible annotation for an ase.atoms.Atoms object.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

autojob.utils.schemas.atoms_as_dict(s: Atoms) dict[source]

Represent an ase.atoms.Atoms object as a dictionary.

autojob.utils.schemas.atoms_from_dict(d: dict) Atoms[source]

Instantiate an ase.atoms.Atoms object from a dictionary.

autojob.utils.schemas.hyphenate(v: str) str[source]

Replace underscores with hyphens.

autojob.utils.schemas.space_capitalize(v: str) str[source]

Replace underscores with spaces and capitalize each word.
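
Both transformations are one-liners in plain Python. The sketch below uses hypothetical names, and the use of str.capitalize (which also lowercases the rest of each word) is an assumption about how "capitalize each word" is implemented.

```python
def sketch_hyphenate(v: str) -> str:
    """Replace underscores with hyphens."""
    return v.replace("_", "-")


def sketch_space_capitalize(v: str) -> str:
    """Replace underscores with spaces and capitalize each word (assumed rule)."""
    return " ".join(word.capitalize() for word in v.split("_"))


print(sketch_hyphenate("mem_per_cpu"))      # mem-per-cpu
print(sketch_space_capitalize("job_name"))  # Job Name
```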

autojob.utils.templates module

Utilities for templating strings.

autojob.utils.templates.substitute_placeholders(templated_value: str, /, **kwargs) str[source]

Substitute values for placeholders.

Parameters:
  • templated_value – The templated value.

  • **kwargs – Each keyword should be a valid Python identifier and the corresponding value is its replacement. Hyphens in placeholder names are replaced with underscores before they are matched against the keywords.

Returns:

The original string with placeholders substituted.

Example:

>>> from autojob.utils.templates import substitute_placeholders
>>> substitute_placeholders(
...     "This is the job id: %{job-id}",
...     job_id="j123456789",
... )
'This is the job id: j123456789'
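
The substitution can be sketched with a regular expression over %{name} placeholders. The function name sketch_substitute is hypothetical, and leaving unknown placeholders untouched is an assumption; the library may raise an error instead.

```python
import re

# Matches placeholders of the form %{name}.
_PLACEHOLDER = re.compile(r"%\{([^{}]+)\}")


def sketch_substitute(templated_value: str, /, **kwargs) -> str:
    """Hypothetical stand-in for substitute_placeholders."""

    def replace(match: re.Match) -> str:
        # Placeholder names may contain hyphens; keywords cannot.
        key = match.group(1).replace("-", "_")
        if key in kwargs:
            return str(kwargs[key])
        return match.group(0)  # assumption: unknown placeholders are kept

    return _PLACEHOLDER.sub(replace, templated_value)


print(sketch_substitute("This is the job id: %{job-id}", job_id="j123456789"))
# This is the job id: j123456789
```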