API Reference
The public API surface is small; most users interact via the CLI. Importers and exporters live under ``wf2wf.importers`` and ``wf2wf.exporters``.
.. py:module:: wf2wf.core
wf2wf.core – Intermediate Representation (IR) classes and helpers.
This module defines the canonical, engine-agnostic data structures that all importers must emit and all exporters must consume. Validation utilities and JSON/TOML (de)serialisers will be added in later iterations.
.. py:class:: BCOSpec(object_id: str | None = None, spec_version: str = 'https://w3id.org/ieee/ieee-2791-schema/2791object.json', etag: str | None = None, provenance_domain: ~typing.Dict[str, ~typing.Any] =
BioCompute Object specification for regulatory compliance.
.. py:attribute:: BCOSpec.description_domain :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: BCOSpec.error_domain :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: BCOSpec.etag :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: BCOSpec.execution_domain :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: BCOSpec.extension_domain :module: wf2wf.core :type: ~typing.List[~typing.Dict[str, ~typing.Any]]
.. py:attribute:: BCOSpec.io_domain :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: BCOSpec.object_id :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: BCOSpec.parametric_domain :module: wf2wf.core :type: ~typing.List[~typing.Dict[str, ~typing.Any]]
.. py:attribute:: BCOSpec.provenance_domain :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: BCOSpec.spec_version :module: wf2wf.core :type: str :value: 'https://w3id.org/ieee/ieee-2791-schema/2791object.json'
.. py:attribute:: BCOSpec.usability_domain :module: wf2wf.core :type: ~typing.List[str]
.. py:class:: DocumentationSpec(description: str | None = None, label: str | None = None, doc: str | None = None, intent: ~typing.List[str] =
Rich documentation for workflows and tasks.
.. py:attribute:: DocumentationSpec.description :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: DocumentationSpec.doc :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: DocumentationSpec.examples :module: wf2wf.core :type: ~typing.List[~typing.Dict[str, ~typing.Any]]
.. py:attribute:: DocumentationSpec.intent :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: DocumentationSpec.label :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: DocumentationSpec.usage_notes :module: wf2wf.core :type: str | None :value: None
.. py:class:: Edge(parent: str, child: str) :module: wf2wf.core
Directed edge relating parent → child task.
.. py:attribute:: Edge.child :module: wf2wf.core :type: str
.. py:attribute:: Edge.parent :module: wf2wf.core :type: str
.. py:class:: EnvironmentSpec(conda: str | None = None, container: str | None = None, workdir: str | None = None, env_vars: ~typing.Dict[str, str] =
Execution environment definition.
.. py:attribute:: EnvironmentSpec.conda :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: EnvironmentSpec.container :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: EnvironmentSpec.env_vars :module: wf2wf.core :type: ~typing.Dict[str, str]
.. py:attribute:: EnvironmentSpec.modules :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: EnvironmentSpec.workdir :module: wf2wf.core :type: str | None :value: None
.. py:class:: FileSpec(path: str, class_type: str = 'File', format: str | None = None, checksum: str | None = None, size: int | None = None, secondary_files: ~typing.List[str] =
Enhanced file specification with CWL features.
.. py:attribute:: FileSpec.basename :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.checksum :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.class_type :module: wf2wf.core :type: str :value: 'File'
.. py:method:: FileSpec.compute_stats(*, read_contents: bool = False) -> None :module: wf2wf.core
Populate `checksum`, `size` and optionally `contents` if the path exists.
.. py:attribute:: FileSpec.contents :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.dirname :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.format :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.listing :module: wf2wf.core :type: ~typing.List[~wf2wf.core.FileSpec]
.. py:attribute:: FileSpec.nameext :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.nameroot :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: FileSpec.path :module: wf2wf.core :type: str
.. py:attribute:: FileSpec.secondary_files :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: FileSpec.size :module: wf2wf.core :type: int | None :value: None
.. py:method:: FileSpec.validate() -> None :module: wf2wf.core
.. py:class:: ParameterSpec(id: str, type: str | ~wf2wf.core.TypeSpec, label: str | None = None, doc: str | None = None, default: ~typing.Any = None, format: str | None = None, secondary_files: ~typing.List[str] =
CWL v1.2.1 parameter specification for inputs and outputs.
.. py:attribute:: ParameterSpec.default :module: wf2wf.core :type: ~typing.Any :value: None
.. py:attribute:: ParameterSpec.doc :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ParameterSpec.format :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ParameterSpec.id :module: wf2wf.core :type: str
.. py:attribute:: ParameterSpec.input_binding :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any] | None :value: None
.. py:attribute:: ParameterSpec.label :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ParameterSpec.load_contents :module: wf2wf.core :type: bool :value: False
.. py:attribute:: ParameterSpec.load_listing :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ParameterSpec.output_binding :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any] | None :value: None
.. py:attribute:: ParameterSpec.secondary_files :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: ParameterSpec.streamable :module: wf2wf.core :type: bool :value: False
.. py:attribute:: ParameterSpec.type :module: wf2wf.core :type: str | ~wf2wf.core.TypeSpec
.. py:method:: ParameterSpec.validate() -> None :module: wf2wf.core
.. py:attribute:: ParameterSpec.value_from :module: wf2wf.core :type: str | None :value: None
.. py:class:: ProvenanceSpec(authors: ~typing.List[~typing.Dict[str, str]] =
Provenance and authorship information for workflows and tasks.
.. py:attribute:: ProvenanceSpec.authors :module: wf2wf.core :type: ~typing.List[~typing.Dict[str, str]]
.. py:attribute:: ProvenanceSpec.citations :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: ProvenanceSpec.contributors :module: wf2wf.core :type: ~typing.List[~typing.Dict[str, str]]
.. py:attribute:: ProvenanceSpec.created :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ProvenanceSpec.derived_from :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ProvenanceSpec.doi :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ProvenanceSpec.extras :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: ProvenanceSpec.keywords :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: ProvenanceSpec.license :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ProvenanceSpec.modified :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: ProvenanceSpec.version :module: wf2wf.core :type: str | None :value: None
.. py:class:: RequirementSpec(class_name: str, data: ~typing.Dict[str, ~typing.Any] =
CWL requirement or hint specification.
.. py:attribute:: RequirementSpec.class_name :module: wf2wf.core :type: str
.. py:attribute:: RequirementSpec.data :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:method:: RequirementSpec.validate() -> None :module: wf2wf.core
.. py:class:: ResourceSpec(cpu: int = 1, mem_mb: int = 0, disk_mb: int = 0, gpu: int = 0, gpu_mem_mb: int = 0, time_s: int = 0, threads: int = 1, extra: ~typing.Dict[str, ~typing.Any] =
Normalised resource keys common across engines.
Units:

* memory -> MB (int)
* disk -> MB (int)
* time -> seconds (int)
* gpu_mem -> MB (int) per GPU
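As an illustration of the unit normalisation listed above, the following sketch converts human-readable strings into whole megabytes and seconds. The helpers ``to_mb`` and ``to_seconds`` are hypothetical and not part of ``wf2wf.core``; the binary (1024-based) multipliers are an assumption, and an importer may well use a different convention.

```python
import re

# Assumed multipliers: binary prefixes, results expressed in MB and seconds.
_MEM_FACTORS_MB = {"K": 1 / 1024, "M": 1, "G": 1024, "T": 1024 * 1024}
_TIME_FACTORS_S = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_mb(value: str) -> int:
    """Convert strings such as '4G', '512MB' or '1.5GiB' to whole megabytes."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([KMGT])i?B?\s*", value, re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognised memory value: {value!r}")
    number = float(match.group(1))
    unit = match.group(2).upper()
    return int(number * _MEM_FACTORS_MB[unit])


def to_seconds(value: str) -> int:
    """Convert strings such as '90m' or '2h' to whole seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([smhd])\s*", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return int(match.group(1)) * _TIME_FACTORS_S[match.group(2)]
```

Values normalised this way could then be passed to the integer fields ``mem_mb``, ``disk_mb``, ``gpu_mem_mb`` and ``time_s``.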
.. py:attribute:: ResourceSpec.cpu :module: wf2wf.core :type: int :value: 1
.. py:attribute:: ResourceSpec.disk_mb :module: wf2wf.core :type: int :value: 0
.. py:attribute:: ResourceSpec.extra :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: ResourceSpec.gpu :module: wf2wf.core :type: int :value: 0
.. py:attribute:: ResourceSpec.gpu_mem_mb :module: wf2wf.core :type: int :value: 0
.. py:attribute:: ResourceSpec.mem_mb :module: wf2wf.core :type: int :value: 0
.. py:attribute:: ResourceSpec.threads :module: wf2wf.core :type: int :value: 1
.. py:attribute:: ResourceSpec.time_s :module: wf2wf.core :type: int :value: 0
.. py:class:: ScatterSpec(scatter: ~typing.List[str], scatter_method: str = 'dotproduct') :module: wf2wf.core
Scatter operation specification for parallel execution.
.. py:attribute:: ScatterSpec.scatter :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: ScatterSpec.scatter_method :module: wf2wf.core :type: str :value: 'dotproduct'
.. py:class:: Task(id: str, label: str | None = None, doc: str | None = None, command: str | None = None, script: str | None = None, inputs: ~typing.List[~wf2wf.core.ParameterSpec] =
A single executable node in the workflow DAG with enhanced CWL support.
.. py:attribute:: Task.command :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: Task.doc :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: Task.documentation :module: wf2wf.core :type: ~wf2wf.core.DocumentationSpec | None :value: None
.. py:attribute:: Task.environment :module: wf2wf.core :type: ~wf2wf.core.EnvironmentSpec
.. py:attribute:: Task.hints :module: wf2wf.core :type: ~typing.List[~wf2wf.core.RequirementSpec]
.. py:attribute:: Task.id :module: wf2wf.core :type: str
.. py:attribute:: Task.inputs :module: wf2wf.core :type: ~typing.List[~wf2wf.core.ParameterSpec]
.. py:attribute:: Task.intent :module: wf2wf.core :type: ~typing.List[str]
.. py:method:: Task.is_active(context: ~typing.Dict[str, ~typing.Any] | None = None) -> bool :module: wf2wf.core
Evaluate the *when* expression (if any) against *context* variables.
.. py:attribute:: Task.label :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: Task.meta :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: Task.outputs :module: wf2wf.core :type: ~typing.List[~wf2wf.core.ParameterSpec]
.. py:attribute:: Task.params :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: Task.priority :module: wf2wf.core :type: int :value: 0
.. py:attribute:: Task.provenance :module: wf2wf.core :type: ~wf2wf.core.ProvenanceSpec | None :value: None
.. py:attribute:: Task.requirements :module: wf2wf.core :type: ~typing.List[~wf2wf.core.RequirementSpec]
.. py:attribute:: Task.resources :module: wf2wf.core :type: ~wf2wf.core.ResourceSpec
.. py:attribute:: Task.retry :module: wf2wf.core :type: int :value: 0
.. py:attribute:: Task.scatter :module: wf2wf.core :type: ~wf2wf.core.ScatterSpec | None :value: None
.. py:method:: Task.scatter_bindings(runtime_inputs: ~typing.Dict[str, ~typing.Any]) -> ~typing.List[~typing.Dict[str, ~typing.Any]] :module: wf2wf.core
Return a list of variable bindings for each scatter shard.
If *scatter* is not defined this returns a single binding (empty dict).
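The behaviour described above can be sketched as a standalone function. This is not the library's implementation: the function name, the explicit ``scatter`` and ``method`` parameters, and the error handling are assumptions made for illustration. The method names ``dotproduct`` and ``flat_crossproduct`` follow CWL terminology.

```python
from itertools import product
from typing import Any, Dict, List


def scatter_bindings_sketch(
    scatter: List[str],
    runtime_inputs: Dict[str, Any],
    method: str = "dotproduct",
) -> List[Dict[str, Any]]:
    """Return one variable binding per scatter shard."""
    if not scatter:
        # No scatter defined: a single shard with an empty binding.
        return [{}]
    columns = [list(runtime_inputs[name]) for name in scatter]
    if method == "dotproduct":
        # Pair the i-th element of every scattered input.
        if len({len(column) for column in columns}) > 1:
            raise ValueError("dotproduct requires inputs of equal length")
        rows = zip(*columns)
    elif method == "flat_crossproduct":
        # Every combination of elements, flattened into one list.
        rows = product(*columns)
    else:
        raise ValueError(f"unsupported scatter method: {method!r}")
    return [dict(zip(scatter, row)) for row in rows]
```

With two scattered inputs of length two, ``dotproduct`` yields two shards and ``flat_crossproduct`` yields four.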
.. py:attribute:: Task.script :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: Task.when :module: wf2wf.core :type: str | None :value: None
.. py:class:: TypeSpec(type: str, items: str | ~wf2wf.core.TypeSpec | None = None, fields: ~typing.Dict[str, ~wf2wf.core.TypeSpec] =
CWL v1.2.1 type specification with advanced features.
.. py:attribute:: TypeSpec.default :module: wf2wf.core :type: ~typing.Any :value: None
.. py:attribute:: TypeSpec.fields :module: wf2wf.core :type: ~typing.Dict[str, ~wf2wf.core.TypeSpec]
.. py:attribute:: TypeSpec.items :module: wf2wf.core :type: str | ~wf2wf.core.TypeSpec | None :value: None
.. py:attribute:: TypeSpec.members :module: wf2wf.core :type: ~typing.List[~wf2wf.core.TypeSpec]
.. py:attribute:: TypeSpec.name :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: TypeSpec.nullable :module: wf2wf.core :type: bool :value: False
.. py:method:: TypeSpec.parse(obj: str | ~wf2wf.core.TypeSpec | ~typing.Dict[str, ~typing.Any]) -> ~wf2wf.core.TypeSpec :module: wf2wf.core :classmethod:
Return a :class:`TypeSpec` instance from *obj*.
Accepts CWL‐style shorthand strings such as ``File``, ``string?`` (nullable),
``File[]`` (array of File), or fully fledged mapping objects produced by
``cwltool --print-pre``. If *obj* is already a :class:`TypeSpec` it is
returned unchanged.
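To make the shorthand rules concrete, here is a minimal sketch that handles only the string forms mentioned above (``File``, ``string?``, ``File[]``). ``MiniType`` and ``parse_shorthand`` are hypothetical stand-ins and do not reproduce the real :class:`TypeSpec`, which also accepts mapping objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniType:
    """Simplified stand-in with three of the TypeSpec fields."""

    type: str
    items: Optional["MiniType"] = None
    nullable: bool = False


def parse_shorthand(text: str) -> MiniType:
    """Parse CWL-style shorthand such as 'File', 'string?' or 'File[]'."""
    # A trailing '?' marks the type as nullable.
    nullable = text.endswith("?")
    if nullable:
        text = text[:-1]
    # A trailing '[]' wraps the remaining type in an array.
    if text.endswith("[]"):
        return MiniType("array", items=parse_shorthand(text[:-2]), nullable=nullable)
    if not text:
        raise ValueError("empty type name")
    return MiniType(text, nullable=nullable)
```

For example, ``parse_shorthand("File[]?")`` produces a nullable array whose ``items`` is a plain ``File`` type.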
.. py:attribute:: TypeSpec.symbols :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: TypeSpec.type :module: wf2wf.core :type: str
.. py:method:: TypeSpec.validate() -> None :module: wf2wf.core
Semantic validation for the CWL type system.
:raises ValueError: If the type definition is semantically invalid.
.. py:class:: Workflow(name: str, version: str = '1.0', label: str | None = None, doc: str | None = None, tasks: ~typing.Dict[str, ~wf2wf.core.Task] =
A collection of Tasks plus dependency edges and optional metadata with enhanced CWL/BCO support.
.. py:method:: Workflow.add_edge(parent: str, child: str) :module: wf2wf.core
.. py:method:: Workflow.add_task(task: ~wf2wf.core.Task) :module: wf2wf.core
.. py:attribute:: Workflow.bco_spec :module: wf2wf.core :type: ~wf2wf.core.BCOSpec | None :value: None
.. py:attribute:: Workflow.config :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: Workflow.cwl_version :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: Workflow.doc :module: wf2wf.core :type: str | None :value: None
.. py:attribute:: Workflow.documentation :module: wf2wf.core :type: ~wf2wf.core.DocumentationSpec | None :value: None
.. py:attribute:: Workflow.edges :module: wf2wf.core :type: ~typing.List[~wf2wf.core.Edge]
.. py:method:: Workflow.from_dict(data: ~typing.Dict[str, ~typing.Any]) -> ~wf2wf.core.Workflow :module: wf2wf.core :classmethod:
Re-hydrate from `json.load(...)` result (best-effort).
.. py:method:: Workflow.from_json(json_str: str) -> ~wf2wf.core.Workflow :module: wf2wf.core :classmethod:
Re-hydrate from JSON string produced by :py:meth:`to_json`.
.. py:attribute:: Workflow.hints :module: wf2wf.core :type: ~typing.List[~wf2wf.core.RequirementSpec]
.. py:attribute:: Workflow.inputs :module: wf2wf.core :type: ~typing.List[~wf2wf.core.ParameterSpec]
.. py:attribute:: Workflow.intent :module: wf2wf.core :type: ~typing.List[str]
.. py:attribute:: Workflow.label :module: wf2wf.core :type: str | None :value: None
.. py:method:: Workflow.load_json(path: str | ~pathlib.Path) :module: wf2wf.core :classmethod:
Load Workflow from a JSON file produced by :py:meth:`save_json`.
.. py:attribute:: Workflow.loss_map :module: wf2wf.core :type: ~typing.List[~typing.Dict[str, ~typing.Any]]
.. py:attribute:: Workflow.meta :module: wf2wf.core :type: ~typing.Dict[str, ~typing.Any]
.. py:attribute:: Workflow.name :module: wf2wf.core :type: str
.. py:attribute:: Workflow.outputs :module: wf2wf.core :type: ~typing.List[~wf2wf.core.ParameterSpec]
.. py:attribute:: Workflow.provenance :module: wf2wf.core :type: ~wf2wf.core.ProvenanceSpec | None :value: None
.. py:attribute:: Workflow.requirements :module: wf2wf.core :type: ~typing.List[~wf2wf.core.RequirementSpec]
.. py:method:: Workflow.save_json(path: str | ~pathlib.Path, *, indent: int = 2) :module: wf2wf.core
Write JSON representation to *path* (creates parent dirs).
.. py:attribute:: Workflow.tasks :module: wf2wf.core :type: ~typing.Dict[str, ~wf2wf.core.Task]
.. py:method:: Workflow.to_dict() -> ~typing.Dict[str, ~typing.Any] :module: wf2wf.core
Return a plain-Python representation ready for JSON/TOML dump.
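The dict and JSON helpers on this class form a round trip. The sketch below shows that pattern with simplified stand-in dataclasses; ``MiniWorkflow`` and ``MiniEdge`` are hypothetical, carry only a few fields, and are not claimed to match the real serialisation layout.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class MiniEdge:
    parent: str
    child: str


@dataclass
class MiniWorkflow:
    name: str
    version: str = "1.0"
    tasks: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    edges: List[MiniEdge] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        # asdict() recursively converts nested dataclasses to plain dicts.
        return asdict(self)

    def to_json(self, *, indent: int = 2) -> str:
        return json.dumps(self.to_dict(), indent=indent)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MiniWorkflow":
        # Rebuild nested objects; missing keys fall back to defaults.
        edges = [MiniEdge(**edge) for edge in data.get("edges", [])]
        return cls(
            name=data["name"],
            version=data.get("version", "1.0"),
            tasks=dict(data.get("tasks", {})),
            edges=edges,
        )

    @classmethod
    def from_json(cls, json_str: str) -> "MiniWorkflow":
        return cls.from_dict(json.loads(json_str))
```

A value serialised with ``to_json`` and passed back through ``from_json`` compares equal to the original in this sketch.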
.. py:method:: Workflow.to_json(*, indent: int = 2) -> str :module: wf2wf.core
.. py:method:: Workflow.validate() -> None :module: wf2wf.core
Run JSON-Schema plus semantic validation checks.
:raises ValueError, jsonschema.ValidationError: If the workflow is invalid.
.. py:attribute:: Workflow.version :module: wf2wf.core :type: str :value: '1.0'
.. py:module:: wf2wf.environ
wf2wf.environ – environment-build helpers (Phase 2)
This initial slice implements §9.2.1-9.2.2 of the design draft:

* Generate a deterministic lock hash from a Conda YAML file.
* Create a relocatable tarball (stand-in for conda-pack) so downstream exporters can reference a stable artefact even where Conda tooling is unavailable in the test environment.
Real micromamba/conda-pack execution will be wired in later; for now we simulate the build while preserving the critical interface and metadata.
.. py:class:: BuildahBuilder(*, tool: str | None = None, dry_run: bool = True) :module: wf2wf.environ
Wrapper around buildah / podman build for sites that prefer it.
.. py:method:: BuildahBuilder.build(tarball: ~pathlib.Path, tag: str, labels: ~typing.Dict[str, str] | None = None, *, push: bool = False, build_cache: str | None = None, platform: str = 'linux/amd64') -> str :module: wf2wf.environ
Build image, optionally push, and return digest (sha256:...).
.. py:class:: DockerBuildxBuilder(*, dry_run: bool = True) :module: wf2wf.environ
Tiny wrapper around ``docker buildx build`` (dry-run by default).
.. py:method:: DockerBuildxBuilder.build(tarball: ~pathlib.Path, tag: str, labels: ~typing.Dict[str, str] | None = None, *, push: bool = False, build_cache: str | None = None, platform: str = 'linux/amd64') -> str :module: wf2wf.environ
Build image, optionally push, and return digest (sha256:...).
.. py:class:: OCIBuilder() :module: wf2wf.environ
Protocol-like base class for OCI builders.
.. py:method:: OCIBuilder.build(tarball: ~pathlib.Path, tag: str, labels: ~typing.Dict[str, str] | None = None, *, push: bool = False, build_cache: str | None = None) -> str :module: wf2wf.environ
Build image, optionally push, and return digest (sha256:...).
.. py:function:: build_oci_image(tarball: ~pathlib.Path, *, tag_prefix: str = 'wf2wf/env', backend: str = 'buildx', push: bool = False, platform: str = 'linux/amd64', build_cache: str | None = None, dry_run: bool = True) -> tuple[str, str] :module: wf2wf.environ
High-level helper that picks a builder backend and returns (tag, digest).
.. py:function:: build_or_reuse_env_image(env_yaml: str | ~pathlib.Path, *, registry: str | None = None, push: bool = False, backend: str = 'buildx', dry_run: bool = True, build_cache: str | None = None, cache_dir: ~pathlib.Path | None = None) -> ~typing.Dict[str, str] :module: wf2wf.environ
High-level helper: build an image for *env_yaml* unless an identical hash is already indexed.
Returns a dict with keys ``tag`` and ``digest``.
.. py:function:: convert_to_sif(image_ref: str, *, sif_dir: ~pathlib.Path | None = None, dry_run: bool = True) -> ~pathlib.Path :module: wf2wf.environ
Convert the OCI image *image_ref* to an Apptainer SIF file.
Uses ``spython`` if available; otherwise simulates the conversion by touching a file.
.. py:function:: generate_lock_hash(env_yaml: ~pathlib.Path) -> str :module: wf2wf.environ
Return the SHA-256 hex digest of the Conda YAML file *env_yaml*.
The digest is calculated over the normalised file contents (strip CRLF, remove comment lines), ensuring platform-independent hashes.
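A minimal sketch of the normalisation described above, assuming that a "comment line" is any line whose first non-blank character is ``#``. The function name ``lock_hash_sketch`` is hypothetical, and it takes the file contents as a string rather than a path so the example stays self-contained.

```python
import hashlib


def lock_hash_sketch(yaml_text: str) -> str:
    """Return a sha256 hex digest over normalised Conda YAML text."""
    # Convert CRLF line endings so Windows and POSIX files hash alike.
    lines = yaml_text.replace("\r\n", "\n").split("\n")
    # Drop whole-line comments; inline comments are left untouched here.
    kept = [line for line in lines if not line.lstrip().startswith("#")]
    normalised = "\n".join(kept)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()
```

Two files that differ only in line endings or comment lines then produce the same digest, which is what allows the hash to act as a stable cache key.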
.. py:function:: generate_sbom(image_ref: str, out_dir: ~pathlib.Path | None = None, *, dry_run: bool = True) -> ~wf2wf.environ.SBOMInfo :module: wf2wf.environ
Generate an SPDX SBOM for *image_ref* and return :class:`SBOMInfo`.
In dry-run mode (the default during unit tests) this creates a minimal JSON file containing the image reference and a fake package list.
.. py:function:: prepare_env(env_yaml: str | ~pathlib.Path, *, cache_dir: ~pathlib.Path | None = None, verbose: bool = False, dry_run: bool | None = None) -> ~wf2wf.environ.EnvBuildResult :module: wf2wf.environ
Simulate environment build pipeline and return artefact locations.
1. Compute the lock hash from *env_yaml*.
2. Copy the YAML to a content-addressed location ``<hash>.yaml`` inside *cache_dir*.
3. Create a ``tar.gz`` containing the YAML as a placeholder for a conda-pack archive and place it next to the lock file.
The function is idempotent: repeated calls with the same YAML content return the same paths without rebuilding.
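The three steps and the idempotency guarantee can be sketched with the standard library alone. ``prepare_env_sketch``, its return keys and the archive member name are assumptions for illustration; the real function returns an ``EnvBuildResult`` and hashes normalised rather than raw content.

```python
import hashlib
import shutil
import tarfile
import tempfile
from pathlib import Path
from typing import Dict


def prepare_env_sketch(env_yaml: Path, cache_dir: Path) -> Dict[str, Path]:
    """Hash, copy and archive a Conda YAML file (hypothetical stand-in)."""
    # Step 1: content hash (raw bytes here, as a simplification).
    digest = hashlib.sha256(env_yaml.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    lock_file = cache_dir / f"{digest}.yaml"
    tarball = cache_dir / f"{digest}.tar.gz"
    # Step 2: copy the YAML to its content-addressed location.
    if not lock_file.exists():
        shutil.copyfile(env_yaml, lock_file)
    # Step 3: create the placeholder archive next to the lock file.
    if not tarball.exists():
        with tarfile.open(tarball, "w:gz") as archive:
            archive.add(lock_file, arcname="environment.yaml")
    return {"lock_file": lock_file, "tarball": tarball}


# Usage: a second call with identical content reuses the existing artefacts.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "environment.yaml"
    source.write_text("name: demo\ndependencies:\n  - python=3.11\n")
    first = prepare_env_sketch(source, Path(tmp) / "cache")
    built_at = first["tarball"].stat().st_mtime_ns
    second = prepare_env_sketch(source, Path(tmp) / "cache")
    same_paths = first == second
    not_rebuilt = second["tarball"].stat().st_mtime_ns == built_at
    archive_ok = tarfile.is_tarfile(first["tarball"])
```

Because both artefact names derive from the digest, the existence checks are enough to make repeated calls cheap.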
.. py:function:: prune_cache(*, days: int = 60, min_free_gb: int = 5, verbose: bool = False) :module: wf2wf.environ
Remove cache entries older than *days* if free disk space falls below the *min_free_gb* threshold.
Very lightweight implementation; only checks tarballs & SIF files.
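One way such a pruning pass might look is sketched below. The function name, the explicit ``cache_dir`` argument and the ``free_gb`` override (added so the example does not depend on the host's actual free space) are all assumptions; the real function takes only ``days``, ``min_free_gb`` and ``verbose``.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path
from typing import List, Optional


def prune_cache_sketch(
    cache_dir: Path,
    *,
    days: int = 60,
    min_free_gb: int = 5,
    free_gb: Optional[float] = None,
) -> List[str]:
    """Delete old tarballs and SIF files when free space is low."""
    if free_gb is None:
        free_gb = shutil.disk_usage(cache_dir).free / 1024**3
    if free_gb >= min_free_gb:
        return []  # Enough space: leave the cache untouched.
    cutoff = time.time() - days * 86400
    removed = []
    for path in sorted(cache_dir.iterdir()):
        is_artefact = path.name.endswith(".tar.gz") or path.suffix == ".sif"
        if path.is_file() and is_artefact and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


# Usage: only the old tarball is removed, and only when space is short.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    long_ago = time.time() - 90 * 86400
    for name in ("old.tar.gz", "old.txt", "fresh.sif"):
        (cache / name).write_text("placeholder")
    for name in ("old.tar.gz", "old.txt"):
        os.utime(cache / name, (long_ago, long_ago))
    kept_when_roomy = prune_cache_sketch(cache, free_gb=100.0)
    removed_when_full = prune_cache_sketch(cache, free_gb=1.0)
    remaining = sorted(p.name for p in cache.iterdir())
```

Files that are neither tarballs nor SIF images are ignored regardless of age, matching the note that only those two artefact types are checked.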