CWL-aligned design/implementation #14

mih · 2024-05-15T08:29:10Z

This issue replaces or concludes a number of previously expressed ideas, aiming to reduce complexity and make it more obvious where we stand right now. Replaced are:

The present concept is to think of a recomputation as a three-step process, where each step can be represented as a node in a CWL workflow:

Provision: Establish the environment required for a computation
Compute: Run a computation
Extract: Pick relevant outputs and present them as outputs of the computation in a particular context
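As a rough illustration of the three-step idea, the following sketch builds a CWL `Workflow` document as a Python dict and derives the execution order from the step connections. The file names (`provision.cwl`, `compute.cwl`, `extract.cwl`) and all input/output port names are hypothetical placeholders, not something that exists in the project yet.

```python
import json

# Hypothetical three-step workflow. All file names and port names are
# illustrative assumptions, not taken from an existing implementation.
workflow = {
    "cwlVersion": "v1.2",
    "class": "Workflow",
    "inputs": {
        "provision_spec": "File",
        "extract_spec": "File",
    },
    "outputs": {
        "results": {
            "type": "Directory",
            "outputSource": "extract/selected",
        },
    },
    "steps": {
        "provision": {
            "run": "provision.cwl",
            "in": {"spec": "provision_spec"},
            "out": ["environment"],
        },
        "compute": {
            "run": "compute.cwl",
            "in": {"environment": "provision/environment"},
            "out": ["raw_outputs"],
        },
        "extract": {
            "run": "extract.cwl",
            "in": {"raw": "compute/raw_outputs", "spec": "extract_spec"},
            "out": ["selected"],
        },
    },
}


def step_dependencies(wf):
    """Map each step to the set of steps whose outputs it consumes."""
    deps = {}
    for name, step in wf["steps"].items():
        # A source of the form "step/port" refers to another step's output;
        # a bare name refers to a workflow-level input.
        deps[name] = {src.split("/")[0] for src in step["in"].values() if "/" in src}
    return deps


def step_order(wf):
    """Return step names so that each step follows the steps it reads from."""
    deps = step_dependencies(wf)
    ordered = []
    while len(ordered) < len(deps):
        ready = [n for n in deps if n not in ordered and deps[n] <= set(ordered)]
        if not ready:
            raise ValueError("cyclic step dependencies")
        ordered.extend(sorted(ready))
    return ordered


if __name__ == "__main__":
    print(step_order(workflow))
    print(json.dumps(workflow, indent=2))
```

Serializing the dict as JSON keeps the sketch free of third-party dependencies; an actual workflow would more likely be written in YAML.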

Each step needs critical information that must be stored and supplied. All steps also have different scopes:

Provision: the exact same parameterization can yield suitable inputs for more than one computation
Compute: the exact same compute specification can be combined with a broad range of inputs and yield different outputs
Extract: One and the same compute output can be filtered in many ways to yield desired outputs in a particular context

The steps also differ in whether their values are fixed or variable for a particular recompute:

Provision: exact for reproducing (I want the same) vs. variable for reevaluation (I want to see how different it is, e.g. datalad rerun --onto)
Compute: recompute exactly vs. recompute exactly with the new version of the tool
Extract: mostly changes together with the compute specification or implementation; output filters may need to be adjusted so that they continue to deliver the same output (e.g. after a name/location change)
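One way to picture fixed vs. variable values is a record that pins one identifier per step, where a recompute either reuses the record unchanged or swaps individual entries. This is only a sketch; the class name and the identifier strings (`inputs@v1`, `tool@1.0`, ...) are made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RecomputeRecord:
    """Pins one parameter-set identifier per step (hypothetical structure)."""
    provision: str
    compute: str
    extract: str


def changed_steps(a, b):
    """List the steps whose pinned identifier differs between two records."""
    return [
        step
        for step in ("provision", "compute", "extract")
        if getattr(a, step) != getattr(b, step)
    ]


original = RecomputeRecord(
    provision="inputs@v1", compute="tool@1.0", extract="filter@v1"
)

# Reproduce: everything stays fixed ("I want the same").
reproduce = original

# Reevaluate: same computation on different inputs.
reevaluate = replace(original, provision="inputs@v2")

# Tool upgrade: the new tool version moves its outputs, so the extract
# filter is adjusted together with the compute specification.
upgraded = replace(original, compute="tool@2.0", extract="filter@v2")

if __name__ == "__main__":
    for label, record in [
        ("reproduce", reproduce),
        ("reevaluate", reevaluate),
        ("upgraded", upgraded),
    ]:
        print(label, changed_steps(original, record))
```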

Taken together, these requirements determine where and how the parameters of all three steps can be stored and, importantly, how they need to be referenced. In general, this means that we want to be able to identify every parameter set both by precise version (exact parameters) and by concept (or latest version).
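The dual addressing could look roughly like the following sketch, which assumes parameter sets are JSON-serializable and uses a SHA-256 digest of the canonical JSON as the precise version. `ParameterStore`, its methods, and the concept name are hypothetical and only show the lookup pattern, not a storage design.

```python
import hashlib
import json


class ParameterStore:
    """Toy in-memory store addressing parameter sets by version and by concept."""

    def __init__(self):
        self._by_version = {}  # content hash -> parameters
        self._latest = {}      # concept name -> content hash of latest version

    def add(self, concept, params):
        """Store a parameter set and make it the latest version of its concept."""
        # Sorting keys makes the hash independent of dict insertion order.
        payload = json.dumps(params, sort_keys=True).encode("utf-8")
        version = hashlib.sha256(payload).hexdigest()
        self._by_version[version] = params
        self._latest[concept] = version
        return version

    def get_exact(self, version):
        """Look up by precise version (exact parameters)."""
        return self._by_version[version]

    def get_latest(self, concept):
        """Look up by concept, resolving to the most recently added version."""
        return self._by_version[self._latest[concept]]


if __name__ == "__main__":
    store = ParameterStore()
    v1 = store.add("compute/mytool", {"tool_version": "1.0"})
    v2 = store.add("compute/mytool", {"tool_version": "2.0"})
    print(store.get_exact(v1))
    print(store.get_latest("compute/mytool"))
```

A reproduction would reference the exact version, while a recompute with the newest tool would reference the concept and resolve it at run time.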

TODO:

anticipatory walkthrough for the use case "recompute git-annex key"


This was referenced May 15, 2024

Investigate CWL for computing instructions #1

Closed

Build demo mapping of a datalad-run record as a CWL CommandLineTool #7

Closed

Design option: wrap datalad inside CWL #9

Closed
