Collect use cases for datalad-remake (and underlying tooling) #3

Open
mih opened this issue Apr 29, 2024 · 0 comments

mih commented Apr 29, 2024

The remake effort aims to serve a few general use cases, and to yield tools that serve all of them while aligning as closely as possible with existing solutions and ongoing development elsewhere. These general use cases are:

  • provenance capture of programmatic dataset modifications (i.e., the domain of datalad run)
  • re-execution of provenance records, for the purpose of
    • verifying reproducibility (i.e., datalad rerun)
    • re-applying computational steps on different data (i.e., datalad rerun --onto)
  • output extraction after execution of (parametric) compute instructions (i.e., "compute for get" special remote)
  • depositing compute instructions for "prospective outputs" (never computed/recorded)
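
To make the first two use cases concrete, here is a minimal sketch of capturing a provenance record and re-executing it to check reproducibility. It is not how datalad run or datalad rerun are implemented. The record layout and the helper names `run_and_record` and `rerun_and_verify` are hypothetical and exist only for illustration.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_and_record(cmd: list[str], outputs: list[str], cwd: Path) -> dict:
    """Execute cmd in cwd and return a provenance record (hypothetical layout)."""
    subprocess.run(cmd, cwd=cwd, check=True)
    return {
        "cmd": cmd,
        "outputs": {name: sha256_of(cwd / name) for name in outputs},
    }


def rerun_and_verify(record: dict, cwd: Path) -> bool:
    """Re-execute a record in cwd and check that outputs are bit-identical."""
    subprocess.run(record["cmd"], cwd=cwd, check=True)
    return all(
        sha256_of(cwd / name) == digest
        for name, digest in record["outputs"].items()
    )


if __name__ == "__main__":
    # A deterministic toy command standing in for a real computation.
    cmd = [sys.executable, "-c", "open('out.txt', 'w').write('42')"]
    with tempfile.TemporaryDirectory() as first, \
            tempfile.TemporaryDirectory() as second:
        record = run_and_record(cmd, ["out.txt"], Path(first))
        # Re-executing in a fresh directory mimics a rerun elsewhere.
        print(rerun_and_verify(record, Path(second)))
```

Because the record keeps only the command and the output digests, the same record could also be executed against a directory with different inputs, which is roughly the idea behind re-applying steps on different data.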

A list of more concrete use cases will help inform both the design and the presentation (documentation, paper) of the implementation. Here is a (growing) collection to consider as documentation examples, or as use cases featured prominently in the paper:

  • fmriprep: compute large outputs, hash them, and rely on them being bit-identically reproducible to avoid storing them
  • provide data in alternative (file) formats (store CSV, provide XLSX on-demand)
  • render partial data for specific purposes (produce video clips from source video via a cutlist)
  • apply all edits to a RAW photo to render a JPEG on demand
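
The pattern shared by these examples (store a recipe and a digest, compute the content on request) can be sketched as follows. This is an illustration under assumptions, not the design of the "compute for get" special remote. The class `OnDemandStore` and its methods are hypothetical, and JSON stands in for XLSX so that the sketch needs only the standard library.

```python
import csv
import hashlib
import io
import json

# Source data that is actually stored.
SOURCE_CSV = "subject,score\nsub-01,12\nsub-02,15\n"


def csv_to_json(csv_text: str) -> bytes:
    """Deterministically convert CSV text to JSON bytes."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, sort_keys=True, indent=1).encode("utf-8")


class OnDemandStore:
    """Holds recipes and expected digests instead of output content."""

    def __init__(self):
        self._recipes = {}

    def register(self, name, func, source, digest=None):
        # digest=None models a "prospective output" that was never computed.
        self._recipes[name] = (func, source, digest)

    def get(self, name) -> bytes:
        """Compute the output now and verify it against the recorded digest."""
        func, source, digest = self._recipes[name]
        content = func(source)
        actual = hashlib.sha256(content).hexdigest()
        if digest is not None and actual != digest:
            raise ValueError(f"{name}: recomputed content does not match digest")
        return content


if __name__ == "__main__":
    store = OnDemandStore()
    # Compute once to learn the digest, then keep only recipe and digest.
    expected = hashlib.sha256(csv_to_json(SOURCE_CSV)).hexdigest()
    store.register("scores.json", csv_to_json, SOURCE_CSV, digest=expected)
    print(store.get("scores.json").decode("utf-8"))
```

Verification only works if the conversion is bit-identically reproducible, which is why the sketch sorts keys and fixes the indentation. A real tool such as fmriprep would need the same guarantee for the stored digest to be useful.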

mih transferred this issue from another repository May 2, 2024
mih changed the title from "Collect use cases for datalad-remake" to "Collect use cases for datalad-remake (and underlying tooling)" May 17, 2024
Status: discussion needed