Automatic generation of extra-fields template forms from yaml-based schema (BIDS standard) #5023

Open
SylvainTakerkart opened this issue Mar 28, 2024 · 6 comments

Comments

@SylvainTakerkart

Describe your feature request precisely

Hi,

We're wondering whether it would be possible to automatically generate template forms for experiments from a standard that is specified through a set of yaml files...

Indeed, in neuroscience, we're particularly interested in the BIDS data organization standard, which is now (almost fully) available in a machine-readable way through a set of yaml files (https://bids-specification.readthedocs.io/en/stable/schema/index.html). Being able to generate such forms automatically would allow the standardized gathering of all metadata necessary to construct a BIDS structure for organizing and storing the data+metadata in a standard way. And since BIDS is about to support most data modalities used in modern neuroscience, this would be a great addition for the entire neuroscience community...

As a side note, we've been working for some time on generating such metadata-collection forms by defining extra fields in eLab's experiments. So far we've done this manually, for some specific data modalities, working with the neuroscience researchers who perform the data acquisition. We're clearly reaching the limits of such a manual approach, and are therefore looking for an automated way to do so, based on the BIDS standard specifications.

Pro Support

yes

@NicolasCARPi
Contributor

Hello Sylvain,

Thank you for opening this issue.

I've looked into the BIDS standard over the past few weeks, and from what I understand, it is a way to organize files inside folders, and these folders can have YAML files to hold metadata.

> generate template forms for experiments from a standard that is specified through a set of yaml files...

There is something I do not understand in your workflow: you generate a template from the YAML so that the template can generate YAML? So the initial YAML would provide the keys, the user would provide the values in eLab, and then the YAML files would get updated again by eLab? It's not clear.

Usually, who or what generates this folder structure: a human, an application, or both? How are the YAML files generated right now? Do people create them by hand?

Once the ability to add an external (human-readable) folder is added (see #4748), we could think about labelling this external folder as "BIDS standardized content", and then look at what could be done to extract the information contained in the YAML files scattered around this folder structure.

Side note: it would have been so much easier if BIDS were RO-Crate instead, with one single JSON-LD metadata file describing everything in the folder structure ;)

Side note 2: could it be interesting to have some Python app crawl a BIDS directory structure and transform it all into .eln? This file could then be imported into eLab, which already understands the .eln file format... just thinking out loud...

@NicolasCARPi NicolasCARPi changed the title Automatic generation of extra-fields template forms from yaml-based schema Automatic generation of extra-fields template forms from yaml-based schema (BIDS standard) Apr 1, 2024
@SylvainTakerkart
Author

SylvainTakerkart commented Apr 2, 2024

Hi! I'm tagging @yarikoptic, who's more BIDS-proficient than me ;)

First, a small correction to your sentence: BIDS is a way to organize files inside folders, and these folders can have JSON and/or TSV files to hold metadata (not YAML).

Here, we're working on a software tool aimed at generating a BIDS structure early on after data acquisition... In summary, this consists of: i) converting the data file(s) from vendor-specific format(s) into a standard data format supported in BIDS (there is a list, which depends on the data modality), ii) gathering all the metadata necessary for BIDS (and more if possible, since some metadata fields are required while others are tagged only as recommended or optional), iii) creating the directory structure and all the files with the correct names.
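As an illustration of step iii), here is a minimal sketch of how a BIDS-style relative path could be assembled from a few entities. The helper name `bids_relative_path` is hypothetical (it is not part of any existing tool), and it only covers the subject and optional session entities; a real converter would need the full entity ordering defined in the BIDS specification.

```python
from pathlib import PurePosixPath
from typing import Optional


def bids_relative_path(subject: str, datatype: str, suffix: str,
                       extension: str, session: Optional[str] = None) -> PurePosixPath:
    """Build the dataset-relative path of one BIDS file (hypothetical helper)."""
    folders = [f"sub-{subject}"]
    entities = [f"sub-{subject}"]
    if session is not None:
        # The session appears both as a folder level and as a filename entity.
        folders.append(f"ses-{session}")
        entities.append(f"ses-{session}")
    folders.append(datatype)
    filename = "_".join(entities + [suffix]) + extension
    return PurePosixPath(*folders) / filename


# A data file and its JSON sidecar share the same name stem.
print(bids_relative_path("01", "anat", "T1w", ".nii.gz"))
print(bids_relative_path("01", "anat", "T1w", ".json", session="pre"))
```

The first call yields `sub-01/anat/sub-01_T1w.nii.gz`; the second places the sidecar under an additional `ses-pre` folder.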

For step ii), the metadata can be spread out in different places: 1. in the data file itself (in the vendor-specific format), 2. in other files generated by the data acquisition setup, 3. in some logging system used by the experimenter DURING data acquisition (or right after). Our goal here is to use an electronic lab book to standardize the gathering of the metadata of type 3... And we would like to make sure that everything needed to generate the BIDS structure is actually present in the form that we would ask the experimenter to fill in in eLab, which is why automatic generation of such a form from the BIDS specifications would be great...
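To make the "is everything needed actually present" check concrete, here is a small sketch that compares the values entered in a form against a table of requirement levels. The function name `missing_fields` and the sample levels are illustrative assumptions; in practice the levels would be derived from the BIDS schema rather than written by hand.

```python
def missing_fields(form_values: dict, levels: dict, level: str = "required") -> list:
    """Return the fields at the given requirement level that are absent or empty."""
    return sorted(
        name for name, field_level in levels.items()
        if field_level == level and form_values.get(name) in (None, "")
    )


# Hypothetical requirement levels for one modality (illustrative values only).
levels = {
    "TaskName": "required",
    "RepetitionTime": "required",
    "Manufacturer": "recommended",
}

# Values as the experimenter might have entered them in the form.
form_values = {"TaskName": "rest", "RepetitionTime": "", "Manufacturer": "Siemens"}

print(missing_fields(form_values, levels))                 # required but empty
print(missing_fields(form_values, levels, "recommended"))  # recommended but empty
```

Running the same check with `level="recommended"` gives a softer warning list without blocking the generation of the BIDS structure.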

Does this make more sense now?

@yarikoptic
Contributor

> Side note: it would have been so much easier if BIDS would be RO-Crate instead, with one single JSON-LD metadata file describing everything in the folder structure ;)

FWIW, there is a "single JSON file" (albeit not JSON-LD) "read in" of the entire hierarchy of BIDS schema .yaml files, available at https://bids-specification.readthedocs.io/en/stable/schema.json . Another dump/archive of the different releases of the BIDS schema is available at https://github.com/bids-standard/bids-schema/tree/main/versions . There is also a Python package to help with some operations on the schema in YAML form: https://pypi.org/project/bidsschematools/
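As a rough sketch of what a form generator built on that JSON dump might look like, the snippet below turns metadata field definitions into an extra-fields structure. Several things here are assumptions rather than confirmed facts: that field definitions live under `objects.metadata` with `name`, `description` and `type` keys, that eLab accepts an `extra_fields` object whose entries carry `type`, `value` and `description`, and the type mapping itself. The sample entries are hand-written stand-ins, not copied from the schema.

```python
import json

# Hypothetical excerpt, shaped like what is assumed to sit under
# `objects.metadata` in schema.json (descriptions are placeholders).
SAMPLE_METADATA = {
    "RepetitionTime": {
        "name": "RepetitionTime",
        "description": "Placeholder description of the repetition time.",
        "type": "number",
    },
    "Manufacturer": {
        "name": "Manufacturer",
        "description": "Placeholder description of the manufacturer.",
        "type": "string",
    },
}

# Assumed mapping from schema value types to extra-field input types.
TYPE_MAP = {"number": "number", "integer": "number", "string": "text"}


def to_extra_fields(metadata: dict) -> dict:
    """Convert metadata definitions into an extra-fields style dictionary."""
    fields = {}
    for key, definition in metadata.items():
        schema_type = definition.get("type")
        # Fall back to a free-text input for missing or composite types.
        field_type = TYPE_MAP.get(schema_type, "text") if isinstance(schema_type, str) else "text"
        fields[definition.get("name", key)] = {
            "type": field_type,
            "value": "",
            "description": definition.get("description", ""),
        }
    return {"extra_fields": fields}


print(json.dumps(to_extra_fields(SAMPLE_METADATA), indent=2))
```

With the real file, `SAMPLE_METADATA` would be replaced by the corresponding part of the parsed schema.json, after checking its actual layout.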

@yarikoptic
Contributor

Related, since @SylvainTakerkart is likely talking about a not-yet-merged extension to BIDS, BEP032: the modified (buggy ATM) schema for it is being developed as part of bids-standard/bids-specification#1705, where we would add some more metadata fields etc.

I think it might be worth organizing a brief Zoom meetup to chat about it. I would be happy to answer any questions about BIDS, and hopefully assist in deciding on the level of "BIDS support" you want to add to elabftw (e.g. along the lines of "BIDS standardized content"). A recent brief paper which might give you more background on "how come BIDS" is https://direct.mit.edu/imag/article/doi/10.1162/imag_a_00103/119672 ; it points out that one of the persistent difficulties is exactly that: the preparation of BIDS datasets. Hence the number of converters we have, each typically specialized in some specific modality(ies): https://bids.neuroimaging.io/benefits.html#converters , e.g. our https://github.com/nipy/heudiconv/ for converting neuroimaging data from DICOMs. But I am a strong believer in "it can be easy if thought through ahead of time", so we have https://github.com/ReproNim/reproin . I would be interested to see how we might eventually make use of elabftw for neuroimaging studies too.

@NicolasCARPi
Contributor

Yes, I'm open to a little chat about this. Could I ask that you send your availability, with Sylvain in copy, to the email address visible on this page: https://www.deltablot.com/contact/ ?

This week, Thursday afternoon is free on my side. So if you and Sylvain are available this Thursday at 2 PM CEST or later, that would work.

Cheers,
~Nicolas

@SylvainTakerkart
Author

> Related, since @SylvainTakerkart is likely talking about not yet merged extension to BIDS: BEP032

Actually, we'd like to think in a generic manner: we'd like to have an automatic eLab (extra-fields) form generator (in other words, a template experiment) for any given modality supported in BIDS... Somehow answering the request: "I need a form for anatomical MRI", or "I need a form for microscopy"... (at first, we would assume that a given experiment corresponds to a single data modality, but this would need to be generalized soon after...)
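A per-modality generator could be sketched along the following lines. This assumes the schema groups its sidecar rules by modality, with each rule listing fields and a requirement level given either as a plain string or as an object with a `level` key; the rule names, groups and fields below are hand-written placeholders, and the `required` flag on the generated fields is likewise an assumption about what the eLab form accepts. Selector expressions are ignored here, which a real implementation could not do.

```python
# Hypothetical rules, shaped like what is assumed to sit under the schema's
# sidecar rules (names and levels are placeholders, not copied from BIDS).
SAMPLE_SIDECAR_RULES = {
    "mri": {
        "HardwareExample": {
            "selectors": ['modality == "mri"'],
            "fields": {
                "Manufacturer": "recommended",
                "MagneticFieldStrength": {"level": "recommended"},
            },
        },
    },
    "micr": {
        "RequiredExample": {
            "selectors": ['datatype == "micr"'],
            "fields": {"PixelSize": "required", "PixelSizeUnits": "required"},
        },
    },
}


def fields_for_group(rules: dict, group: str) -> dict:
    """Collect {field name: requirement level} over all rules of one group."""
    levels = {}
    for rule in rules.get(group, {}).values():
        for name, spec in rule.get("fields", {}).items():
            # A level is either a bare string or nested under a "level" key.
            levels[name] = spec if isinstance(spec, str) else spec.get("level", "optional")
    return levels


def form_for_group(rules: dict, group: str) -> dict:
    """Build an extra-fields style form for one modality group."""
    return {
        "extra_fields": {
            name: {"type": "text", "value": "", "required": level == "required"}
            for name, level in fields_for_group(rules, group).items()
        }
    }


print(form_for_group(SAMPLE_SIDECAR_RULES, "micr"))
print(form_for_group(SAMPLE_SIDECAR_RULES, "mri"))
```

Supporting an experiment that mixes several modalities would then amount to merging the per-group field dictionaries and keeping the strictest level for each field.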

Development

No branches or pull requests

3 participants