String identifier for user bins #356

Open
pirovc opened this issue Aug 3, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

pirovc commented Aug 3, 2023

I suggest adding an option to tag user bins with a name/identifier. That would make integration with other tools and downstream analysis easier. Currently, AFAIK, the file passed to raptor prepare --input accepts lines of the form:

filename
or
filename1 <space> filename2 <space> filename3

The first filename of each line is used as the "identifier" of the bin. I would suggest adding two new modes:

filename <tab> identifier
and
filename1 <space> filename2 <space> filename3 <tab> identifier

Here, instead of the first filename, the last (tab-separated) column is used as the identifier for each bin.

Example: building the HIBF at species level (each species is made up of several files). Here the species taxid would be used as the identifier. When I run the search, I get the species directly in the output.
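
To make the proposed format concrete, here is a minimal sketch of how a single input line could be interpreted. It is not Raptor code: `user_bin` and `parse_input_line` are hypothetical names, and the sketch assumes C++17, filenames without whitespace, and that the identifier is whatever follows the last tab on the line.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one user bin (not an existing Raptor type).
struct user_bin
{
    std::vector<std::string> filenames{};
    std::string identifier{};
};

// Hypothetical helper: parses one line of the input file.
// "f1 f2 f3<tab>id" -> filenames {f1, f2, f3}, identifier "id"
// "f1 f2 f3"        -> filenames {f1, f2, f3}, identifier "f1" (current behaviour)
user_bin parse_input_line(std::string const & line)
{
    user_bin bin{};
    std::string file_part = line;

    // If a tab is present, everything after the last tab is the identifier.
    if (auto const tab_pos = line.rfind('\t'); tab_pos != std::string::npos)
    {
        file_part = line.substr(0, tab_pos);
        bin.identifier = line.substr(tab_pos + 1);
    }

    // Filenames are separated by whitespace (assumes no spaces inside a filename).
    std::istringstream stream{file_part};
    for (std::string filename; stream >> filename;)
        bin.filenames.push_back(filename);

    // No identifier column: fall back to the first filename.
    if (bin.identifier.empty() && !bin.filenames.empty())
        bin.identifier = bin.filenames.front();

    return bin;
}
```

With this sketch, a line such as "a.fa b.fa<tab>9606" would yield a bin with two files and the identifier 9606, while "a.fa b.fa" would keep a.fa as the identifier, so existing input files would continue to work unchanged.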

pirovc added the enhancement label on Aug 3, 2023
Member

eseiler commented Aug 28, 2023

Idea:

filename struct

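#include <string>
#include <vector>

// One user bin: its input files plus a user-supplied identifier.
struct files
{
    std::vector<std::string> filenames{};
    std::string identifier{};
};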

Index stores std::vector<files>

Identifier with tab in output minimiser.list
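
A rough sketch of how the identifier could be written out, assuming one line per user bin with space-separated filenames, a tab, and then the identifier. The actual layout of minimiser.list is not specified in this thread, so the format and the function name `format_bin_list` are assumptions for illustration only.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Same shape as the struct proposed above.
struct files
{
    std::vector<std::string> filenames{};
    std::string identifier{};
};

// Hypothetical helper: renders one line per user bin as
// "file1 file2 ...<tab>identifier". The real minimiser.list layout may differ.
std::string format_bin_list(std::vector<files> const & bins)
{
    std::ostringstream out{};

    for (files const & bin : bins)
    {
        // Filenames are joined with single spaces.
        for (std::size_t i = 0; i < bin.filenames.size(); ++i)
            out << (i == 0 ? "" : " ") << bin.filenames[i];

        // The identifier goes in the last, tab-separated column.
        out << '\t' << bin.identifier << '\n';
    }

    return out.str();
}
```

Writing the identifier as the last tab-separated column mirrors the proposed input format, so the same list could be read back with the same parsing rules.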
