bioconvert genbank2gff3: erroneous tRNA parent if no gene feature precedes the tRNA feature #333

Open
tuspjo opened this issue Apr 4, 2023 · 0 comments

Hi,
Thank you for making and maintaining this very useful tool.

Working with GenBank output from antiSMASH 6.1.1, bioconvert genbank2gff3 fails when a tRNA feature is the first feature in the genome. The same genome taken directly from PGAP does not fail, because there the gene feature is listed before the tRNA feature. This led me to look at the files that do not fail, and it turns out that genbank2gff3 silently reports the wrong parent locus_tag for a tRNA when the feature order is unexpected (for example, tRNA before gene).
I believe the GenBank format does not mandate a feature order, so relying on the gene feature being found first could be a problem for the GFF3 conversion in general, not only for tRNA features and not only for antiSMASH output.
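
A minimal sketch of how order-dependent parent assignment could produce this symptom. This is a hypothetical illustration, not bioconvert's actual code: the function name and the (feature type, locus_tag) tuples are made up for the example, and the preceding OIE51_01170 gene is assumed from the Parent value in the reported output.

```python
# Hypothetical illustration only; this is NOT taken from the bioconvert source.
def assign_parents_by_order(features):
    """Give each non-gene feature the locus_tag of the most recent gene seen."""
    last_gene = None
    rows = []
    for ftype, locus_tag in features:
        if ftype == "gene":
            last_gene = locus_tag
            rows.append((ftype, locus_tag, None))  # genes have no parent
        else:
            # Order-dependent step: uses whichever gene came before,
            # regardless of the feature's own locus_tag.
            rows.append((ftype, locus_tag, last_gene))
    return rows


# tRNA listed before its own gene, as in the antiSMASH output.
# The first gene (OIE51_01170) is assumed from the reported Parent value.
features = [
    ("gene", "OIE51_01170"),
    ("tRNA", "OIE51_01175"),
    ("gene", "OIE51_01175"),
]
rows = assign_parents_by_order(features)
for row in rows:
    print(row)
```

Under this assumption the tRNA inherits the locus_tag of the previous gene, and a tRNA that is the very first feature has no previous gene at all, which could be why that case fails outright.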

Version of bioconvert

1.0.0.post0

Command or code used

bioconvert genbank2gff3 input.gbk test.gff3 --force

genbank input

 tRNA            243644..243717
                 /anticodon=(pos:243678..243680,aa:Pro,seq:ggg)
                 /inference="COORDINATES: profile:tRNAscan-SE:2.0.9"
                 /locus_tag="OIE51_01175"
                 /note="Derived by automated computational analysis using
                 gene prediction method: tRNAscan-SE."
                 /product="tRNA-Pro"
 gene            243644..243717
                 /locus_tag="OIE51_01175"

Expected behaviour

CP109073 GenBank tRNA 243644 243717 . + . ID=OIE51_01175.tRNA.1;Parent=OIE51_01175;anticodon=tRNA-Pro
CP109073 GenBank gene 243644 243717 . + . ID=OIE51_01175;locus_tag=OIE51_01175

Your results

CP109073 GenBank tRNA 243644 243717 . + . ID=OIE51_01175.tRNA.1;Parent=OIE51_01170;anticodon=tRNA-Pro
CP109073 GenBank gene 243644 243717 . + . ID=OIE51_01175;locus_tag=OIE51_01175

What you think might have happened

Because it relies on the preceding gene feature rather than on locus_tag or a similar qualifier, genbank2gff3 picks up the previous (and wrong) locus tag when the gene feature is not found before the other feature.
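
One possible order-independent approach, sketched under the assumption that a child feature shares its locus_tag with its gene, as in the input above. The function name and the (feature type, locus_tag) tuples are hypothetical and are not part of bioconvert.

```python
# Hypothetical sketch; names and data layout are assumptions for illustration.
def assign_parents_by_locus_tag(features):
    """Resolve each non-gene feature's parent from its own locus_tag."""
    # First pass: collect the locus_tags of all gene features,
    # wherever they appear in the record.
    gene_tags = {tag for ftype, tag in features if ftype == "gene"}

    # Second pass: a non-gene feature gets a parent only if a gene
    # with the same locus_tag exists.
    rows = []
    for ftype, tag in features:
        if ftype == "gene":
            parent = None
        else:
            parent = tag if tag in gene_tags else None
        rows.append((ftype, tag, parent))
    return rows


# Same feature order as in the report: tRNA before its gene.
resolved = assign_parents_by_locus_tag([
    ("tRNA", "OIE51_01175"),
    ("gene", "OIE51_01175"),
])
for row in resolved:
    print(row)
```

Collecting the gene locus_tags in a first pass means the result does not depend on whether the gene is listed before or after the tRNA.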
