unikmer diff & uniqs: Ns #28

Open
sihellem opened this issue Nov 11, 2023 · 0 comments

sihellem commented Nov 11, 2023

Amazing tool! It is exactly what I was looking for, and it runs smoothly. (Is the tool published? I could not find much online.)

I am using unikmer as outlined in this workflow: https://www.biostars.org/p/475263/
I run it on pairs of intraspecific FASTQ libraries (Illumina PE150) to extract the reads bearing k-mers unique to either library.

I have a few questions:

  1. How does unikmer treat Ns? I compared one library to its exact reverse complement, and unikmer diff reported many unique k-mers for both files, where I expected 0. Could this be due to Ns? How does unikmer count handle them?
    EDIT: removing reads containing Ns before running unikmer count yields 0 unique k-mers between a library and its reverse complement.

  2. When extracting reads, does unikmer uniqs extract sequences bearing ONLY k-mers unique to one library? Playing with the -m argument, I could see that when a read holds two unique k-mers for one library, the read is split into two separate sequences. A single read could conceivably bear one k-mer unique to indiv_1 and another unique to indiv_2.
    So does unikmer uniqs really output reads bearing ONLY k-mers unique to indiv_1 (and none unique to indiv_2)?
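
To make both questions concrete, here is a minimal Python sketch of the behaviour I was expecting. It is not unikmer's code and says nothing about how unikmer works internally. It assumes that k-mers are stored in canonical form (the lexicographically smaller of a k-mer and its reverse complement) and that any k-mer containing a non-ACGT base such as N is skipped. The function names `canonical_kmers` and `classify_read` are made up for illustration.

```python
# Illustrative sketch only: NOT unikmer's implementation.
# Assumptions (mine, not confirmed by the tool's docs):
#   - k-mers are compared in canonical form
#   - k-mers containing any base other than A, C, G, T are skipped

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
VALID_BASES = set("ACGT")


def revcomp(seq):
    """Return the reverse complement of seq (N stays N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers in seq, skipping k-mers with Ns."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - VALID_BASES:
            continue  # skip k-mers containing N or other ambiguous bases
        kmers.add(min(kmer, revcomp(kmer)))
    return kmers


def classify_read(read, k, uniq_1, uniq_2):
    """Say which set(s) of unique canonical k-mers a read overlaps.

    uniq_1 and uniq_2 are hypothetical sets of canonical k-mers unique to
    indiv_1 and indiv_2 respectively.
    """
    kmers = canonical_kmers(read, k)
    has_1 = bool(kmers & uniq_1)
    has_2 = bool(kmers & uniq_2)
    if has_1 and has_2:
        return "both"
    if has_1:
        return "only_1"
    if has_2:
        return "only_2"
    return "neither"


if __name__ == "__main__":
    read = "ACGTNACGGT"
    # Question 1: a read and its reverse complement should share all k-mers.
    print(canonical_kmers(read, 4) == canonical_kmers(revcomp(read), 4))
    # Question 2: a read can carry k-mers unique to both individuals.
    print(classify_read("ACGGTTTT", 4, {"ACGG"}, {"AAAA"}))
```

Under these assumptions a library and its reverse complement always give identical k-mer sets, which is why the uniques I saw before removing N-containing reads surprised me. The second function shows the case I am worried about in question 2: a read classified as "both" carries k-mers unique to each individual, and I would like to know whether unikmer uniqs keeps or drops such reads.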
