remux_plus #49

Open
dpark01 opened this issue Apr 24, 2020 · 1 comment
Comments

@dpark01
Member

dpark01 commented Apr 24, 2020

Thinking of adding a new WDL workflow called remux_plus.

Inputs:

  • old samplesheet (originally used)
  • new samplesheet (as it ought to be)
  • Array[File] raw_bams
  • Array[File] cleaned_bams

The name may be a bit of a misnomer: the goal would not be to actually call demux again, but instead to start with one task that takes the two samplesheets and turns them into a 3-column tab-delimited file for use with read_utils.reheader_bams. This file maps old sample names, library names, and filenames to new ones. The workflow would then scatter invocations of reheader_bams over each raw_bam and cleaned_bam, and then maybe re-run fastqc, multiqc, and spike-in counts on all of them. It would probably skip the other steps (like spades and kraken), though we could run them optionally.
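A rough sketch of what that first task might do, in Python. Everything here is an assumption for illustration: the samplesheet column names (`sample`, `library`), pairing old and new rows by position, and a mapping layout of field tag, old value, new value (using the SAM read-group tags `SM` and `LB`). The actual input format expected by reheader_bams should be checked before relying on this, and filename mapping is left out.

```python
import csv
import io


def read_samplesheet(text):
    """Parse a tab-delimited samplesheet with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_rename_map(old_rows, new_rows):
    """Pair rows by position (an assumption) and emit (field, old, new) per change."""
    if len(old_rows) != len(new_rows):
        raise ValueError("samplesheets must list the same number of rows")
    mapping = []
    for old, new in zip(old_rows, new_rows):
        # Hypothetical column names; SM and LB are SAM read-group tags.
        for field, column in (("SM", "sample"), ("LB", "library")):
            entry = (field, old[column], new[column])
            if old[column] != new[column] and entry not in mapping:
                mapping.append(entry)
    return mapping


def format_map(mapping):
    """Render the mapping as a 3-column tab-delimited file body."""
    return "".join("\t".join(entry) + "\n" for entry in mapping)


old_sheet = "sample\tlibrary\nS1\tS1.l1\nS2\tS2.l1\n"
new_sheet = "sample\tlibrary\nA1\tA1.l1\nS2\tS2.l1\n"
rename_map = build_rename_map(read_samplesheet(old_sheet), read_samplesheet(new_sheet))
print(format_map(rename_map), end="")
```

Unchanged samples produce no rows, so a samplesheet edit that touches only one sample yields a short mapping file.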

The goal would be to reproduce most outputs of demux_plus without actually re-running it, when all that is desired is renaming samples and libraries and files based on a new samplesheet. This would obviously not suffice for actual changes to barcodes or read structures.

@tomkinsc
Member

Maybe it could consider the delta and actually re-demux any samples that do have barcode differences (perhaps discarding unmatched reads to save resources), in addition to re-headering any samples where the barcodes remain the same.
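One way the delta could be computed, sketched in Python under stated assumptions: old and new samplesheet rows are paired by position, and the column names (`sample`, `library`, `barcode_1`, `barcode_2`) are hypothetical stand-ins for whatever the real samplesheets use. Rows whose barcodes differ are marked for re-demux, rows with only name changes for re-headering.

```python
def classify_rows(old_rows, new_rows,
                  barcode_columns=("barcode_1", "barcode_2"),
                  name_columns=("sample", "library")):
    """Split paired samplesheet rows into redemux / reheader / unchanged indices."""
    if len(old_rows) != len(new_rows):
        raise ValueError("samplesheets must list the same number of rows")
    plan = {"redemux": [], "reheader": [], "unchanged": []}
    for index, (old, new) in enumerate(zip(old_rows, new_rows)):
        if any(old.get(c) != new.get(c) for c in barcode_columns):
            # A barcode change cannot be fixed by editing headers alone.
            plan["redemux"].append(index)
        elif any(old.get(c) != new.get(c) for c in name_columns):
            plan["reheader"].append(index)
        else:
            plan["unchanged"].append(index)
    return plan


old_rows = [
    {"sample": "S1", "library": "S1.l1", "barcode_1": "AAAA", "barcode_2": "CCCC"},
    {"sample": "S2", "library": "S2.l1", "barcode_1": "GGGG", "barcode_2": "TTTT"},
    {"sample": "S3", "library": "S3.l1", "barcode_1": "ACGT", "barcode_2": "TGCA"},
]
new_rows = [
    {"sample": "A1", "library": "A1.l1", "barcode_1": "AAAA", "barcode_2": "CCCC"},
    {"sample": "S2", "library": "S2.l1", "barcode_1": "GGGA", "barcode_2": "TTTT"},
    {"sample": "S3", "library": "S3.l1", "barcode_1": "ACGT", "barcode_2": "TGCA"},
]
plan = classify_rows(old_rows, new_rows)
print(plan)
```

A row with both a barcode change and a name change lands in `redemux` only, since a fresh demux would pick up the new names anyway.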
