Warning: A newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.0.2. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
- Add bbmap/clumpify module with local multiqc_files topic patch
- Add nf-test for bbmap/clumpify tool
- Update CHANGELOG, CITATIONS, README, docs, schema, and config
- Add Erkut Ilaslan as contributor
- Configure memory and ext.args for duplication assessment
SPPearce left a comment
What will happen with seqtk sample? Does that sample at random, or take the first N?
[BBMap Clumpify](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/clumpify-guide/) removes duplicates from sequencing data and creates smaller, faster gzipped FASTQ files. This is particularly useful for reducing file sizes while maintaining data quality.
Suggested change:
[BBMap Clumpify](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/clumpify-guide/) removes duplicates from sequencing data and creates smaller, faster gzipped FASTQ files. This is particularly useful for reducing file sizes while maintaining data quality. Please note that the resulting files will not be random, so tools that take the first X reads will return a biased sample.
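To illustrate the concern about taking the first X reads from a reordered file, here is a minimal Python sketch. It rests on loud assumptions: the toy reads are made up, sorting by sequence is only a crude stand-in for how Clumpify groups similar reads, and the reservoir sampler is a generic one-pass random sampler, not seqtk's actual implementation. The names `take_first` and `reservoir_sample` are hypothetical helpers.

```python
import random


def take_first(reads, n):
    """Head-style subsample: the first n reads in file order."""
    return reads[:n]


def reservoir_sample(reads, n, seed=11):
    """Uniform random subsample of n reads in one pass (Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, read in enumerate(reads):
        if i < n:
            sample.append(read)
        else:
            # Keep the new read with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                sample[j] = read
    return sample


# Toy data (hypothetical): two equally abundant read populations.
reads = ["AAAAAAAA"] * 500 + ["TTTTTTTT"] * 500
random.Random(0).shuffle(reads)

# Assumption: sorting by sequence mimics a file where similar reads
# have been placed next to each other.
clumped = sorted(reads)


def fraction_a(sample):
    """Share of reads in the sample that come from the A population."""
    return sum(r.startswith("A") for r in sample) / len(sample)


head_fraction = fraction_a(take_first(clumped, 100))
random_fraction = fraction_a(reservoir_sample(clumped, 100))

print(f"first 100 reads:  {head_fraction:.2f} from population A")
print(f"random 100 reads: {random_fraction:.2f} from population A")
```

In this toy setup the first 100 reads of the reordered file all come from one population, while the random subsample stays close to the true 50/50 mix. That is why a tool that samples at random is unaffected by the reordering, whereas one that takes the first N reads is not.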
So for now, we don't publish the reads from bbmap clumpify; I merely use the tool for duplication assessment. I will update the description.
Re-creating FASTQ files via bbmap clumpify was decided to be out of scope for now.
cf #41
I will re-assess allowing the creation of such files if it could be useful for some users.
It will have to be controlled via parameters, obviously.
To be honest, probably more useful in demultiplex
Co-authored-by: Simon Pearce <24893913+SPPearce@users.noreply.github.com>
Co-authored-by: Maxime U Garcia <max.u.garcia@gmail.com>
SPPearce left a comment
Ok, if you aren't actually using the fastqs, I'm not worried.
- Fix invalid parameter deduped=true to dedupe=true in test
- Add clumped FASTQ output file to docs/output.md
- Document bbmap_clumpify_args and save_bbmap_clumpify_reads usage
- Add custom tool arguments section to docs/usage.md
So I agree with Simon's assessment that bbmap clumpify might be slightly misplaced here in seqinspector, and might be a better fit in demultiplex.
PR checklist
- nf-core pipelines lint
- nextflow run . -profile test,docker --outdir <OUTDIR>
- nextflow run . -profile debug,test,docker --outdir <OUTDIR>
- docs/usage.md is updated.
- docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).