Feat: add bbmap/clumpify#236

Merged
maxulysse merged 27 commits into dev from pr/52
Jun 17, 2026

@maxulysse

Member

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/seqinspector branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@github-actions

github-actions Bot commented Jun 12, 2026

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.0.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@maxulysse maxulysse mentioned this pull request Jun 12, 2026
11 tasks
@maxulysse maxulysse changed the base branch from master to dev June 12, 2026 11:02
@github-actions

github-actions Bot commented Jun 12, 2026

nf-core pipelines lint overall result: Passed ✅

Posted for pipeline commit b4daafd

- ✅ 201 tests passed
- ❔ 7 tests were ignored

Run details

  • nf-core/tools version 4.0.2
  • Run at 2026-06-17 11:47:48

- Add bbmap/clumpify module with local multiqc_files topic patch
- Add nf-test for bbmap/clumpify tool
- Update CHANGELOG, CITATIONS, README, docs, schema, and config
- Add Erkut Ilaslan as contributor
- Configure memory and ext.args for duplication assessment
@maxulysse maxulysse changed the title from "Feat: add module bbmap/clumpify" to "Feat: add bbmap/clumpify" Jun 12, 2026

@SPPearce SPPearce left a comment

What will happen with seqtk sample? Does that sample at random, or take the first N?

Comment thread docs/output.md Outdated

</details>

[BBMap Clumpify](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/clumpify-guide/) removes duplicates from sequencing data and creates smaller, faster gzipped FASTQ files. This is particularly useful for reducing file sizes while maintaining data quality.

Suggested change
- [BBMap Clumpify](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/clumpify-guide/) removes duplicates from sequencing data and creates smaller, faster gzipped FASTQ files. This is particularly useful for reducing file sizes while maintaining data quality.
+ [BBMap Clumpify](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/clumpify-guide/) removes duplicates from sequencing data and creates smaller, faster gzipped FASTQ files. This is particularly useful for reducing file sizes while maintaining data quality. Please note that the resulting files will not be random, so tools that take the first X reads will return a biased sample.
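
The bias mentioned in the suggestion can be illustrated with a toy model. This is only a sketch under stated assumptions: reads are represented as `(region, id)` tuples, the `region` label is a hypothetical stand-in for whatever makes reads similar enough to be clumped together, and the reservoir sampler is a generic one, not a claim about how any particular subsampling tool is implemented.

```python
import random


def make_clumped_reads(n_regions=50, reads_per_region=200):
    """Toy 'clumped' file: reads from the same region are adjacent (assumed model)."""
    return [
        (region, read_id)
        for region in range(n_regions)
        for read_id in range(reads_per_region)
    ]


def first_n(reads, n):
    """Take the first n reads, as a naive 'head'-style subsampler would."""
    return reads[:n]


def reservoir_sample(reads, n, seed=0):
    """Uniform random sample of n reads in one pass (Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, read in enumerate(reads):
        if i < n:
            sample.append(read)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                sample[j] = read
    return sample


def regions_covered(sample):
    """Number of distinct regions represented in a sample."""
    return len({region for region, _ in sample})


reads = make_clumped_reads()
print("first 500 reads cover", regions_covered(first_n(reads, 500)), "regions")
print("random 500 reads cover", regions_covered(reservoir_sample(reads, 500)), "regions")
```

In this toy setup the first 500 reads come from only the first few regions, whereas a uniform random sample of the same size spreads across most of them, which is the concern for any downstream step that reads only the head of a clumped file.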

Member Author

For now, we don't publish the reads from bbmap clumpify; I merely use the tool for duplication assessment. I will update the description.

Member Author

Re-creating FASTQ files via bbmap clumpify was decided to be out of scope for now (cf. #41).
I will re-assess allowing the creation of such files if it could be useful for some users.
It will obviously have to be controlled via parameters.

To be honest, probably more useful in demultiplex

Comment thread workflows/seqinspector.nf Outdated
Comment thread workflows/seqinspector.nf Outdated
@maxulysse maxulysse requested a review from SPPearce June 16, 2026 13:39
Comment thread README.md Outdated

@SPPearce SPPearce left a comment

Ok, if you aren't actually using the fastqs, I'm not worried.

- Fix invalid parameter deduped=true to dedupe=true in test
- Add clumped FASTQ output file to docs/output.md
- Document bbmap_clumpify_args and save_bbmap_clumpify_reads usage
- Add custom tool arguments section to docs/usage.md
@maxulysse

Member Author

So I agree with Simon's assessment that bbmap clumpify might be slightly misplaced here in seqinspector, and might be a better fit in demultiplex.
That being said, I do think the current implementation, which by default works for deduplication assessment and only produces clumpified reads when the right params are set, is a good in-between solution.
It could also be useful for users who are not using demultiplex but still want some basic QC and FASTQ file size reduction.
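
To make "deduplication assessment" concrete, here is a minimal sketch of what such a metric can look like. It is an assumption-laden simplification, not what the module in this PR or Clumpify itself computes: it counts only exact sequence duplicates and assumes plain 4-line FASTQ records. The helper names `read_sequences` and `duplication_rate` are hypothetical.

```python
import gzip
from collections import Counter


def read_sequences(path):
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def duplication_rate(sequences):
    """Fraction of reads that are exact copies of an earlier read."""
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - len(counts)) / total


# Usage on an in-memory example: one of the four reads is a duplicate.
print(duplication_rate(["ACGT", "ACGT", "TTTT", "GGGG"]))
```

For a file on disk, `duplication_rate(read_sequences("sample.fastq.gz"))` would give the same metric, where the file name is a placeholder.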

@maxulysse maxulysse merged commit 9ced68d into dev Jun 17, 2026
21 checks passed
@maxulysse maxulysse deleted the pr/52 branch June 17, 2026 12:26

Development

Successfully merging this pull request may close these issues.

add clumpify to seqinspector

3 participants