Nanopore sequencing - fastp dedup option
3 days ago
HarperReed • 0

Hello everyone, should I use the dedup option when filtering my reads with fastp on FASTQ files generated by Nanopore sequencing?

-D, --dedup enable deduplication to drop the duplicated reads/pairs

Thank you in advance.

fastp nanopore • 287 views

Do you expect sequence duplication because of the type of data (are these amplicon reads)? Depending on the length of your reads, deduplication could need significant memory and compute resources. Finally, why do you want to do this?
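One quick way to gauge whether duplication is present at all is to count reads whose sequence is identical to one already seen. The following is a minimal Python sketch, assuming a standard four-line-per-record FASTQ (plain or gzipped); the function name `count_exact_duplicates` is hypothetical, and this only counts exact matches, so it is not a reimplementation of how fastp detects duplicates.

```python
import gzip
from collections import Counter


def count_exact_duplicates(fastq_path):
    """Return (total_reads, duplicate_reads) for a 4-line-per-record FASTQ.

    A read counts as a duplicate if an identical sequence was already seen.
    This is an illustrative check, not fastp's own algorithm.
    """
    # Assumption: files ending in .gz are gzip-compressed, everything else is plain text.
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    seen = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                seen[line.strip().upper()] += 1
    total = sum(seen.values())
    duplicates = total - len(seen)  # reads beyond the first copy of each sequence
    return total, duplicates
```

Calling `count_exact_duplicates("reads.fastq.gz")` (a placeholder file name) gives the total read count and the number of exact repeats. It keeps every distinct sequence in memory, so long reads and large files will need correspondingly more RAM.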


It's amplicon sequencing, with amplicons of 1.5 to 3 kb. The primary aim is to detect variants.


You may want to try a workflow for this purpose provided by Nanopore: https://github.com/epi2me-labs/wf-amplicon


Thank you!


But in my case, should I use this option or not? I don't really have time to test this new tool, especially since I'm required to use specific tools.


> I'm required to use specific tools

If the tools you are planning to use include a step to mark duplicates after alignment, then you don't need to do this upfront.

As I said above, depending on the length of your amplicons (and the amount of data), fastp may need significant compute resources to do the deduplication. You could give it a try and see if it works with nanopore data.
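To try it, you could run fastp once without and once with deduplication and compare the outputs. Here is a minimal Python sketch, assuming fastp is installed and on your PATH and that the data are single-end; the helper names `build_fastp_command` and `run_fastp` and the file names are placeholders, and every other fastp setting is left at its default, which may or may not suit nanopore reads.

```python
import subprocess


def build_fastp_command(fastq_in, fastq_out, dedup=False):
    """Build a single-end fastp command line as a list of arguments."""
    command = ["fastp", "-i", str(fastq_in), "-o", str(fastq_out)]
    if dedup:
        command.append("--dedup")  # long form of -D
    return command


def run_fastp(fastq_in, fastq_out, dedup=False):
    """Run fastp; raises CalledProcessError if fastp exits with an error."""
    subprocess.run(build_fastp_command(fastq_in, fastq_out, dedup), check=True)
```

For example, `run_fastp("amplicons.fastq", "filtered.fastq")` followed by `run_fastp("amplicons.fastq", "filtered_dedup.fastq", dedup=True)` produces two outputs whose read counts can be compared, while you watch the memory use of the second run.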

BTW, there is fastplong, a version of fastp meant for long-read data, but it does not have the -D option implemented as of now (https://github.com/OpenGene/fastplong?tab=readme-ov-file#all-options).
