Fastp


2.9 years ago

nishimalhotra2612 ▴ 50

Does fastp remove duplicated reads? I have been trying to remove duplicate reads with -D or --dedup, but it is not working. Can someone tell me whether this option is still available?


ADD COMMENT • link updated 17 days ago by GenoMax 150k • written 2.9 years ago by nishimalhotra2612 ▴ 50


"but it's not working"

That does not tell us in what way it is not working. How did you determine that?

As long as you are using v0.22 or above, that feature should be available: https://github.com/OpenGene/fastp#deduplication

ADD REPLY • link 2.9 years ago by GenoMax 150k


I used this command:

fastp -i SRR13684098_1.fastq.gz -I SRR13684098_2.fastq.gz -o 098_1.fastq.gz -O 098_2.fastq.gz -D

It reports: undefined short option: -D

As far as I know, this is the usual way to enable a feature.

Sorry, I may be wrong.

ADD REPLY • link 2.9 years ago by nishimalhotra2612 ▴ 50


What is your version of fastp?

Can you see the -D option in the output of fastp --help?

ADD REPLY • link 2.9 years ago by Pierre Lindenbaum 166k
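Both questions can be checked in one go. The following is only a sketch: `version_ge` is a hypothetical helper name, the 0.22.0 threshold is taken from the reply above, and it assumes a `sort` that supports `-V` (GNU coreutils). It also prints which fastp binary is first on the PATH, since an older copy shadowing a newer install is one possible (unconfirmed) reason for a missing option.

```shell
#!/usr/bin/env bash
# Return success if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# First release with deduplication, according to the thread (assumption).
min_version="0.22.0"

if command -v fastp >/dev/null 2>&1; then
    # Show which binary is actually being run.
    echo "using: $(command -v fastp)"
    # The version line looks like "fastp 0.23.2"; merge stderr in case
    # it is printed there.
    installed="$(fastp --version 2>&1 | awk '{print $2}')"
    if version_ge "$installed" "$min_version"; then
        echo "fastp $installed should support --dedup"
    else
        echo "fastp $installed predates $min_version; upgrade to get --dedup"
    fi
    # Confirm that this build really lists the option.
    fastp --help 2>&1 | grep -i -- 'dedup' || echo "no dedup option listed"
else
    echo "fastp not found on PATH"
fi
```

If the help output lists the option but `-D` is still rejected, trying the long form `--dedup` is a cheap next step.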


fastp 0.23.2

No, I can't see it. I think it's the latest version, and the GitHub page even lists an option for removing duplicates.

ADD REPLY • link 2.9 years ago by nishimalhotra2612 ▴ 50


Use clumpify.sh then. See "Introducing Clumpify: Create 30% Smaller, Faster Gzipped Fastq Files. And remove duplicates."

ADD REPLY • link 2.9 years ago by GenoMax 150k
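For reference, a deduplication run with clumpify.sh (part of BBTools) might look like the sketch below. The wrapper name `run_clumpify_dedupe` and the output file names are hypothetical; `subs=0` restricts removal to exact duplicates and is a choice made for this example, not something stated in the thread. When BBTools or the input files are missing, the function only prints the command it would run.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around clumpify.sh for one paired-end sample.
# $1, $2: input R1 and R2; $3, $4: output R1 and R2.
run_clumpify_dedupe() {
    if command -v clumpify.sh >/dev/null 2>&1 && [ -f "$1" ] && [ -f "$2" ]; then
        # dedupe=t removes duplicate pairs; subs=0 allows no mismatches.
        clumpify.sh in1="$1" in2="$2" out1="$3" out2="$4" dedupe=t subs=0
    else
        # Dry run: show the command instead of executing it.
        echo "clumpify.sh in1=$1 in2=$2 out1=$3 out2=$4 dedupe=t subs=0"
    fi
}

run_clumpify_dedupe SRR13684098_1.fastq.gz SRR13684098_2.fastq.gz \
    098_1.dedup.fastq.gz 098_2.dedup.fastq.gz
```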


Hey, thanks for the suggestion, but I figured it out: since this is RNA-seq data, there is no need to remove duplicates, as I might lose some important data.

ADD REPLY • link 2.9 years ago by nishimalhotra2612 ▴ 50


Does clumpify.sh remove duplicates based on UMIs, i.e. keep a consensus or the best sequence from a group of sequences with the same UMI?

ADD REPLY • link 17 days ago by Lhl ▴ 760


No. clumpify.sh only uses the read sequence. If your UMI is still part of that sequence, then it will dedupe those reads, but you should not count on it using the UMI specifically.

If you want to use UMIs specifically, then try umi-tools or fastp.

ADD REPLY • link 17 days ago by GenoMax 150k
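As a rough illustration of the umi-tools route, the sketch below prints the two UMI-tools commands for one paired-end sample. Everything specific here is an assumption: the helper name `umi_dedup_steps`, the `sample` file prefix, and an 8-base UMI at the start of read 1 (`NNNNNNNN`). The alignment, sorting and indexing that must happen between the two steps are not shown.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a two-step UMI-tools workflow for one sample.
# $1: sample prefix, $2: UMI pattern (one N per UMI base on read 1).
umi_dedup_steps() {
    sample="$1"
    pattern="$2"
    # Step 1: move the UMI from the read sequence into the read name.
    echo "umi_tools extract --bc-pattern=${pattern} --stdin=${sample}_1.fastq.gz --stdout=${sample}_1.umi.fastq.gz --read2-in=${sample}_2.fastq.gz --read2-out=${sample}_2.umi.fastq.gz"
    # Step 2 (on the aligned, sorted, indexed BAM): collapse reads that
    # share both mapping position and UMI.
    echo "umi_tools dedup --paired --stdin=${sample}.sorted.bam --stdout=${sample}.dedup.bam"
}

umi_dedup_steps sample NNNNNNNN
```

Unlike sequence-only deduplication, the second step works on aligned reads, so two reads with the same sequence but different UMIs are both kept.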
