3 months ago
静
Hi all. I have ONT cDNA PCR read mapping results in BAM format. I used Picard to remove PCR duplicate reads:
I found that far too many reads were marked as duplicates. I don't know whether Picard works well for ONT data. Does anyone have experience with other tools?
I don't have much experience with ONT cDNA data processing, but is this really that surprising? If the wet-lab steps went well, you should have full-length transcripts sequenced at a reasonable depth. Most transcripts can be spanned by a single ONT read, so you should expect many reads that share the same start and end coordinates, and a position-based tool will flag those as duplicates. This is especially true for short transcripts and if the system doesn't have much variation.
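To illustrate the point, here is a minimal Python sketch of how grouping reads by reference, strand, and 5' position alone makes full-length reads of the same transcript look like duplicates. It is a simplification and not Picard's actual implementation: it ignores soft clipping, and the function names are hypothetical. It reads plain SAM text, for example the output of `samtools view`.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def five_prime_key(sam_line):
    """Return (reference, strand, 5' position) for a primary mapped read, else None."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
    if flag & (0x4 | 0x100 | 0x800):
        return None
    if not flag & 0x10:
        # Forward strand: the 5' end is the leftmost aligned position.
        return (rname, "+", pos)
    # Reverse strand: the 5' end is the rightmost aligned reference position.
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")
    return (rname, "-", pos + ref_len - 1)

def positional_duplicate_fraction(sam_lines):
    """Fraction of reads sharing a 5' key with an earlier read (simplified)."""
    keys = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        key = five_prime_key(line)
        if key is not None:
            keys[key] += 1
    total = sum(keys.values())
    if total == 0:
        return 0.0
    return (total - len(keys)) / total
```

If most reads start at the annotated transcript start, this fraction will be high even when every read comes from a distinct molecule.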
That said, I would check the read distribution across the transcriptome. If you have coverage across most transcripts, the duplication could be real. I'm more alarmed that you managed to get a 100% mapping rate. No artefacts or unannotated data?
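One rough way to look at that distribution is to count primary alignments per reference sequence and see how much of the data the top few references account for. The sketch below assumes the reads were aligned to a transcriptome, so each reference name is a transcript; with a genome alignment the names would be chromosomes and the numbers would mean something different. The function names are hypothetical, and the input is plain SAM text such as the output of `samtools view`.

```python
from collections import Counter

def reads_per_reference(sam_lines):
    """Count primary mapped reads per reference name from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.split("\t")
        flag, rname = int(fields[1]), fields[2]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & (0x4 | 0x100 | 0x800):
            continue
        counts[rname] += 1
    return counts

def top_share(counts, n=10):
    """Fraction of all counted reads that fall on the n most covered references."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total
```

A very high `top_share` would suggest that a handful of abundant transcripts dominate the library, which is one situation where many apparent duplicates could be genuine.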
The 100% mapping rate is because the BAM file was filtered with samtools before I ran Picard on it.
Hello,
I would suggest trying barcode-specific adapter trimming with Porechop. This can sometimes reduce duplication caused by technical artifacts.