Long reads and fixing of mate-pair issues/marking duplicates with samtools
6 months ago
Zeng Hao ▴ 40

Hi everyone,

I am trying to call structural variants (using svim) in my PacBio long-read sequencing dataset. However, I get a vastly different number of variants (100,000 vs 1,000) depending on whether I use a BAM alignment (from ngmlr) converted directly with samtools or one that I first processed with samtools fixmate and samtools markdup (far fewer in the latter). (Workflow: https://www.htslib.org/workflow/fastq.html)

Is this normal? And are these steps necessary for this specific use case (SV calling)? (Frankly, I do not quite understand what impact samtools fixmate or samtools markdup might have.)

Thank you very much for your help.

[Edited for clarity]

Best regards,

ZH

mate-pair samtools alignment • 412 views
6 months ago
aw7 ▴ 340

I do not think samtools fixmate and samtools markdup are going to work on PacBio long reads. fixmate sets and repairs mate information for read pairs, which (as far as I know) PacBio reads do not have. markdup might do something useful with the right settings, but I do not know whether anyone has ever tried it properly.

If they are not helping you then I would not use them.
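
If you want to check what those steps actually did to your file, one option is to compare the SAM flags before and after. Below is a minimal sketch in plain Python (no pysam) that tallies a few flag bits from SAM text lines. The bit values come from the SAM specification; the function name `summarize_flags` and the counter keys are just my own illustrative choices, not part of samtools or svim.

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
PAIRED = 0x1          # read is paired in sequencing
UNMAPPED = 0x4        # read is unmapped
SECONDARY = 0x100     # secondary alignment
DUPLICATE = 0x400     # PCR or optical duplicate
SUPPLEMENTARY = 0x800 # supplementary (split) alignment


def summarize_flags(sam_lines):
    """Count selected flag bits over an iterable of SAM text lines.

    Hypothetical helper for illustration: header lines (starting
    with '@') and blank lines are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        counts["records"] += 1
        if flag & PAIRED:
            counts["paired"] += 1
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        if flag & SECONDARY:
            counts["secondary"] += 1
        if flag & DUPLICATE:
            counts["duplicate"] += 1
        if flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
    return counts
```

You could feed it the text output of `samtools view -h` for each BAM (for example via `sys.stdin`) and compare the two tallies. For single-molecule long reads I would expect the "paired" count to be zero, which is why fixmate has nothing to repair. If the "duplicate" count is large in the processed file, that may be related to the drop in calls, but I have not verified how svim treats duplicate-flagged reads. `samtools flagstat` reports similar totals if you prefer not to script it.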
