SAM unique position
8.6 years ago
cfarmeri ▴ 210

Hi, Biostars.

I have a SAM file, mapped with bowtie2, that contains some PCR duplicates.

The original fastq reads behind these PCR duplicates do not all have the same sequence length,

so I cannot use the "samtools rmdup" command.

I would like to extract one SAM line per unique mapping position, keeping all fields of the original SAM record. Does anyone have a solution? Thanks.

----Example---

name1 0 chr1 124344 . . . . . AGTAGGTGGGG FFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0

name2 0 chr1 124344 . . . . . AGTAGGTGGGGGATT FFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0

name1 0 chr1 124344 . . . . . AGTAGGTGGGG FFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0

8.6 years ago

The original fastq reads of these PCR duplicates are not the same sequence length, so I cannot use the "samtools rmdup" command.

Are you sure you cannot use samtools rmdup (or, even better, picard MarkDuplicates)? With single-end reads these tools look only at the start position of a read to call it a duplicate, so differing read lengths should not matter. (In fact I wrote my own program, MarkDupByStartEnd, to look at both the start and the end position.)
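
If you just want to see the idea of "one record per start position" in code, here is a minimal Python sketch. It is not how samtools or picard are implemented; it only assumes that a duplicate means same reference name, same leftmost POS and same strand, and it keeps the first record it sees. The function name `unique_by_start` is made up for this example. Real tools also account for soft clipping and use the 5' end of reverse-strand reads, which this sketch does not.

```python
import sys


def unique_by_start(lines):
    """Yield SAM header lines plus the first alignment seen at each
    (reference name, leftmost position, strand).

    Simplifying assumption: the key uses the POS field as-is, so soft
    clipping and the true 5' end of reverse-strand reads are ignored.
    """
    seen = set()
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Not a valid alignment record (e.g. a blank line); skip it.
            continue
        flag = int(fields[1])
        if flag & 0x4:
            # Unmapped reads have no position to compare; keep them all.
            yield line
            continue
        key = (fields[2], int(fields[3]), bool(flag & 0x10))
        if key not in seen:
            seen.add(key)
            yield line  # the full original record, all fields intact


# Possible use as a filter over a SAM stream:
#   sys.stdout.writelines(unique_by_start(sys.stdin))
```

Applied to the three example records in the question (as tab-separated SAM), only the first name1 line would be kept, since all three share chr1, position 124344 and the forward strand.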


Thanks, dariober! Your program looks very useful. I didn't know about picard MarkDuplicates; I will try it as well.
