Extract all paired reads from sam file
7.2 years ago
Picasa ▴ 650

Hi,

I had to manually filter a SAM file, keeping only the reads that fall into specific regions.

According to flagstat, my SAM file now looks like this:

160995 + 0 in total (QC-passed reads + QC-failed reads)
1148 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
159950 + 0 mapped (99.35% : N/A)
159847 + 0 paired in sequencing
79922 + 0 read1
79925 + 0 read2
142206 + 0 properly paired (88.96% : N/A)
157757 + 0 with itself and mate mapped
1045 + 0 singletons (0.65% : N/A)
10605 + 0 with mate mapped to a different chr
6694 + 0 with mate mapped to a different chr (mapQ>=5)

As you can see, the total number of reads is odd, and the number of read1 differs from the number of read2.

I would like to extract only paired reads (and write them to a new SAM file), whether they are mapped or unmapped.

I tried filtering with the flag -f 1, but it gave the same result.

Does anybody have a solution?
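To see how many reads have lost their mate, one option is to count the read names that appear with only a read1 or only a read2 record. The following is a minimal sketch using awk, assuming plain-text SAM input; the function name `count_orphans` is made up for illustration, and secondary and supplementary alignments are ignored so that each mate is counted once.

```shell
# Hypothetical helper: report how many paired read names have only one mate
# left among the primary alignments of a SAM file.
count_orphans() {
    awk -F'\t' '
        /^@/ { next }                             # skip header lines
        {
            flag = $2
            if (int(flag / 256) % 2) next         # skip secondary alignments (0x100)
            if (int(flag / 2048) % 2) next        # skip supplementary alignments (0x800)
            if (flag % 2 == 0) next               # skip reads not flagged as paired (0x1)
            if (int(flag / 64) % 2)  r1[$1] = 1   # first in pair (0x40)
            if (int(flag / 128) % 2) r2[$1] = 1   # second in pair (0x80)
        }
        END {
            n = 0
            for (q in r1) if (!(q in r2)) n++     # read1 without read2
            for (q in r2) if (!(q in r1)) n++     # read2 without read1
            print n
        }
    ' "$1"
}
```

Usage would be something like `count_orphans in.sam`, where `in.sam` stands for the filtered file.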

Thanks for your help.

sam paired • 2.1k views

Do you want to extract the paired-end reads that map to the reference into FASTQ files?

7.2 years ago
ATpoint 85k

Probably, when you subset to a specific region, one of the mates was excluded while the other one is still present and flagged as paired. Use fixmate to update the flags, then rerun the filter:

samtools sort -n -l 0 -O bam in.sam | samtools fixmate - - | samtools view -h -f 1 -o out.sam -

Thanks,

It seems to work.

Just a quick question:

If a read is present but its mate has been excluded, does your command remove the remaining read or re-include the mate?


Exclude. Fixmate will flag the read as unpaired (this is why sorting by name is necessary, so that fixmate can check whether both mates are present), and -f 1 will then remove it.
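For anyone who wants to check the result or do the exclusion without relying on the flags, the same idea (drop every read whose mate is missing) can be sketched directly on the read names. This is only an illustration under some assumptions: the input is a plain-text SAM file that can be read twice, the function name `keep_complete_pairs` is made up, and reads that are not flagged as paired are dropped as well.

```shell
# Hypothetical helper: print a SAM file keeping only reads whose mate is
# also present. The input is read twice, so it must be a regular file.
keep_complete_pairs() {
    awk -F'\t' '
        # Pass 1: remember which read names have a read1 and which a read2.
        NR == FNR {
            if ($0 ~ /^@/) next                   # skip header lines
            flag = $2
            if (flag % 2 == 0) next               # not flagged as paired (0x1)
            if (int(flag / 256) % 2) next         # secondary alignment (0x100)
            if (int(flag / 2048) % 2) next        # supplementary alignment (0x800)
            if (int(flag / 64) % 2)  r1[$1] = 1   # first in pair (0x40)
            if (int(flag / 128) % 2) r2[$1] = 1   # second in pair (0x80)
            next
        }
        # Pass 2: keep the header and every record of a complete pair.
        /^@/ { print; next }
        ($1 in r1) && ($1 in r2) { print }
    ' "$1" "$1"
}
```

It could be called as `keep_complete_pairs in.sam > out.sam`. Running flagstat on the output should then show equal read1 and read2 counts, as long as each mate has exactly one primary record.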
