Filter Paired-End Sam File For Xt:A:U
2
0
Entering edit mode
12.6 years ago
Steffi ▴ 580

Dear all,

I have a sam file (BWA output, paired-end reads). I would like to retain only reads which are "properly paired". This I would do by:

samtools view -f 0x002 file.sam > file_filtered.sam

Additionally I would like to retain only those pairs of reads where both reads have the XT:A:U tag. It is important to me that after the filterting step I still have the pairs together (so read1, read2, read1, read2, ...).

Any ideas how to do so?

Thanks for any help! Stefanie

sam bwa filter paired-end • 5.5k views
ADD COMMENT
4
Entering edit mode
12.6 years ago
Leszek 4.2k

Simple scripting will do the job:

#!/usr/bin/env python
# Report uniquely aligned pairs from BWA sam output.
import sys
i = 0
uPair = 0
lines = ''
for l in sys.stdin:
  #write header info
  if l.startswith('@'):
    sys.stdout.write( l )
    continue

  #reads
  lines += l
  i+=1
  if 'XT:A:U' in l:
    uPair += 1

  #every two reads  
  if not i % 2:
    #check if both are uniquely mapped
    if uPair == 2:
      sys.stdout.write( lines )        
    uPair = 0
    lines = ''
ADD COMMENT
0
Entering edit mode

Thanks a lot! I appreciate your help!

ADD REPLY
0
Entering edit mode

the way to thank on Biostar is to upvote and accept the answer ;-)

ADD REPLY
2
Entering edit mode
12.6 years ago

I would just use 'grep' to keep the headers and the line matching the tag:

/samtools view -h -f 0x002 file.bam |\
egrep "(^@|XT\:A\:U)"
ADD COMMENT
0
Entering edit mode

But this would (possibly) destroy the pairs: It might occur that one read of a pair is mapped with XT:A:U but the other not. In that case only one read of the pair would be retained.

ADD REPLY
0
Entering edit mode

I see, I misunderstood your question.

ADD REPLY
0
Entering edit mode

just do additional pair-end filtering after grep: samtools view -h -f 0x002 file.bam | egrep "(^@|XT:A:U)" | samtools view -Sh -f 0x002 -

ADD REPLY
0
Entering edit mode

This does not work, because a read can still have the flag of being properly mapped but having lost its mate at the grep-step before...

ADD REPLY
0
Entering edit mode

Right, I didn't notice that. Anyway, below you have a few python lines I use for this:)

ADD REPLY

Login before adding your answer.

Traffic: 2551 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6