How to extract reads that map within a defined interval from each other
2
1
Entering edit mode
8.2 years ago
dgimode ▴ 70

I have paired end sequence data which I have mapped to a reference. I need to extract reads from the bam file that map either next to each other or within a defined interval from each other, for example, between 200 and 300 bp from each other. Is it possible to do this? For instance, if my first read pair map between position 90,000 and 90,130 and a second read pair maps between 90,380 and 90,500. These 2 reads map 250 bp apart on the reference. Is it possible to extract all the reads that have this characteristic? Thanks

sam/bam file mapping extract reads • 2.3k views
ADD COMMENT
0
Entering edit mode

Did you check bedtools closestBed

ADD REPLY
2
Entering edit mode
8.2 years ago

Sure, you can do this with a few genomic set operations with BEDOPS tools:

  1. Convert your reads to sorted BED with bam2bed
  2. Use bedmap --range to find elements within the first, smaller range boundary (set A)
  3. Use bedmap --range again to find elements within the second, larger range boundary (set B)
  4. Use bedops --not-element-of to filter out elements from set B that overlap elements in set A

What's left at the end are your elements between the ranges specified for sets A and B.

For instance, to get reads that are between 200 and 300 bases from one another:

$ bam2bed < reads.bam > reads.bed
$ bedmap --range 200 --count --echo-map --delim '\t' reads.bed | awk '$1>1' | cut -f2- | tr ';' '\n' | sort-bed - > reads_within_200_bases.bed
$ bedmap --range 300 --count --echo-map --delim '\t' reads.bed | awk '$1>1' | cut -f2- | tr ';' '\n' | sort-bed - > reads_within_300_bases.bed
$ bedops --not-element-of 1 reads_within_300_bases.bed reads_within_200_bases.bed > answer.bed

The file answer.bed has reads between 200 and 300 bases of one another.

ADD COMMENT
1
Entering edit mode
8.2 years ago
dgimode ▴ 70

The BEDOPS way worked just fine for me. Thanks

ADD COMMENT

Login before adding your answer.

Traffic: 3702 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6