Extract Only Paired-End Reads That Map A Specific Interval
3
1
Entering edit mode
12.2 years ago

Hi,

Is it possible to extract paired-end reads that map to a specific interval ( from a bam file ). I tried with intersectBed :

intersectBed -abam align.bam -b interval.gff3 -wa > result.bam

here's the result :

enter image description here

But I only want reads that map to the feature in bold blue (one of the paired reads is enough). For example, I don't want the reads that map either side of this feature (red arrow).

Is it possible with intersectbed or an other program ?

Thanks,

N.

bedtools extraction • 5.8k views
ADD COMMENT
0
Entering edit mode

The image link is broken, I cannot see it.

ADD REPLY
0
Entering edit mode

Could you please provide the content of the interval.gff3 file? I might want to do a very similar analysis, any help is welcome

ADD REPLY
2
Entering edit mode
12.2 years ago

Use samtools view:

samtools view [options] in.bam "chr2:1000-078987"
ADD COMMENT
0
Entering edit mode

Thanks, but I've also the same problem (reads that map either side of the feature (red arrow))

ADD REPLY
1
Entering edit mode
12.2 years ago

for paired-end bams, you should use pairToBed instead of intersectBed. the syntax would be almost the same:

pairToBed -abam align.bam -b interval.gff3 > result.bam
ADD COMMENT
0
Entering edit mode

I tried pairToBed but I've an error with the bam file :

*****ERROR: BAM must be sorted by query name.
ADD REPLY
0
Entering edit mode

from the BEDTools' pairToBed docs: "pairToBed will fail if an aligner does not report alignments in pairs". I guess that if this is the case, you'll have to work with a compatible aligners or to sort your read pairs as requested.

ADD REPLY
0
Entering edit mode
9.1 years ago

[BEDOPS bedmap] could probably solve this by only reporting overlapping reads ("reference" elements), when the read is contained entirely within the set of map elements (the intervals of interest).

The bedmap --fraction-ref 1 option applies this 100% reference-overlap criteria, while --echo reports any qualifying reference elements (reads). We can use bash substitutions and convert2bed convenience scripts bam2bed and gff2bed to convert the reads and intervals to BED.

$ bedmap --echo --fraction-ref 1 <(bam2bed < align.bam) <(gff2bed < interval.gff3) > answer.bed

The use of bam2bed generates sorted BED-formatted reads, which makes the reads in answer.bed sorted and ready to use in downstream BED operations.

ADD COMMENT

Login before adding your answer.

Traffic: 774 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6