It is unclear from your question whether you want a read to be contained entirely within a region, or a region contained entirely within a read, when counting reads.
But that's okay! You can use the --count
operator in BEDOPS bedmap
with BEDOPS bam2bed
to efficiently calculate either result, specifying different overlap criteria in either case.
First, use BEDOPS sort-bed
to make sure your input regions are ready for use with bedmap
:
$ sort-bed unsortedRegions.bed > sortedRegions.bed
Preparing input with sort-bed
allows BEDOPS tools to work fast and have a very low memory profile, compared with alternatives.
We're now ready to search for reads that overlap regions.
Let's examine the first overlap scenario: Use the --fraction-map
operator with bedmap
to enforce the overlap criterion that a BAM read (a "mapped" element) must overlap the region by 100%, i.e. that a read is contained entirely within a region-of-interest:
$ bam2bed < reads.bam \
| bedmap --echo --count --fraction-map 1.0 sortedRegions.bed - \
> answer.bed
The format of answer.bed
will be:
[ region-1 ] | [ number of BAM reads entirely within region-1 ]
[ region-2 ] | [ number of BAM reads entirely within region-2 ]
...
[ region-N ] | [ number of BAM reads entirely within region-N ]
Now let's discuss the second overlap case. If, instead, you want the region-of-interest (a "reference" element) to be contained entirely within the read, then use the --fraction-ref
operator:
$ bam2bed < reads.bam \
| bedmap --echo --count --fraction-ref 1.0 sortedRegions.bed - \
> answer.bed
The format of answer.bed
will be:
[ region-1 ] | [ # BAM reads that region-1 is entirely contained within ]
[ region-2 ] | [ # BAM reads that region-2 is entirely contained within ]
...
[ region-N ] | [ # BAM reads that region-N is entirely contained within ]
You can give any fraction from 0 to 1, inclusive, to --fraction-map
and --fraction-ref
, depending on the stringency of overlap that you need.
Other overlap criteria are available, which can be useful, depending on your analysis.
You can use bedtools :
something like this (maybe someone can correct me, I'm not sure of the params):
intersectBed -abam in.bam -b region.gtf -c -f 1
You can also try htseq
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
overlap must be 100 % ! overlap must be 100 % ! http://commons.wikimedia.org/wiki/File:Female_animal_trainer_and_leopard,_c1906.jpg overlap must be 100 % ! hum... sorry... :-) have a look at GATK depth of coverage.