bedtools -F minimum overlap, possible BUG ?
0
1
Entering edit mode
5 months ago

Hi all, I am recently working with some genomic coordinate manipulation and RNA-Seq analysis using bedtools.

I am interested in recovering the reads that overlap a specific features with a given percentage. I know that in bedtools we can use the -F flag to return (quoted from bedtools website: Minimum overlap required as a fraction of B. Default is 1E-9 (i.e., 1bp).).

However there seems to be a weird behaviour which I don't quite get.

In this example files we can make up reads

echo "@HD   VN:1.0  SO:unsorted
@SQ SN:chr1 LN:1000
r1  0   chr1    150 200 50M *   0   0   ACGTAGCTAGCTAGCTAGCTAGCTGATCGTAGCTAGCTGATCGTACGATC  *
r2  0   chr1    190 340 10M100N40M  *   0   0   AGCTAGCTAGCTAGCTAGCTAGCTGATCGTAGCTAGCTGATCGTACGATC  *
r3  0   chr1    350 400 50M *   0   0   TCGATCGTACGTAGCTAGCTGATCGTAGCTAGCTGATCGTACGATCGTAC  *
r3bella 0   chr1    190 240 50M *   0   0   AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA  *" > reads.sam

and convert to bam

samtools view -bS reads.sam > reads.bam

and some features file in bed format

echo  "chr1 100 200 exon1
chr1    198 201 exIntJunct
chr1    201 299 intron1
chr1    300 400 exon2" > exons.bed

Each read is 50bp long.

However when I do:

bedtools intersect -split -bed -wo -abam reads_sorted.bam -b exons.bed -F 0.5 > outputIntersect_split.bed

I got one single feature, which is not what I expected:

chr1    349 399 r3  255 +   349 399 0,0,0   1   50, 0,  chr1    300 400 exon2   50

For example, the feature exIntJunct is only 3nt long and should be completely overlapped (therefore the 1.0 fraction of it is covered) by r3bella read. However, this does not appear to be the case..

I am quite confused about what this -F parameter is actually doing, and how it is calculating this fraction. I tried all possible combinations but nothing seems to work so far.

Hope someone can help!

bedtools RNA-Seq • 553 views
ADD COMMENT
0
Entering edit mode

how could this cat "@HD VN:1.0 SO:unsorted..." > reads.sam could even work ?

ADD REPLY
0
Entering edit mode

OP has echo and cat confused, I think. They did not actually run the commands they posted here.

ADD REPLY
0
Entering edit mode

Sorry you are right, I got confused! I updated the post. The issue however still remains! :)

ADD REPLY
0
Entering edit mode

I can't explain it either.

Strangely, if you remove the -split flag it seems to behave as expected (at least for me).

It will also report the r3bella/exIntJunct overlap if you reduce the -F to a much lower fraction (maybe ~0.3).

ADD REPLY

Login before adding your answer.

Traffic: 1816 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6