Does Windowbed Extend Reads?
1
0
Entering edit mode
11.2 years ago
bede.portz ▴ 540

I am using WindowBed, part of the BedTools suite, to align reads to a reference file and I obtained a very interesting result. I am trying to rule out an analysis artifact that could be caused by extending the reads or by aligning read midpoints rather than 5' ends. It is my understanding that WindowBed aligns the 5' end of the read to the reference point, rather than extending than mapping the read midpoint, or extending the 3' end of the read and mapping the midpoint. Am I correct in this assumption, that the 5' end of the read is in fact what is being aligned?

Any help here would be appreciated. The BedTools manual, which is very good, doesn't seem to address this.

Thanks

bedtools chip-seq • 2.9k views
ADD COMMENT
4
Entering edit mode
11.2 years ago

By default, windowBed looks for an overlap between the extended read and the BED (or whatever) file. The 5' coordinate of the read is extended (by default), as is the 3' end. Take as an example the following read from a BAM file and the accompanying BED file:

Read:

HWI-ST1140:102:C2G0LACXX:2:2103:6169:70808    0    1    3011264    255    9S41M    *    0    0    CTCAGAGGGACGTCAACTTCCATCCCACAGTGGTCCCTTTCTCCTTCTCT    CCCFFFFFHHHHHJJJJJJJJJIJJJIJJIIJJIIJJJJJIGIJJJIJIJ    NH:i:1    HI:i:1    AS:i:40    nM:i:0

BED file:

1    3011364    3011464

If you execute windowBed -bed -abam example.bam -b example.bed, you'll get an overlap, regardless of the fact that neither the 3' nor the 5' end of the read overlap the BED file. If you reduce the extension to 0 (windowBed -w 0 -bed -abam example.bam -b example.bed), then you'll get no overlap. All that's required is a 1bp overlap, so using -w 61 would be sufficient for the read to overlap that region.

Edit: When in doubt, make a small test case. That's almost always the quickest way to put your mind at ease.

ADD COMMENT
0
Entering edit mode

I'm not sure I follow. For clarification, I am using ChIP-seq reads, 40bp long, as a bed file. I am aligning reads to a list of transcription start sites, which is also a .bed file. The TSS list contains the TSS as a 1bp interval. This 1bp interval is extended 1000bp in each direction, creating the "window" in which the reads are aligned. I am unsure of which part of the read is aligned, i.e. the 5' end, the midpoint of the 40bp interval, the midpoint of some extended interval, etc.

Thanks for your willingness to answer my questions. I apologize if I am being unclear.

ADD REPLY
2
Entering edit mode

Yeah, that's kind of what I answered. Nothing is aligned, but rather overlapped (particularly with reads, "aligned" has a very specific meaning that has nothing to do with what you're doing). The entire read is used. If a single base overlaps (what you seem to call "align") the window, then it's counted as an overlap. Regardless of whether you're converting the aligned BAM/SAM file to a BED file and using that as the input to -b or doing something more like what I'm doing, this will be the case. Have a look at page 47 of the BEDTools user manual for a pictorial representation of what's going on.

ADD REPLY
0
Entering edit mode

Props for your answer. After revisiting the manual, your initial answer was quite helpful.

I'm particularly interested in overlapping the 5' ends of the reads to a window around the TSS. Essentially extending the TSS file 1kb in either direction, which would be like using windowBed with -a TSS and -b reads which is bass ackwards.

ADD REPLY
1
Entering edit mode

Ah, are you currently using samtools view ... to get the reads into bedtools or have you written to a file already?

Edit: If the latter, something like the following should still work: samtools view example.bam | cut -f 3,4 | awk 'BEGIN{OFS="\t"}{print $1,$2-1,$2}' | windowBed -a example.bed -b stdin. Then you're getting just the 5' end of the reads as the input to -b. I suspect that this isn't really the appropriate solution to the biological question that you're trying to answer, but lacking further details I'll leave that to you.

ADD REPLY

Login before adding your answer.

Traffic: 2274 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6