Question

Interpreting "Intersectbed" Output

13.2 years ago

Atom Smasher ▴ 20

Hello,

I have been trying to interpret the output produced by the "intersectBed" tool from the BEDTools suite. I could not find any documentation related to interpreting the output.

I used the basic intersectBed command, i.e.

intersectBed -a file1.bed -b file2.bed

Here are some lines from the output:

chr8    579281  579420  .       375     .       4.88051 2.88614 2.16931 79
chr1    936133  936483  .       748     .       8.58236 15.39442 12.80621  181

The first 3 columns are the chromosome and the start and stop coordinates, but I am unable to figure out what the rest of the numbers mean.

Any help on this would be much appreciated :)

Thank you.

AB

intersect output • 8.3k views

ADD COMMENT • link updated 13.2 years ago by Gjain 5.8k • written 13.2 years ago by Atom Smasher ▴ 20

Cross-posted here.

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 13.2 years ago by Aaronquinlan 12k

What do file1.bed and file2.bed look like?

ADD REPLY • link 13.2 years ago by Aaronquinlan 12k

Hello Aaron, both input BED files just have the chromosome number and the start and stop coordinates, i.e.

chr start stop

ADD REPLY • link 13.2 years ago by Atom Smasher ▴ 20

Hello Aaron, both input BED files are just tab-delimited files containing the chromosome number and the start and stop coordinates.

ADD REPLY • link 13.2 years ago by Atom Smasher ▴ 20

Try 'intersectBed -a file1.bed -b file2.bed -wa -wb'. For each overlap, it will report the original entries from both files side by side.
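
If it helps to see what that pairing looks like, here is a rough Python sketch that mimics the -wa -wb behaviour on tiny in-memory lists. It is only an illustration, not how BEDTools works internally: the function name `intersect_wa_wb` and the example intervals are made up, and it assumes BED's 0-based, half-open coordinates.

```python
def intersect_wa_wb(a_records, b_records):
    """Pair every A record with every B record that overlaps it.

    Each output row is the full A record followed by the full B record,
    which is the layout produced by -wa -wb. This is a naive all-pairs
    scan, fine for a demo but far too slow for real files.
    """
    rows = []
    for a in a_records:
        for b in b_records:
            same_chrom = a[0] == b[0]
            # Half-open intervals overlap when each starts before the other ends.
            overlaps = int(a[1]) < int(b[2]) and int(b[1]) < int(a[2])
            if same_chrom and overlaps:
                rows.append(list(a) + list(b))
    return rows


# Hypothetical three-column inputs (chrom, start, end).
file1 = [["chr1", "100", "200"], ["chr2", "50", "80"]]
file2 = [["chr1", "150", "300"], ["chr1", "400", "500"]]

rows = intersect_wa_wb(file1, file2)
for row in rows:
    print("\t".join(row))
```

With these made-up inputs only the first chr1 pair overlaps, so a single six-column row comes out: three columns from file1 followed by three from file2.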

ADD REPLY • link 13.2 years ago by David Langenberger 11k

Hello Aaron,

I'd like to know more about the score in the 5th column (ranging from 0 to 1000). Is that a score for the overlap between the two BED files?

I intend to rank-order the intersecting regions from the two BED files, to find the regions with the most significant intersections.

Can I simply use the score to rank the regions? For instance, would all regions with a score of 1000 be more significant than those with a score below 1000?

ADD REPLY • link 13.2 years ago by Atom Smasher ▴ 20

score 2 · Answer 1 · 2012-03-06

Hi Atom,

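It's just the BED format. The main point is that the start and end coordinates in the output are the shared (overlapping) coordinates of the entries in the two BED files. Any columns after the first three are carried over unchanged from the -a file.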
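
Here is a small Python sketch of that idea for a single pair of records. It is not the BEDTools implementation, just an illustration: the function name `intersect_record` and both input intervals are invented, and it assumes BED's 0-based, half-open coordinates.

```python
def intersect_record(a_fields, b_fields):
    """Clip an A record to its overlap with a B record.

    Returns the A record with start/end replaced by the overlapping
    coordinates and all remaining A columns left untouched, or None
    when the two records do not overlap.
    """
    a_chrom, a_start, a_end = a_fields[0], int(a_fields[1]), int(a_fields[2])
    b_chrom, b_start, b_end = b_fields[0], int(b_fields[1]), int(b_fields[2])
    if a_chrom != b_chrom:
        return None
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    if start >= end:  # touching or disjoint intervals share no bases
        return None
    return [a_chrom, str(start), str(end)] + list(a_fields[3:])


# Hypothetical inputs: a ten-column A record and a three-column B record.
a = "chr8 579200 579420 . 375 . 4.88051 2.88614 2.16931 79".split()
b = "chr8 579281 579500".split()

result = intersect_record(a, b)
print("\t".join(result))
```

In this made-up case the reported interval is 579281-579420, the part the two records share, while columns 4 to 10 are exactly what the A record contained.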

So basically:

The first three required BED fields are:

chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).
chromStart - The starting position of the feature in the chromosome or scaffold.
chromEnd - The ending position of the feature in the chromosome or scaffold.

The 9 additional optional BED fields are:

name - Defines the name of the BED line.
score - A score between 0 and 1000.
strand - Defines the strand - either '+' or '-'.
thickStart - The starting position at which the feature is drawn thickly.
thickEnd - The ending position at which the feature is drawn thickly.
itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0).
blockCount - The number of blocks (exons) in the BED line.
blockSizes - A comma-separated list of the block sizes.
blockStarts - A comma-separated list of block starts.

Example: Here's an example of an annotation track that uses a complete BED definition:

track name=pairedReads description="Clone Paired Reads" useScore=1
chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601
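
To check which value sits in which column, a short Python sketch like the one below can label the fields of a line. The names `BED_FIELDS` and `parse_bed_line` are my own for this example, and it splits on any whitespace because the lines above are space-separated (real BED files are usually tab-delimited).

```python
# Field names in the order given above: 3 required, then 9 optional.
BED_FIELDS = [
    "chrom", "chromStart", "chromEnd",
    "name", "score", "strand", "thickStart", "thickEnd",
    "itemRgb", "blockCount", "blockSizes", "blockStarts",
]

# Fields that normally hold integers.
INT_FIELDS = ("chromStart", "chromEnd", "score",
              "thickStart", "thickEnd", "blockCount")


def parse_bed_line(line):
    """Map the columns of one BED line onto their field names."""
    values = line.split()
    if len(values) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    record = dict(zip(BED_FIELDS, values))
    for key in INT_FIELDS:
        # Leave placeholders such as '.' as plain strings.
        if key in record and record[key].isdigit():
            record[key] = int(record[key])
    return record


rec = parse_bed_line("chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512")
for field, value in rec.items():
    print(field, value, sep="\t")
```

Lines with fewer than 12 columns simply produce fewer keys, so a plain three-column file gives only chrom, chromStart and chromEnd.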

For more details, please look at this link.

You can download the BEDTOOLs manual from this link.

I hope this helps.