Error using bedtools "intersect" command
6.1 years ago
hila • 0

Hi all,

I'm trying to intersect a vcf file (as file -a) with a bed file (as file -b). I'm getting this error:

ERROR: Received illegal bin number 4294967295 from getBin call.
ERROR: Unable to add record to tree.

After reading some previous similar questions, I made sure that my bed file is tab delimited and sorted.

When using the -sorted flag I got an error on my first position in the bed file:

Error: Sorted input specified, but the file file.bed has the following out of order record
1       2336225 2337283

Would appreciate any suggestions

Thanks
Hila

Edit:

My command line is:

bedtools intersect -a file_a.gz -b file_b.bed

I don't have headers in either file.

software-error bedtools • 4.8k views

Are the chromosomes in the bed file named 1, 2, 3 rather than chr1, chr2, chr3?


Yes. Does it matter? Should I change them to chr1, chr2, ...?


Yes, this does matter!

All files in your analysis pipeline should use the same naming scheme. Otherwise you will run into problems like this one.

fin swimmer
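
One way to check and harmonize the naming is sketched below. The helper names `chrom_names` and `add_chr_prefix` are made up for this illustration, and the sketch assumes tab-delimited input with the chromosome in column 1. It only handles the plain `chr` prefix; names such as MT versus chrM would need their own mapping.

```shell
# Hypothetical helper: list the distinct chromosome names found in column 1.
# Lines starting with "#" (e.g. VCF headers) are skipped.
chrom_names() {
    awk -F'\t' '!/^#/ && NF > 0 { print $1 }' | sort -u
}

# Hypothetical helper: prepend "chr" to column 1 unless it is already there.
# Header lines and empty lines are passed through unchanged.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ || NF == 0 { print; next }
         $1 !~ /^chr/ { $1 = "chr" $1 }
         { print }'
}
```

For example, `gzip -dc file_a.gz | chrom_names` and `chrom_names < file_b.bed` should print the same style of names. If the bed file is the odd one out, `add_chr_prefix < file_b.bed > file_b.chr.bed` rewrites it (the output filename is arbitrary).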


In addition to @finswimmer's response: your bed file is sorted numerically (1, 2, 3). If the gz file uses chr1, chr2, chr3, it is probably sorted lexicographically (chr1, chr10, ...), so the chromosome order may differ between the two files as well.
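
To see which order each file actually uses, something like the following could help. `chrom_order` is a hypothetical helper name, and the sketch assumes tab-delimited input with the chromosome in column 1.

```shell
# Hypothetical helper: print each chromosome name once, in the order in which
# it first appears in the input. Lines starting with "#" are skipped.
chrom_order() {
    awk -F'\t' '!/^#/ && !seen[$1]++ { print $1 }'
}
```

Comparing `gzip -dc file_a.gz | chrom_order` with `chrom_order < file_b.bed` shows whether the two files agree. As a reminder of what lexicographic order looks like, `printf 'chr1\nchr2\nchr10\n' | LC_ALL=C sort` prints chr1, chr10, chr2.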


Hello, can you add your command line and the header of each file to your post, please?


I've added it to the original post. Thanks!


Sort both files with sort -V on the appropriate columns (e.g., 1-3 in the given file.bed) and then try again.
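
A minimal sketch of that suggestion, assuming GNU sort (`-V` is a GNU extension) and a tab-delimited file without header lines. `sort_bed_natural` is a hypothetical helper name. Note that a numeric key on column 1 (`-k1,1n`) treats non-numeric names such as X or Y as 0, so it only gives a well-defined chromosome order when every name is a number.

```shell
# Hypothetical helper: sort bed records by chromosome in version order
# (1, 2, 10 or chr1, chr2, chr10), then numerically by start and end.
sort_bed_natural() {
    sort -t "$(printf '\t')" -k1,1V -k2,2n -k3,3n
}
```

Usage would be along the lines of `sort_bed_natural < file_b.bed > file_b.sorted.bed` (the output filename is arbitrary). With `-sorted`, both inputs need to list their chromosomes in the same order, so the same sort should be applied to both files.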


Hello hila,

How did you sort the two files? Also have a look at the lines before and after the one bedtools is complaining about. What do they look like?

fin swimmer


Hi, the sorting was done by

cat file_name|sort -k1,1n -k2,2n

The line bedtools is complaining about is the first line in the bed file; the line after it is in order.

Thanks,
Hila


Looks ok. Have you done it for both files?

In your initial post you wrote the filename is file_a.gz. How did you compress it? Using bgzip would be the right way.

fin swimmer


I got the gz file from another source; I didn't make it myself. The thing is, this code already worked with the same gz file and another bed file, so I guess the problem is with the new bed file. I just can't find it :(

6.1 years ago

Your coordinates are too large for bedtools. This typically happens when a chromosome is longer than 500 Mb: https://github.com/arq5x/bedtools2/blob/e5ad7e48108681fe93ee4600d07699ab278e3c56/src/utils/BinTree/BinTree.cpp#L118
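
The bin number in the error, 4294967295, equals 2^32 - 1, which is what -1 looks like when stored as an unsigned 32-bit integer, so it seems consistent with a failed bin lookup rather than a real bin. A rough way to hunt for the offending records is sketched below. `check_bed_coords` is a hypothetical helper, and its default limit of 500000000 is simply the 500 Mb figure mentioned above, not a value taken from the bedtools source.

```shell
# Hypothetical helper: flag bed records with suspicious coordinates.
# Output columns: reason, input line number, original record.
# Optional argument: upper limit for the end coordinate (assumed default below).
check_bed_coords() {
    limit="${1:-500000000}"
    awk -F'\t' -v limit="$limit" '
        /^#/ { next }
        # Catches negative values, empty fields (e.g. space-delimited lines)
        # and stray characters such as a trailing carriage return.
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print "bad-number\t" NR "\t" $0; next }
        $2 + 0 > $3 + 0 { print "start>end\t" NR "\t" $0; next }
        $3 + 0 > limit + 0 { print "too-large\t" NR "\t" $0 }'
}
```

Running `check_bed_coords < file_b.bed` prints nothing if no record trips any of these checks.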
