Bedtools trouble with double digit chromosomes
9.8 years ago

I'm trying to use the bedtools closest function to compare a couple of datasets. It works fine; however, for ranges on chromosomes with double-digit numbers, it does not seem to be able to make the comparison. For example:

chr9   222268    30116164   chr9  29909450  29909600  0
chr9   222268    30116164   chr9  29926499  29926649  0
chr9   222268    30116164   chr9  30050824  30050974  0
chr10  214399    8156391    .     -1        -1        -1
chr15  19138465  20536973   .     -1        -1        -1
chr15  83671081  100021943  .     -1        -1        -1

Has anyone else had this problem and what can I do to fix it?

sequencing bedtools • 1.9k views
9.8 years ago
Charles Plessy ★ 2.9k

Without a test case to reproduce your problem it is hard to answer, but my gut feeling is that your input files are not sorted in the same way. For instance, the chromosomes might be ordered chr1, chr2, chr3, ..., chr10, chr11, ... in one file and chr1, chr10, chr11, ..., chr19, chr2, chr3, ... in the other. The solution is then to sort all files the same way, for instance with sort -k1,1 -k2,2n.
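As a rough sketch of that suggestion, the commands below sort two small BED files with the same key and then compare their chromosome order. The file names (a.bed, b.bed) and the toy intervals are made up for illustration, and pinning the locale with LC_ALL=C is an extra assumption so that both files get the same byte-wise ordering.

```shell
# Hypothetical toy inputs (tab-separated); replace with your own BED files.
printf 'chr9\t222268\t30116164\nchr10\t214399\t8156391\nchr2\t100\t200\nchr10\t100\t200\n' > a.bed
printf 'chr10\t500\t600\nchr9\t29909450\t29909600\nchr2\t300\t400\n' > b.bed

# Sort both files the same way: chromosome as text, then start position as a number.
LC_ALL=C sort -k1,1 -k2,2n a.bed > a.sorted.bed
LC_ALL=C sort -k1,1 -k2,2n b.bed > b.sorted.bed

# The chromosome order should now be identical in both files.
cut -f1 a.sorted.bed | uniq
cut -f1 b.sorted.bed | uniq

# With consistently sorted inputs, the comparison can be rerun, for example:
# bedtools closest -a a.sorted.bed -b b.sorted.bed
```

Note that this ordering places chr10 before chr2. That is fine as long as every file passed to bedtools uses the same order.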

9.8 years ago

Your data are probably not sorted. You can sort them with sort-bed (from BEDOPS):

$ sort-bed unsorted_dataset.bed > sorted_dataset.bed
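To check whether a file is already in a given order before re-sorting it, one option is the -c flag of sort, which tests the order without writing any output. The sketch below checks against the sort -k1,1 -k2,2n order mentioned in the other answer; the file name and its two intervals are hypothetical.

```shell
# Hypothetical two-line file in "natural" order (chr9 before chr10).
printf 'chr9\t100\t200\nchr10\t100\t200\n' > unsorted_example.bed

# sort -c exits non-zero if the input is not in the requested order.
if LC_ALL=C sort -c -k1,1 -k2,2n unsorted_example.bed 2>/dev/null; then
    status="sorted"
else
    status="not sorted"
fi
echo "unsorted_example.bed: $status"
```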