bedtools closest - result for only the first query
6.7 years ago
yjyj • 0

I'm trying to annotate a BED file with the closest downstream gene, but bedtools returns a result only for the first entry of the query file. I tried deleting the first entry and making a new file starting with the 2nd, but that file also gave output only for its first entry (originally the 2nd). I have 1000 entries and don't want to repeat this for each one.

I have also done/tried:

  • sorting the query file
  • bedtools closest -iu -D ref -a query.bed -b mm10gencode_sorted.bed
  • bedtools closest -fd -D ref -a query.bed -b mm10gencode_sorted.bed
  • bedtools closest -a query.bed -b mm10gencode_sorted.bed
  • bedtools intersect -a query.bed -b mm10gencode_sorted.bed -wa -wb

Has this happened to anyone else? Or any ideas would be appreciated!

Intersect reports only overlapping features. The query file must also be sorted. Try:

bedtools closest -a query.bed -b mm10gencode_sorted.bed -D a

I've always sorted the query file. This command didn't work either.

PS. I only tried intersect because closest didn't work properly, but both gave output only for the first entry. So it seems there's a problem with the file; I just have no idea what's wrong.

That's odd. Was the bedtools installation OK? I once had a problem with v2.26 due to a faulty installation. Can you host the files somewhere so they can be checked?

I just deleted bedtools and reinstalled v2.27.1, but it gave the same results. What do you mean by hosting files? Thanks for your input!

Some more updates: When I put another query file in, it worked fine! So I now suspect something went wrong while making the query bed file (exporting Excel to a tab-delimited .txt), which is weird because I've done the same thing many many times. Do you know other ways to make bed files?
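One way to test that suspicion is to count the line-ending bytes in the exported file. The sketch below assumes the export used carriage returns (CR) alone as line breaks, which nothing in this thread confirms. The file name `query_cr.bed` and its coordinates are made up for illustration. A file with several CRs and no line feeds (LF) looks like a single line to most Unix tools, which would match the one-result symptom.

```shell
# Build a small stand-in for an export that uses CR-only line breaks.
# File name and coordinates are hypothetical.
printf 'chr1\t100\t200\rchr1\t300\t400\rchr2\t50\t80\r' > query_cr.bed

# Count line-feed (LF) and carriage-return (CR) bytes separately.
# tr -d ' ' strips the padding that some wc implementations add.
lf_count=$(tr -cd '\n' < query_cr.bed | wc -c | tr -d ' ')
cr_count=$(tr -cd '\r' < query_cr.bed | wc -c | tr -d ' ')

echo "LF: $lf_count  CR: $cr_count"
```

Running the same two `tr | wc` commands on the real query file and on a file that works should show whether their line endings differ.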

6.7 years ago

You could try BEDOPS closest-features:

$ closest-features --closest --delim '\t' intervals.bed genes.bed > answer.bed

The --closest option reports the closest element (otherwise both upstream and downstream nearest are reported). The closest element could potentially be upstream.

If you always want the downstream (rightwards) element, even when it is farther away than the nearest upstream element, you can pipe the result to awk to report the reference element and its nearest downstream element:

$ closest-features intervals.bed genes.bed | awk -vFS="|" -vOFS="\t" '{ print $1, $3; }' > answer.bed
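To see what the awk step does without installing BEDOPS, here is a small sketch run on one hand-written line. It assumes the default closest-features output has three `|`-separated parts (reference element, nearest left element, nearest right element). The coordinates and gene names are invented.

```shell
# One mock line in the assumed "reference|left|right" layout.
# GeneA and GeneB are hypothetical names.
printf 'chr1\t100\t200|chr1\t10\t50\tGeneA|chr1\t500\t900\tGeneB\n' > sample_closest.txt

# Split on "|" and keep field 1 (reference) and field 3 (right-hand element),
# joined by a tab.
awk -v FS='|' -v OFS='\t' '{ print $1, $3 }' sample_closest.txt > sample_answer.bed

cat sample_answer.bed
```

Note that "downstream" here means rightwards on the chromosome and does not take strand into account.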

Other options are available; run the program with --help for more detail.

You can make sure that BED files are sorted with sort-bed:

$ sort-bed intervals.unsorted.bed > intervals.bed
$ sort-bed genes.unsorted.bed > genes.bed
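If BEDOPS is not available, a plain `sort` by chromosome and then numerically by start position is a common substitute. This is a sketch on made-up intervals, and it is not claimed to reproduce sort-bed's ordering in every case.

```shell
# Hypothetical unsorted intervals.
printf 'chr2\t50\t80\nchr1\t300\t400\nchr1\t100\t200\n' > intervals.unsorted.bed

# Sort by chromosome name (bytewise), then by start coordinate (numeric).
LC_ALL=C sort -k1,1 -k2,2n intervals.unsorted.bed > intervals.bed

cat intervals.bed
```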
6.7 years ago
yjyj • 0

I've found the problem, but no solutions unfortunately.

I originally made the query .bed file on Excel for Mac (Office 365), and it had issues with bedtools (version 2.27.1) closest, intersect and sort. But when I made the file on a Windows computer, no problem! Very strange, because I didn't have this problem until now.

Restarting my Mac didn't help, nor did reinstalling bedtools. And according to this, it seems there are no solutions yet:

https://answers.microsoft.com/en-us/msoffice/forum/msoffice_excel-mso_mac-mso_mac2011/problem-save-an-excel-file-to-a-text-tab-delimited/d0d9c6df-1207-4adc-ac89-a625e8721b04

tl;dr There are differences between Mac and Windows in how Excel saves tab-delimited .txt

But if anyone has other solutions, workarounds for Mac computers, etc., please share :)
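A possible workaround on a Mac, assuming the difference lies in the line endings (an assumption, since the original file was never inspected in this thread), is to normalise the line endings before running bedtools. The file names below are placeholders and the sample content is invented.

```shell
# Stand-in for a tab-delimited export with CR-only line breaks (hypothetical data).
printf 'chr1\t100\t200\rchr1\t300\t400\rchr2\t50\t80\r' > query_mac.bed

# Turn every CR into LF. If the input used CR+LF pairs instead, this leaves
# empty lines behind, which awk 'NF' then drops.
tr '\r' '\n' < query_mac.bed | awk 'NF' > query_fixed.bed

# The fixed file should now have one line per interval.
wc -l < query_fixed.bed
```

The cleaned file can then be used as the query, e.g. `bedtools closest -D ref -a query_fixed.bed -b mm10gencode_sorted.bed`, after sorting it as usual.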
