Question

Find closest gene to chromosome location

0

9.4 years ago

D H ▴ 30

Hello!

I have started a gene enrichment analysis (I haven't done this before), and I have a dataset that contains gene expression data. The dataset has a column with gene names.

However, some entries in that column are chromosome locations instead of gene names. I want to find the gene closest to each of these chromosome locations.

I'm using R 2.15.2 (if that helps).

What are my options?

Thank you in advance!

gene R • 5.3k views

updated 2.7 years ago by Ram 45k • written 9.4 years ago by D H ▴ 30

Ram · Answer 1 · 2015-12-01

2

9.4 years ago

igor 13k

bedtools closest
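
A minimal sketch of how this might look, assuming the query locations and a gene annotation are both available as BED files. The file names and the toy coordinates below are hypothetical, and the bedtools call is only attempted if bedtools is installed:

```shell
# Hypothetical query regions and gene annotation, both in BED format
printf 'chr1\t150\t160\nchr1\t900\t910\n' > query.bed
printf 'chr1\t100\t200\tGENE_A\nchr1\t1000\t1100\tGENE_B\n' > genes.bed

# bedtools closest expects both inputs sorted by chromosome, then start
sort -k1,1 -k2,2n query.bed > query.sorted.bed
sort -k1,1 -k2,2n genes.bed > genes.sorted.bed

# -d appends the distance to the closest feature as a final column
if command -v bedtools >/dev/null 2>&1; then
  bedtools closest -a query.sorted.bed -b genes.sorted.bed -d > closest.tsv
fi
```

Each output line should pair a query region with its nearest gene, so the gene name column can be copied back into the original table.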

updated 5.4 years ago by Ram 45k • written 9.4 years ago by igor 13k

Ram · Answer 2 · 2015-12-01

1

9.4 years ago

Nicolas Rosewick 11k

bedops closest-features: http://bedops.readthedocs.org/en/latest/content/reference/set-operations/closest-features.html

updated 5.4 years ago by Ram 45k • written 9.4 years ago by Nicolas Rosewick 11k

0

I see that this utility takes as an input a BED file. I only have a csv file though.

Is it possible to convert csv to BED?

(I'm sorry for the questions but I'm very new to this)

updated 5.4 years ago by Ram 45k • written 9.4 years ago by D H ▴ 30

1

The simplest version of a BED file is three tab-separated columns (chr, start pos, end pos). You can use Excel to extract those three columns and save them as a tab-delimited text file.
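
The same conversion can also be sketched on the command line. This assumes a hypothetical CSV named locations.csv with a header row and the columns chr, start, end in that order; a real file will likely need different column numbers:

```shell
# Hypothetical input: a CSV with a header row and columns chr,start,end
printf 'chr,start,end\nchr2,500,600\nchr1,100,200\n' > locations.csv

# Drop the header, switch commas to tabs, then sort by chromosome and start
tail -n +2 locations.csv \
  | awk -F',' 'BEGIN { OFS = "\t" } { print $1, $2, $3 }' \
  | sort -k1,1 -k2,2n > locations.bed

cat locations.bed
```

Note that BED start coordinates are 0-based, so if the locations in the CSV are 1-based positions, the start column would need to be reduced by one.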

written 9.4 years ago by igor 13k

1

[Disparaging comment about Excel deleted]

written 9.4 years ago by harold.smith.tarheel ★ 5.0k

0

If you use Excel, be sure to clean up its output. It can save tab-delimited text files, but with non-Unix line endings.

If you exported your data from Excel on Mac, apply the following fix:

$ tr '\r' '\n' < input.fromExcelForMac.txt | sort-bed - > input.fixed.bed

If you exported your data from Excel on Windows, apply this post-save fix:

$ tr -d '\r' < input.fromExcelForWindows.txt | sort-bed - > input.fixed.bed

Then run closest-features to query features of interest (features.bed must also be sorted, for example with sort-bed):

$ closest-features --closest input.fixed.bed features.bed > answer.bed

updated 5.4 years ago by Ram 45k • written 9.4 years ago by Alex Reynolds 36k