Gene list from bed file
7.6 years ago
o.hickman ▴ 10

I have a .bed file with chromosomal coordinates (columns 1, 2 and 3) instead of gene names. How do I get the corresponding gene list from this information?

chr7 120792856 120792910 K562_rep01 1000 + 5.486570215 14.23664891 -1 -1

Thanks in advance!!

Oliver

ChIP-Seq iCLIP alignment sequencing • 14k views

Is the gene name in your file itself? Though KHDRBS1 would be on chr1 rather than chr7, if that line is correct.


Hi, sorry, that column is the name of the experiment. The gene name is definitely not in there. Can you help? O.


You can use the biomaRt package to get the genes' chromosomal coordinates, then map the intervals in your bed file to gene symbols or gene IDs.

7.6 years ago
e.rempel ★ 1.1k

Hi, you can also use bedtools intersect. From the online manual:

bedtools intersect allows one to screen for overlaps between two sets of genomic features.

One set is your bed file; the second could be a GFF/GTF file with the gene models of your organism.
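
To make the overlap idea concrete, here is a minimal pure-Python sketch of the logic. It is not bedtools itself, only an illustration of matching BED intervals against `gene` features from a GTF. The GTF records, the source name `demo`, and the gene names `GENE_A` and `GENE_B` are made up for the example; only the BED coordinates come from the question. A real annotation file would need to match the assembly and chromosome naming of the bed file.

```python
import re

# BED record taken from the question (first six columns only).
BED_TEXT = "chr7\t120792856\t120792910\tK562_rep01\t1000\t+\n"

# Hypothetical GTF gene records, invented for illustration.
GTF_TEXT = (
    'chr7\tdemo\tgene\t120790001\t120800000\t.\t+\t.\tgene_id "G1"; gene_name "GENE_A";\n'
    'chr7\tdemo\tgene\t130000001\t130005000\t.\t-\t.\tgene_id "G2"; gene_name "GENE_B";\n'
)


def parse_bed(text):
    """Yield (chrom, start, end) from BED text (0-based, half-open)."""
    for line in text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2])


def parse_gtf_genes(text):
    """Yield (chrom, start, end, gene_name) for 'gene' features.

    GTF coordinates are 1-based and inclusive, so the start is shifted
    by one to match the 0-based, half-open BED convention.
    """
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if fields[2] != "gene":
            continue
        match = re.search(r'gene_name "([^"]+)"', fields[8])
        if match:
            yield fields[0], int(fields[3]) - 1, int(fields[4]), match.group(1)


def genes_for_intervals(bed_text, gtf_text):
    """Return a list of ((chrom, start, end), [gene names]) per BED interval."""
    genes = list(parse_gtf_genes(gtf_text))
    result = []
    for chrom, start, end in parse_bed(bed_text):
        hits = [
            name
            for g_chrom, g_start, g_end, name in genes
            if g_chrom == chrom and g_start < end and start < g_end
        ]
        result.append(((chrom, start, end), hits))
    return result


for interval, names in genes_for_intervals(BED_TEXT, GTF_TEXT):
    print(interval, names)
```

The nested loop is quadratic and only suitable for small inputs; for real data sets, bedtools intersect or an interval-tree library is the better choice.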

7.6 years ago
agata88 ▴ 870

I think you can do it with the UCSC Table Browser:

https://genome.ucsc.edu/

Select group: Genes and Gene Predictions; track: RefSeq; assembly: hg19 or hg38. Define the region in the region field (chr:start-end), then select the output format "all fields from selected table" and click "get output".

Towards the end of each row in the resulting text file there is a field called name2, which holds the gene name.

I think it is possible to filter out all the unnecessary information from this file, keep only the gene name, transcript etc., and save the result as a BED file.

Best,

Agata
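
The filtering step described above could be sketched in Python as follows. This is only an illustration under assumptions: the table text is an abbreviated, made-up stand-in for Table Browser output (a tab-separated table whose header line starts with `#`), and the accessions and gene names are hypothetical. The column is looked up by its header name rather than by position, so the sketch does not depend on the exact column order.

```python
# Hypothetical, abbreviated stand-in for "all fields from selected table" output.
TABLE_TEXT = (
    "#bin\tname\tchrom\tstrand\ttxStart\ttxEnd\tname2\n"
    "1506\tNM_demo1\tchr7\t+\t120790000\t120800000\tGENE_A\n"
    "1506\tNM_demo2\tchr7\t+\t120791000\t120800000\tGENE_A\n"
    "1507\tNM_demo3\tchr7\t-\t120792000\t120795000\tGENE_B\n"
)


def gene_names(table_text, column="name2"):
    """Return the unique values of `column`, in order of first appearance."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header = lines[0].lstrip("#").split("\t")
    idx = header.index(column)  # raises ValueError if the column is missing
    seen = []
    for ln in lines[1:]:
        name = ln.split("\t")[idx]
        if name not in seen:
            seen.append(name)
    return seen


print(gene_names(TABLE_TEXT))
```

Several transcripts of the same gene usually overlap one region, which is why the sketch removes duplicate names.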


I tried this and would like to add one thing: also select Track: UCSC. I obtained multiple lines for each original coordinate in the .bed file, so I collapsed them using bedtools: bedtools map -a original.bed -b ucsc_based_gene_list.bed -c 5 -o collapse
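
For readers unfamiliar with the collapse operation, the following pure-Python sketch mimics what it does conceptually: for each original interval, the names of all overlapping gene records are joined into one comma-separated field, with `.` when nothing overlaps. It is not a reimplementation of bedtools; the gene records and names are hypothetical, and the optional `distinct` flag is an addition of this sketch that plays the role of bedtools' `-o distinct`. Note also that bedtools map expects both input files to be sorted by chromosome and start position.

```python
# Original interval from the question plus one hypothetical interval with no gene.
INTERVALS = [
    ("chr7", 120792856, 120792910),
    ("chr7", 140000000, 140000050),
]

# Hypothetical gene/transcript records: (chrom, start, end, name).
GENES = [
    ("chr7", 120790000, 120800000, "GENE_A"),
    ("chr7", 120791000, 120800000, "GENE_A"),  # second transcript, same gene
    ("chr7", 120792000, 120795000, "GENE_C"),
]


def collapse_hits(intervals, genes, distinct=False):
    """Return (chrom, start, end, names) rows, names joined by commas."""
    rows = []
    for chrom, start, end in intervals:
        names = [
            name
            for g_chrom, g_start, g_end, name in genes
            if g_chrom == chrom and g_start < end and start < g_end
        ]
        if distinct:
            # Drop repeated names while keeping first-seen order.
            names = list(dict.fromkeys(names))
        rows.append((chrom, start, end, ",".join(names) if names else "."))
    return rows


for row in collapse_hits(INTERVALS, GENES):
    print("\t".join(map(str, row)))
```

With `distinct=True` the repeated `GENE_A` entry is reported once, which is often what is wanted when several transcripts of one gene overlap the same interval.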

7.6 years ago
badribio ▴ 290

For post-processing and filtering redundancies, you could try bedmap, as explained by Alex Reynolds in this post: "how to get gene names and gene id given co ordinates from ensembl". It's fast and elegant.
