Question

Genes starting with the "LOC" prefix?

4

4.9 years ago

sunnykevin97 ▴ 1000

Hi,

I'm working with RNA-Seq data from non-model organisms. After differential gene expression analysis, I found that there are a lot of genes starting with the prefix "LOC". Searching the web, I found that these are genes without any identified orthologs. For downstream analysis, I am unable to convert the LOC IDs into Entrez or Ensembl IDs using the bitr function in clusterProfiler. How do I proceed with downstream analysis, something like GO/KEGG analysis? Should I ignore them completely? I have a total of 7 samples, and after differential gene expression analysis there were 4501 of these genes for each sample.

If I search for these IDs in NCBI, I do get the gene information.

Suggestions please!

RNA-Seq gene sequence • 6.2k views

ADD COMMENT • link updated 3.1 years ago by GenoMax 151k • written 4.9 years ago by sunnykevin97 ▴ 1000

0

Can you provide an example or two? Sometimes LOCs have informative aliases that you can use. If you have the Entrez Gene IDs you can fetch a list of all aliases for each of them.

ADD REPLY • link 4.9 years ago by vkkodali_ncbi ★ 3.8k

0

https://www.ncbi.nlm.nih.gov/search/all/?term=LOC117740983
https://www.ncbi.nlm.nih.gov/gene/117726460
https://www.ncbi.nlm.nih.gov/search/all/?term=LOC117746502

ADD REPLY • link 4.9 years ago by sunnykevin97 ▴ 1000

0

The number after the LOC prefix is the Entrez Gene ID. You can access these entries at the URL https://www.ncbi.nlm.nih.gov/gene/{number}

For your examples,

LOC117740983    117740983    https://www.ncbi.nlm.nih.gov/gene/117740983
LOC117726460    117726460    https://www.ncbi.nlm.nih.gov/gene/117726460
LOC117746502    117746502    https://www.ncbi.nlm.nih.gov/gene/117746502
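
For a longer list, the same conversion can be scripted. This is only a minimal sketch, assuming one symbol per line on standard input. The function name `loc_to_entrez` is made up for illustration, and any symbol that is not "LOC" followed only by digits is skipped, since it does not encode an ID.

```shell
# Sketch: turn LOC symbols into Entrez Gene IDs and Gene page URLs.
# Assumes one symbol per line on stdin; anything that is not "LOC"
# followed only by digits is skipped (hypothetical helper name).
loc_to_entrez() {
  awk '/^LOC[0-9]+$/ {
    id = substr($0, 4)   # drop the 3-character "LOC" prefix
    printf "%s\t%s\thttps://www.ncbi.nlm.nih.gov/gene/%s\n", $0, id, id
  }'
}

# Example call; TP53 is dropped because it is not a LOC symbol.
printf 'LOC117740983\nTP53\nLOC117726460\n' | loc_to_entrez
```

The output is tab-separated (symbol, ID, URL), so the second column can be cut out and used wherever plain Entrez Gene IDs are expected.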

ADD REPLY • link 4.9 years ago by Ram 45k

0

I thought the same, thanks for the suggestions.

ADD REPLY • link 4.9 years ago by sunnykevin97 ▴ 1000

0

Hi, this is very useful, thanks. How would I go about running GO enrichment analysis with this list?

ADD REPLY • link 3.1 years ago by Lucía • 0

0

Since LOC genes are uncharacterized, there is likely no way to do GO enrichment analysis on them.

ADD REPLY • link 3.1 years ago by GenoMax 151k

score 2 · Answer 1 · 2020-07-11

2

4.9 years ago

vkkodali_ncbi ★ 3.8k

These don't appear to have useful gene symbols. But you can get names for these using Entrez Direct as follows:

esearch -db gene -query 'LOC117740983' | esummary | xtract -pattern DocumentSummary -element Id,Name,Description

ADD COMMENT • link 4.9 years ago by vkkodali_ncbi ★ 3.8k

0

Most LOC entries are uncharacterized loci, at least in the human and mouse genomes.

ADD REPLY • link 4.9 years ago by Ram 45k

0

How do I run the above command in batch mode? I have more than 1000 gene IDs in a file. Suggestions please.

ADD REPLY • link 4.8 years ago by sunnykevin97 ▴ 1000

5

Use the epost method. Put your queries (the numeric gene IDs) in a file, one per line.

$ more tt
117740983
117726460
117746502

$ epost -db gene -input tt | esummary | xtract -pattern DocumentSummary -element Id,Name,Description
117746502   LOC117746502    neuropilin-1a-like
117740983   LOC117740983    transmembrane protein 230-like
117726460   LOC117726460    NADH-cytochrome b5 reductase 3
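
If the file holds LOC symbols rather than bare numbers, the prefix can be stripped before posting. The following is a rough sketch rather than a tested pipeline. It assumes Entrez Direct (epost, esummary, xtract) is installed and on PATH, and the function names `loc_ids` and `loc_describe` are hypothetical.

```shell
# Sketch: describe every LOC symbol listed in a file (one symbol per line).
# Assumes Entrez Direct is installed; helper names are hypothetical.

# Print the unique numeric Gene IDs encoded in the LOC symbols.
# Lines that are not "LOC" followed only by digits are ignored.
loc_ids() {
  sed -n 's/^LOC\([0-9][0-9]*\)$/\1/p' "$1" | sort -u
}

# Post the IDs to the gene database and print Id, Name, Description.
# Needs network access; nothing is contacted until this is called.
loc_describe() {
  ids=$(mktemp) || return 1
  loc_ids "$1" > "$ids"
  epost -db gene -input "$ids" |
    esummary |
    xtract -pattern DocumentSummary -element Id,Name,Description
  rm -f "$ids"
}
```

A call might look like `loc_describe deg_symbols.txt > loc_descriptions.tsv`, where both file names are placeholders. As in the output above, rows may not come back in input order, so join them to your table on the Id column rather than by position.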

ADD REPLY • link 4.8 years ago by GenoMax 151k