Finding single-organism orthologs using NCBI Gene ID
3.8 years ago
sysboolean ▴ 90

Hi all,

I am reanalyzing a published RNA-seq dataset of a non-model organism (eel) and I have run into a roadblock regarding gene identifiers and finding orthologs.

I generally use Ensembl BioMart to find one-to-one orthologs between the query and target species (e.g. dog -> human/mouse), and then use these target identifiers for GO/GSEA/IPA downstream of differential expression analysis. However, the current dataset is from eel, which is not in Ensembl, so I only have NCBI Gene IDs. My question is:

Given Gene IDs, how do I find species-specific orthologs for approx. 6000 genes (preferably by scripting)? I would like to find the zebrafish/human/mouse orthologs of the eel genes for GO/GSEA/IPA. I checked the gene_orthologs flat file at https://ftp.ncbi.nlm.nih.gov/gene/DATA/ but my taxonomy ID of interest (7936) is not present. While some eel genes have symbols that are identical to human/mouse, many of the genes in the eel genome are annotated as LOC + Gene ID (e.g. LOC118212896). The gene description has a "-like" suffix (e.g. hexokinase-4-like), so there is high homology to genes in other annotated genomes, but not enough for an exact assignment. If I can get the human/mouse/zebrafish ortholog gene symbols for these "-like" genes, that would let me do everything else I need to do downstream.
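
For reference, the gene_orthologs check described above can be scripted roughly as follows. This is only a sketch: it assumes the file is tab-separated with five columns (tax_id, GeneID, relationship, Other_tax_id, Other_GeneID) after a `#` header line, and the function name `orthologs_for_taxon` and the sample rows are made up for illustration.

```python
import csv
from collections import defaultdict


def orthologs_for_taxon(lines, tax_id):
    """Collect ortholog pairs that involve `tax_id` on either side of a row.

    `lines` is any iterable of tab-separated text lines (an open file works).
    Assumed column order: tax_id, GeneID, relationship, Other_tax_id, Other_GeneID.
    Returns {gene_id: {(other_tax_id, other_gene_id), ...}}.
    """
    hits = defaultdict(set)
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines, the header line, and malformed rows.
        if not row or row[0].startswith("#") or len(row) < 5:
            continue
        tax, gene, _relationship, other_tax, other_gene = row[:5]
        if tax == tax_id:
            hits[gene].add((other_tax, other_gene))
        elif other_tax == tax_id:
            hits[other_gene].add((tax, gene))
    return dict(hits)


# Made-up rows in the assumed layout (not real Gene IDs).
sample_rows = [
    "#tax_id\tGeneID\trelationship\tOther_tax_id\tOther_GeneID",
    "9606\t111\tOrtholog\t10090\t222",
    "9606\t111\tOrtholog\t7955\t333",
]

# An empty result means the taxon does not appear in the file at all.
print(orthologs_for_taxon(sample_rows, "7936"))
```

With a local copy of the real file, the same function could be called on a handle opened with `gzip.open(path, "rt")`, where `path` points to wherever the download was saved.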

Thank you for any help!

RNA-Seq NCBI orthologs
3.8 years ago

Hi sysboolean,

There are a few possible options for finding the orthologs of your eel genes:

  1. Infer orthology between your eel genome and your target species (mouse, rat, zebrafish, etc.) using the OMA standalone tool. First, download precomputed data for your target species using the export widget on the OMA Browser (https://omabrowser.org/oma/export/), then follow the instructions to add your eel genome and run OMA standalone on it (see https://omabrowser.org/standalone/ for more information). The output files will then contain the pairwise orthologs between all pairs of genomes.
  2. Map your eel sequences to their closest matches in the OMA Browser, then use the closest matching genes as proxies to get orthologs in any organism present in the database. This feature is available as an online tool at https://omabrowser.org/oma/fastmapping/
  3. Infer GO annotations directly from your raw sequences. You can use the GO prediction online tool in OMA (https://omabrowser.org/oma/functions/) to infer GO annotations for your whole genome at once.
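
To make the proxy idea in option 2 concrete, here is a minimal sketch of the join step once you have the two tables in hand. It does not call any OMA API or parse any real OMA output format; the function name `orthologs_via_proxy`, the dictionary layout, the species labels, and all identifiers are hypothetical placeholders.

```python
def orthologs_via_proxy(closest_match, proxy_orthologs, target_species):
    """Assign target-species orthologs to query genes through a proxy gene.

    closest_match:   {query_gene: proxy_gene}, e.g. from a closest-match mapping
    proxy_orthologs: {proxy_gene: {species: [ortholog, ...]}}
    Returns {query_gene: sorted list of orthologs in target_species};
    the list is empty when the proxy is unknown or has no ortholog there.
    """
    result = {}
    for query_gene, proxy in closest_match.items():
        by_species = proxy_orthologs.get(proxy, {})
        result[query_gene] = sorted(by_species.get(target_species, []))
    return result


# Hypothetical inputs: eel genes mapped to their closest database entry,
# and the known orthologs of each of those entries.
closest = {"eel_gene_1": "proxy_A", "eel_gene_2": "proxy_B", "eel_gene_3": "proxy_C"}
known = {
    "proxy_A": {"human": ["HGENE1"], "zebrafish": ["zgene1"]},
    "proxy_B": {"human": ["HGENE3", "HGENE2"]},
}

for gene, hits in orthologs_via_proxy(closest, known, "human").items():
    print(gene, hits)
```

Genes that come back with an empty list, or with more than one ortholog, are worth flagging separately, since only unambiguous assignments correspond to the one-to-one orthologs asked about in the question.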

If you have any problems using the tools, please read https://omabrowser.org/oma/tools/ or contact us again.

Clement

Hi Clement, thanks for the tips! I will try this and come back to accept the answer when it works.
