Hi all,
I am trying to obtain Entrez IDs for orthologous genes between two organisms using OMA. My steps are as follows:
1. OMA > Tools > Genome Pair Orthology
2. Select Organism 1, select Organism 2, and select "Entrez Gene IDs" for Preferred ID.
3. Download the provided file (File A) as a .tsv file. Its headers are "Organism 1", "Organism 2", "Orthology", "OMA Group". Organism 1's IDs all appear to be OMA IDs (I think; each is the organism's unique identifier followed by 5 digits), and Organism 2's IDs appear to be Entrez Gene IDs (I think).
4. Cross-reference Organism 1's IDs to UniProtKB Gene IDs (OMA > Search: Taxon ID: "Organism 1's unique identifier" > List Genes).
5. Download the list of genes with OMA and UniProtKB IDs (File B).
6. Merge Files A and B in R by OMA ID (File C). Headers are "Organism 1 (OMA IDs)", "Organism 1 (UniProtKB IDs)", "Organism 2 (Entrez IDs)", "Orthology", "OMA Group".
7. Merge File C with another file, File D (which holds extra information about Organism 1 that is not relevant to this post), by UniProtKB ID (File E). Headers are "Organism 1 (OMA IDs)", "Organism 1 (UniProtKB IDs)", "Organism 1 (extraneous info...)", "Organism 2 (Entrez IDs)", "Orthology", "OMA Group".
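To make the two merges described above concrete, here is a minimal sketch of the same joins. It is written in Python with only the standard library rather than R, and it assumes the merges are plain inner joins (the default behaviour of R's merge). The column names (oma_id, uniprot_id, entrez_id, extra_info) and every row value are made-up placeholders, not real OMA output.

```python
import csv
import io
from collections import defaultdict


def read_tsv(text):
    """Parse tab-separated text into a list of dict rows keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def inner_join(left, right, key):
    """Inner-join two lists of dict rows on a shared column.

    Rows whose key value is missing from the other table are dropped.
    """
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            joined.append({**lrow, **rrow})
    return joined


# Placeholder data standing in for Files A, B and D (hypothetical values).
file_a = read_tsv(
    "oma_id\tentrez_id\torthology\toma_group\n"
    "ORGA00001\t111\t1:1\t500\n"
    "ORGA00002\t222\t1:1\t501\n"
)
file_b = read_tsv(
    "oma_id\tuniprot_id\n"
    "ORGA00001\tP00001\n"
    "ORGA00003\tP00003\n"
)
file_d = read_tsv(
    "uniprot_id\textra_info\n"
    "P00001\tsome annotation\n"
)

file_c = inner_join(file_a, file_b, "oma_id")      # File A + File B
file_e = inner_join(file_c, file_d, "uniprot_id")  # File C + File D

print(len(file_a), len(file_c), len(file_e))
```

In this toy data, ORGA00002 is in File A but not in File B, so it is absent from Files C and E. The same thing happens at full scale: if File B holds only 100 genes, an inner join can never return more than those 100 genes of Organism 1, however complete File A is.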
My problem is in step 5 (downloading File B). The organism of interest has over 4000 genes, but the list that I download from OMA contains only 100 genes. Is there a way to download the entire list of 4000+ genes?
Thank you, Christy