The OMA browser is very useful for my ortholog analysis work. However, I am stuck on ID mapping. I looked through the identifier mapping files but could not find what I need. Specifically, how can I convert a large list of Oryza sativa UniProt IDs (for example, Q10M50) to RGAP IDs (for example, LOC_Os03g20700)? Thanks!
The OMA browser has no direct mapping between UniProtKB/TrEMBL accessions and RGAP IDs. However, you can download two mapping files from the download section of the current release: the plant mapping and the mapping to UniProt. Both are keyed by OMA ID, so joining them on that shared OMA ID gives you a direct mapping from UniProt to RGAP. In Python you could do something like this to produce a mapping file from UniProt IDs to the plant IDs:
import csv
import collections

# The code assumes both files are tab-separated with the OMA ID in column 1
# and the external ID (UniProt or plant) in column 2.
with open('oma-uniprot.txt', 'r') as up, open('oma-plants.txt', 'r') as plant, open('up-plants.txt', 'w') as out:
    # Skip the '#' comment lines in each file.
    up_reader = csv.reader((row for row in up if not row.startswith('#')), delimiter='\t')
    plant_reader = csv.reader((row for row in plant if not row.startswith('#')), delimiter='\t')
    out_writer = csv.writer(out, delimiter='\t')
    # OMA ID -> all UniProt IDs mapped to it
    oma2up = collections.defaultdict(list)
    for row in up_reader:
        oma2up[row[0]].append(row[1])
    # Join on the shared OMA ID and write one UniProt/plant ID pair per line.
    for row in plant_reader:
        if row[0] in oma2up:
            for up_id in oma2up[row[0]]:
                out_writer.writerow([up_id, row[1]])
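Once up-plants.txt exists, converting your own list could look roughly like the following. This is only a minimal sketch: `load_mapping` and `translate` are helper names made up for illustration, the sample rows are placeholders rather than lines taken from the OMA files, and the `LOC_Os` prefix filter assumes the plant mapping also contains non-RGAP identifiers that you want to drop.

```python
import collections
import csv
import io


def load_mapping(lines):
    """Read tab-separated (UniProt ID, plant ID) rows into a dict of lists."""
    mapping = collections.defaultdict(list)
    for row in csv.reader(lines, delimiter='\t'):
        if len(row) >= 2:  # ignore blank or malformed lines
            mapping[row[0]].append(row[1])
    return mapping


def translate(uniprot_ids, mapping, prefix=''):
    """Pair each UniProt ID with its plant IDs that start with `prefix`.

    IDs without a match get an empty list, so nothing is dropped silently.
    """
    return [(uid, [pid for pid in mapping.get(uid, []) if pid.startswith(prefix)])
            for uid in uniprot_ids]


# Illustrative stand-in for up-plants.txt; with the real file, use
# open('up-plants.txt', newline='') in place of io.StringIO(...).
sample = io.StringIO('Q10M50\tLOC_Os03g20700\nQ10M50\tSOME_OTHER_ID\n')
mapping = load_mapping(sample)

# Print one line per query ID, with '-' where no RGAP ID was found.
for uid, rgap_ids in translate(['Q10M50', 'NOT_IN_FILE'], mapping, prefix='LOC_Os'):
    print(uid, ','.join(rgap_ids) or '-', sep='\t')
```

Keeping unmatched IDs in the output with an empty result makes it easy to see which accessions have no entry in either OMA file.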
Thanks @adrian.altenhoff. Those source files were exactly what I was looking for!