Dear Biostars,
I would like to use the Sorghum Expression Atlas data (E-MTAB-3839) downloaded from https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-3839/. I have previously downloaded Sorghum data from other databases such as Phytozome. The problem is that the gene IDs in the EMBL data differ from those in the data downloaded from other sources.
Gene ID example from other databases: Sobic.001G000100
Gene ID example from the ArrayExpress data: SORBI_3003G276100
Is there any way to convert or map the EMBL/ArrayExpress gene IDs to Phytozome Sorghum gene IDs? Kindly help me resolve this issue. Thank you.
The correct observation should be: the IDs from the others are different from those from EBI/EMBL ;)
The best thing to do is to look in Phytozome (?) to see whether they offer alternative IDs. Otherwise, have a look at the locus_tag info in the EMBL data; that one should (in theory) better reflect the IDs used by other databases.
Thanks for your suggestions. Unfortunately, neither of them worked for the Sorghum data.
OK, if you can't find a textual link between the two IDs, you can probably only fall back on creating it yourself.
One approach is: get the CDS FASTA files of the annotations from both EMBL and Phytozome, blastn them against each other, and then create a correspondence table for the IDs. This will work in most cases, but it is unlikely to be a 100% watertight approach.
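The correspondence-table step could look something like the sketch below. It is only an illustration, not a tested pipeline: it assumes blastn was run in both directions with tabular output (`-outfmt 6`, the standard 12 columns, with percent identity in column 3 and bit score in column 12), and it keeps only reciprocal best hits. The function names, the 95% identity cutoff, and the idea of requiring reciprocity are my own assumptions; adjust them to your data. Note also that CDS FASTA headers are often transcript-level IDs, so you may need to collapse them to gene IDs afterwards.

```python
import csv


def best_hits(tabular_lines, min_identity=95.0):
    """Keep the highest-bitscore hit per query from BLAST tabular (-outfmt 6) lines.

    Returns {query_id: (subject_id, bitscore)}.
    min_identity is an arbitrary, assumed cutoff (percent identity).
    """
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, bitscore = float(row[2]), float(row[11])
        if identity < min_identity:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(forward, reverse):
    """Pair IDs only when each one is the other's best hit.

    forward: best_hits() of EMBL CDS vs Phytozome CDS
    reverse: best_hits() of Phytozome CDS vs EMBL CDS
    Returns {embl_id: phytozome_id}.
    """
    pairs = {}
    for query, (subject, _score) in forward.items():
        if subject in reverse and reverse[subject][0] == query:
            pairs[query] = subject
    return pairs
```

With the two BLAST result files opened as `fwd` and `rev` (hypothetical names), `reciprocal_best_hits(best_hits(fwd), best_hits(rev))` gives a dictionary that can be written out as a two-column ID table. Genes that end up without a partner (paralogs, split or merged gene models) are the cases where this approach is not watertight and need a manual look.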
I will try this approach. Thank you.