I have a UniProt canonical sequence and its UniProt ID (I also have the gene name). Is there an easy way, for example with Biopython, to get the transcript sequence for a UniProt protein sequence? I saw that one UniProt entry can have many corresponding Ensembl transcripts. I just need the one corresponding to the canonical sequence. Is there an easy way to find it?
Thanks
Thanks, Elisabeth! I saw that "Sequence databases" are listed as cross-references. Is it possible to retrieve the nucleotide sequence through those sequence databases? I only need rough results, so the nucleotide sequence does not have to match every residue. I want to find all possible mutations of a protein starting from a transcript.
Sorry for not replying earlier.
You can indeed follow the link to EMBL/GenBank/DDBJ, but the problem is that there can be more than one such link and you may have difficulty choosing one, as described in the help document cited above. If you do not mind which nucleotide sequence(s) you retrieve, you can of course use the protein_id or nucleotide accession number from the cross-referenced sequence databases to retrieve the sequence from the nucleotide sequence databases. In case of doubt, I suggest that you contact EMBL/GenBank/DDBJ directly about the best way to access their data programmatically.
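For anyone who wants to try this from Python, here is a minimal sketch of the approach described above. It assumes you already have the entry in UniProt flat-file (text) format, where each EMBL cross-reference is a `DR   EMBL;` line listing the nucleotide accession, the protein_id, a status field and the molecule type. The helper names (`parse_embl_xrefs`, `prefer_mrna`, `fetch_nucleotide_fasta`) are made up for illustration, and the sample lines in the usage note only illustrate the format. The download step uses Biopython's `Bio.Entrez.efetch` against NCBI's `nuccore` database, which is one possible route among several, not an officially recommended one.

```python
from typing import List, NamedTuple


class EmblXref(NamedTuple):
    nucleotide_acc: str   # EMBL/GenBank/DDBJ nucleotide accession
    protein_id: str       # protein_id of the CDS, or "-" if absent
    molecule_type: str    # e.g. "mRNA" or "Genomic_DNA"


def parse_embl_xrefs(uniprot_flat_text: str) -> List[EmblXref]:
    """Collect the EMBL cross-references from a UniProt flat-file entry."""
    xrefs = []
    for line in uniprot_flat_text.splitlines():
        if not line.startswith("DR   EMBL;"):
            continue
        # Drop the "DR   " prefix and the trailing period, then split on ";".
        fields = [f.strip() for f in line[5:].strip().rstrip(".").split(";")]
        if len(fields) < 5:
            continue  # unexpected layout, skip rather than guess
        _, nucleotide_acc, protein_id, _status, molecule_type = fields[:5]
        xrefs.append(EmblXref(nucleotide_acc, protein_id, molecule_type))
    return xrefs


def prefer_mrna(xrefs: List[EmblXref]) -> List[EmblXref]:
    """Put mRNA records first; genomic records also contain introns."""
    return sorted(xrefs, key=lambda x: x.molecule_type != "mRNA")


def fetch_nucleotide_fasta(nucleotide_acc: str, email: str):
    """Download one nucleotide record from NCBI (needs Biopython and network)."""
    from Bio import Entrez, SeqIO  # imported here so parsing works without Biopython

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(
        db="nuccore", id=nucleotide_acc, rettype="fasta", retmode="text"
    )
    try:
        return SeqIO.read(handle, "fasta")
    finally:
        handle.close()
```

Usage would look roughly like this, with `entry_text` holding the flat-file text of your entry:

```python
entry_text = (
    "DR   EMBL; X54156; CAA38095.1; -; Genomic_DNA.\n"
    "DR   EMBL; X02469; CAA26306.1; -; mRNA.\n"
)
candidates = prefer_mrna(parse_embl_xrefs(entry_text))
# record = fetch_nucleotide_fasta(candidates[0].nucleotide_acc, "you@example.org")
```

Note that the record you get back is the whole nucleotide entry (an mRNA record typically includes UTRs), not just the coding region, and as said above there is no guarantee that any one of the cross-referenced records matches the canonical isoform exactly.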