Python Framework For Converting Genomic To Protein Coordinates
12.3 years ago
user ▴ 950

Is there a framework for converting between genomic coordinates and protein coordinates, given a transcript (i.e. a list of exon coordinates)? In particular, I want to go from CDS coordinates to amino-acid coordinates.

The trick is to do this correctly even when the transcript is on the minus strand, where the highest genomic coordinate (not the lowest) corresponds to the first amino acid.

I have seen the BioPython-related tools (like this http://biopython.org/wiki/Coordinate_mapping and this https://gist.github.com/3172753), but I'd prefer not to depend on all of BioPython just for the coordinate transform. I am also not sure how BioPython handles strandedness.

Apparently PyGr can do this for ORF-containing transcripts, but I've never seen an example and cannot work out from the documentation how it is done.

Any pointers to frameworks that do this correctly, or to worked examples, would be helpful.

biopython protein coordinates python • 5.2k views
12.3 years ago

For the protein->genomic algorithm, using the UCSC knownGene database, see this Java code: http://plindenbaum.blogspot.fr/2011/03/mapping-mutation-on-protein-to-genome.html
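
Since the question asks for Python, here is a minimal sketch of the protein->genomic direction. It is not a port of the Java code linked above, and it makes several assumptions: the coding exons are given as 0-based, half-open `(start, end)` genomic intervals already trimmed to the CDS, the strand is `"+"` or `"-"`, and amino acids are numbered from 1. The function names `cds_to_genomic` and `codon_genomic_positions` are hypothetical, chosen only for illustration.

```python
def cds_to_genomic(cds_offset, cds_exons, strand):
    """Map a 0-based CDS offset to a 0-based genomic position.

    cds_exons: list of (start, end) half-open genomic intervals (assumed
    to contain coding sequence only). strand: "+" or "-".
    """
    if cds_offset < 0:
        raise ValueError("CDS offset must be non-negative")
    # Walk the exons in transcription order: ascending on the plus
    # strand, descending on the minus strand.
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    remaining = cds_offset
    for start, end in ordered:
        length = end - start
        if remaining < length:
            if strand == "+":
                return start + remaining
            # On the minus strand the exon is read from its high end.
            return end - 1 - remaining
        remaining -= length
    raise ValueError("CDS offset lies beyond the end of the coding sequence")


def codon_genomic_positions(aa_index, cds_exons, strand):
    """Return the three genomic positions of the codon for a 1-based residue."""
    first = (aa_index - 1) * 3
    return [cds_to_genomic(first + i, cds_exons, strand) for i in range(3)]


if __name__ == "__main__":
    exons = [(100, 104), (200, 208)]  # toy CDS of 12 bases (4 codons)
    print(codon_genomic_positions(2, exons, "+"))
    print(codon_genomic_positions(3, exons, "-"))
```

Returning all three positions rather than a single interval matters because a codon can be split by an intron, as the second codon on the plus strand is in the toy example.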

For the genomic->protein algorithm, see this previous post: "How to calculate the protein change and codon position within a nucleotide sequence of a single nucleotide substitution?"
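
For the genomic->protein direction, a minimal Python sketch could look like the following. It rests on the same kind of assumptions as any hand-rolled mapper: coding exons are 0-based, half-open `(start, end)` genomic intervals trimmed to the CDS, the strand is `"+"` or `"-"`, and residues are numbered from 1. The names `genomic_to_cds` and `genomic_to_protein` are hypothetical. The minus strand is handled by walking the exons from the highest coordinate down and measuring the offset from the high end of each exon.

```python
def genomic_to_cds(pos, cds_exons, strand):
    """Map a 0-based genomic position to a 0-based CDS offset.

    Returns None when the position falls outside every coding exon
    (for example, inside an intron).
    """
    # Transcription order: ascending for "+", descending for "-".
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    offset = 0  # coding bases consumed by the exons already passed
    for start, end in ordered:
        if start <= pos < end:
            if strand == "+":
                return offset + (pos - start)
            # Minus strand: the first transcribed base is end - 1.
            return offset + (end - 1 - pos)
        offset += end - start
    return None


def genomic_to_protein(pos, cds_exons, strand):
    """Return (1-based residue index, 0-based position within the codon)."""
    cds_offset = genomic_to_cds(pos, cds_exons, strand)
    if cds_offset is None:
        return None
    return cds_offset // 3 + 1, cds_offset % 3


if __name__ == "__main__":
    exons = [(100, 106), (200, 206)]  # toy CDS of 12 bases (4 codons)
    print(genomic_to_protein(205, exons, "-"))
    print(genomic_to_protein(100, exons, "-"))
```

With the toy exons, genomic position 205 is the first base of the first codon on the minus strand, while position 100 is the last base of the last codon, which is the behaviour the question asks about.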
