6.3 years ago
tim.ivanov.92
What is the easiest way to transform local coordinates on a protein/transcript to genomic coordinates (given accession number)?
Maybe something like this will be helpful?
Mapping between genome, transcript and protein coordinates
https://bioconductor.org/packages/devel/bioc/vignettes/ensembldb/inst/doc/coordinate-mapping.html
It can, but do you know if it is possible to switch automatically between different organisms/assemblies? It seems like the same problem as with cruzdb: you can easily do it if you know the organism and assembly, but if you only have a BLAST record, can you generate a URL for the reference?
Please paste what you have currently and what you would like it to be. You have python, cruzdb, and ucsc as tags - please elaborate why.
CruzDB: https://github.com/brentp/cruzdb
Calling brentp
1) I ran a tblast search with a certain protein and have a list of alignments to my reference protein.
2) Using the accession numbers of these proteins, I downloaded their coding sequences (through Biopython's functions for fetching GenBank records by accession) and aligned those instead of the proteins.
3) Knowing the annotation of these transcripts (through cruzdb), I can extract exon coordinates and find any region of interest inside the transcript alignments, but I'm interested in finding the corresponding regions inside their genomes.
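To make step 3 concrete, here is a minimal sketch of the coordinate arithmetic I have in mind, independent of cruzdb. The function names and the coordinate convention (0-based, half-open exon intervals) are my own assumptions for illustration, and the exon list is assumed to contain only the coding parts of each exon when the offset is a CDS offset.

```python
from typing import List, Tuple


def protein_to_cds(aa_index: int) -> Tuple[int, int]:
    """Return the 0-based, half-open CDS interval of the codon
    encoding the residue at 0-based index `aa_index`."""
    return 3 * aa_index, 3 * aa_index + 3


def transcript_to_genome(offset: int,
                         exons: List[Tuple[int, int]],
                         strand: str) -> int:
    """Map a 0-based offset in the spliced sequence to a 0-based
    genomic position.

    `exons` are genomic (start, end) intervals, 0-based and half-open
    (an assumption of this sketch). On the minus strand the spliced
    sequence starts at the end of the right-most exon.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # Walk the exons in transcription order.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = offset
    for start, end in ordered:
        length = end - start
        if remaining < length:
            if strand == "+":
                return start + remaining
            return end - 1 - remaining
        remaining -= length
    raise ValueError("offset lies beyond the end of the transcript")
```

For example, with exons `[(100, 110), (200, 210)]` on the plus strand, offset 10 is the first base of the second exon (genomic position 200); on the minus strand the same offset is the last base of the left-most exon (genomic position 109).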
So my question is actually twofold. First, is it possible to obtain genome regions from BLAST hits automatically? The second part is about cruzdb functionality: I can download any genome region if I have a URL:
But can I programmatically transform an accession number into a link to the reference genome?
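To illustrate what I mean by generating the link, here is a rough sketch. It assumes the UCSC DAS URL pattern (with 1-based, inclusive coordinates) and a hand-written organism-to-assembly table; the table, the function name, and the choice of assemblies are hypothetical, and filling that table automatically from an accession is exactly the part I don't know how to do.

```python
# Assumed UCSC DAS URL pattern for fetching DNA from a named assembly.
UCSC_DAS = ("http://genome.ucsc.edu/cgi-bin/das/{db}/dna"
            "?segment={chrom}:{start},{end}")

# Hypothetical, hand-maintained lookup from the organism name found in
# a GenBank record to a UCSC assembly name.
ASSEMBLY_BY_ORGANISM = {
    "Drosophila melanogaster": "dm3",
    "Homo sapiens": "hg19",
}


def das_url(organism: str, chrom: str, start: int, end: int) -> str:
    """Build a DAS URL for a genomic region of the given organism.

    Raises KeyError if the organism is not in the lookup table.
    """
    db = ASSEMBLY_BY_ORGANISM[organism]
    return UCSC_DAS.format(db=db, chrom=chrom, start=start, end=end)
```

With this, `das_url("Drosophila melanogaster", "chr2L", 1000, 2000)` gives a dm3 link, but only because I typed the organism-to-assembly pair in by hand.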
Maybe https://mutalyzer.nl/position-converter if we talk about human?
Unfortunately, we are talking about fruit fly (dm3), and I also need a command-line tool; a web server would be hard to integrate into a pipeline.