Hi @ll!
From a paper, I have obtained a list of ~10 predicted sequences for specific proteins (predicted from transcriptomes). Now I would like to match these sequences against the reference genome to obtain annotation data (not only "position" but also scaffold ID, score, strand and frame - see https://www.ensembl.org/info/website/upload/gff.html ) that I can use to extend my existing annotation GTF file for this species. However, I do not know of any software that could be used for that purpose. Could someone here point me in the right direction? I would like to add that I am completely unfamiliar with Python, so "writing a custom Python script" is unfortunately not an option for me.
Thank you for your help!
Joe
You can use blast+ or blat to align those sequences back to the reference genome. If the genome is available at NCBI/Ensembl, you can do this using the appropriate web interface for BLAST. If not, you will need to do the search locally.
Looks like GeMoMa could work: http://www.jstacs.de/index.php/GeMoMa . It takes the protein sequences as input.
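If you go the BLAST route, the hits still have to be turned into GTF/GFF-style lines. Here is a minimal sketch of that step that you could run as-is without writing anything yourself. It assumes the hits are in BLAST's standard 12-column tabular format (as produced with `-outfmt 6`, e.g. by tblastn for protein queries against a nucleotide genome). The function name `blast6_to_gff`, the `source` and `feature` labels and the attribute layout are my own placeholders, not part of any tool. Note that the default tabular output has no frame column, so the frame field is left as ".", and that BLAST reports local hits, not full exon-accurate gene models.

```python
def blast6_to_gff(lines, source="blast", feature="match"):
    """Turn BLAST tabular (-outfmt 6) rows into 9-column GFF/GTF-style rows.

    Assumed column order (the -outfmt 6 default):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    out = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        qseqid, sseqid = cols[0], cols[1]
        sstart, send = int(cols[8]), int(cols[9])
        bitscore = cols[11].strip()
        # A hit on the reverse strand is reported with sstart > send.
        strand = "+" if sstart <= send else "-"
        start, end = min(sstart, send), max(sstart, send)
        # Hypothetical attribute layout: the query ID is reused for both IDs.
        attributes = f'gene_id "{qseqid}"; transcript_id "{qseqid}";'
        out.append("\t".join([
            sseqid, source, feature, str(start), str(end),
            bitscore, strand, ".", attributes,
        ]))
    return out


if __name__ == "__main__":
    # Made-up example hit, only to show the shape of the input and output.
    example = ["prot1\tscaffold_12\t98.5\t200\t3\t0\t1\t200\t6600\t6001\t1e-120\t410"]
    for row in blast6_to_gff(example):
        print(row)
```

To use it on a real result file, you would replace the `example` list with `open("hits.tsv")` (a hypothetical file name) and redirect the printed lines into a new file that you then append to your existing GTF.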