8.8 years ago
ahaswer
Hello, could you recommend a tool for mapping transcripts to a reference genome? By transcripts I mean FASTA sequences 150-8000 bp long. I've tried GMAP and BLAT, but the results were not satisfactory.
What was unsatisfactory about the results? It is normal for a percentage of de novo assembled transcripts to not map to the genome, or to map only partially, depending on how complete or fragmented the genome is and on the coverage of your transcripts.
Exonerate (http://www.ebi.ac.uk/about/vertebrate-genomics/software/exonerate-user-guide) with the est2genome model should help. It can also write its results in GFF format. Hope this helps.
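For anyone who wants to script this, here is a minimal sketch of how the Exonerate call could be assembled from Python. The function name and the file names are hypothetical, and the choice of `--bestn 1` (report only the best hit per transcript) is my assumption rather than something suggested above; check the flags against the user guide for your installed version.

```python
def build_exonerate_command(transcripts_fasta, genome_fasta, best_n=1):
    """Build an argument list for a spliced transcript-to-genome alignment.

    The file paths are placeholders supplied by the caller; best_n is an
    assumed setting that limits the report to the top hits per transcript.
    """
    return [
        "exonerate",
        "--model", "est2genome",    # spliced alignment of cDNA/EST to genome
        "--showtargetgff", "yes",   # also emit GFF on the genome coordinates
        "--bestn", str(best_n),     # keep only the best N hits per query
        "--query", transcripts_fasta,
        "--target", genome_fasta,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_exonerate_command("transcripts.fa", "genome.fa")))
```

The list can be handed to `subprocess.run(cmd, check=True)` once Exonerate is on your PATH. Passing a list rather than a single shell string avoids quoting problems with unusual file names.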
With GMAP or BLAT, only 5-8% of all transcripts map to the genome, and most of those are rather short. The reference genome is 100% complete, without any fragmentation, and the coverage of the transcripts is sufficient. I just wonder whether there is any dedicated software for such long sequences. I will also try Exonerate, as mini.bioinfos suggested.
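To put a number like "5-8%" on firmer ground, it may help to check how much of each transcript its best hit actually covers. Below is a rough sketch that reads BLAT's tab-separated PSL output and reports, per transcript, the largest fraction of its length spanned by a single alignment. The function names and the 0.9 coverage threshold are my own assumptions, not part of BLAT, and the total transcript count has to come from your FASTA file.

```python
def best_query_coverage(psl_lines):
    """Return {transcript name: best fraction of its length spanned by one hit}."""
    best = {}
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the PSL header block and any malformed line:
        # alignment rows have 21 columns and start with a match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        q_name = fields[9]            # query (transcript) name
        q_size = int(fields[10])      # transcript length
        q_start = int(fields[11])     # alignment start on the transcript
        q_end = int(fields[12])       # alignment end on the transcript
        if q_size == 0:
            continue
        coverage = (q_end - q_start) / q_size
        best[q_name] = max(best.get(q_name, 0.0), coverage)
    return best


def fraction_mapped(best, total_transcripts, min_coverage=0.9):
    """Fraction of all transcripts whose best hit spans at least min_coverage."""
    mapped = sum(1 for cov in best.values() if cov >= min_coverage)
    return mapped / total_transcripts
```

Typical use would be `best = best_query_coverage(open("hits.psl"))`, where `hits.psl` is a placeholder name. Many hits with low coverage point to partial alignments rather than to transcripts that are missing from the genome altogether.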
I think you can take a look at the question below: "fasta as input file : mapping sequences to a genome"
However, I would echo Damian: if it is a de novo assembly, partial mapping is a likely scenario, and it really depends on how fragmented the genome is and how deep your sequencing coverage is. The BWA aligner also works readily with FASTA sequences. If you do not have large transcripts, you can use BLAT to map them directly to the genome at UCSC. Take a look at the link and see the responses.