I have a set of cDNA sequences representing mRNAs, and I want to create a GTF/GFF file from them. I have tried BLAST and BLAT, but I find it difficult to obtain a single reliable hit that covers all the exons of a transcript. With BLAT in particular, I get many small hits for each sequence. I also tried to map the sequences using TopHat, but for some reason it crashed, perhaps because the sequences are long.
Any suggestions on how to map such sequences to a genome with reliable information about the exon/intron junctions?
Thanks, I'll check it out. Does it also create an entry representing the entire gene in the GFF file?
Aligners do not really have the concept of a gene, only of a transcript. If you want to treat each transcript as a gene, that can be done pretty easily in a post-alignment step by adding a tag to the attributes column (column 9) of the resulting GFF.
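As a rough illustration of that post-alignment step, here is a minimal Python sketch. It assumes the aligner wrote GFF3 with `ID=` on the transcript line and `Parent=` on its exon lines; your aligner's output may differ, so check a few lines first. The function name `add_gene_tag` and the tag name `gene_id` are made up for this example, not something any particular tool requires.

```python
def add_gene_tag(gff_line, tag="gene_id"):
    """Append a gene tag to column 9 of one GFF3 feature line.

    Assumption: transcripts carry ID=..., exons carry Parent=...,
    and each transcript is treated as its own gene.
    """
    if gff_line.startswith("#") or not gff_line.strip():
        return gff_line  # pass comments and blank lines through unchanged
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return gff_line  # not a standard 9-column feature line
    # Parse column 9 into key/value pairs (GFF3 style: key=value;key=value)
    attrs = dict(
        field.split("=", 1) for field in cols[8].split(";") if "=" in field
    )
    # Exons point at their transcript via Parent; the transcript line has ID.
    transcript = attrs.get("Parent", attrs.get("ID"))
    if transcript is None or tag in attrs:
        return gff_line  # nothing to derive the tag from, or already tagged
    cols[8] = cols[8].rstrip(";") + ";" + tag + "=" + transcript
    return "\t".join(cols) + "\n"


# Example with two made-up feature lines for one transcript
lines = [
    "chr1\taligner\tmRNA\t100\t900\t.\t+\t.\tID=cdna1\n",
    "chr1\taligner\texon\t100\t300\t.\t+\t.\tParent=cdna1\n",
]
tagged = [add_gene_tag(line) for line in lines]
```

After this, both lines carry `gene_id=cdna1`, so downstream tools that group features by gene see one gene per transcript. If you need GTF rather than GFF3, the attribute syntax is different (`key "value";`) and the parsing would have to change accordingly.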