Are there any good programs or scripts for analyzing assembled transcripts? I imagine something like a script that runs blastn, blastx and tblastx on each transcript and reports the best hit. That wouldn't be too hard to write, but I don't want to reinvent the wheel. I am also concerned that the highest-scoring hit reported by BLAST sometimes has a lot of gaps and is not the right result, while a lower-scoring, shorter hit is more likely correct. The only way I can think of to determine this accurately is to check by hand.
The genome I am interested in is poorly annotated, and the annotation is particularly bad in my region of interest, so just using a reference GTF with my Cufflinks transcripts would not be very helpful.
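To make the "gappy top hit" concern concrete, here is a minimal sketch of one way to automate the check, assuming the BLAST results are saved in the default 12-column tabular format (`-outfmt 6`). It drops hits with too many gap openings per aligned column and then keeps the highest bitscore per query. The function name `best_hits` and the cutoff values are made up for illustration, not recommended settings.

```python
import csv
from io import StringIO

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_gap_rate=0.02, min_pident=0.0):
    """Return {query_id: row} holding the top-bitscore hit that passes the filters.

    max_gap_rate is gap openings per aligned column. The default of 0.02 is an
    arbitrary illustrative cutoff (an assumption), not a recommended value.
    """
    best = {}
    for rec in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the comment lines written by -outfmt 7.
        if not rec or rec[0].startswith("#"):
            continue
        row = dict(zip(FIELDS, rec))
        length = int(row["length"])
        gap_rate = int(row["gapopen"]) / length if length else 1.0
        if gap_rate > max_gap_rate or float(row["pident"]) < min_pident:
            continue
        row["bitscore"] = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


# Made-up example: a long, gappy hit with a higher score and a short, clean hit.
example = (
    "TCONS_1\tsubjA\t85.0\t1000\t110\t40\t1\t990\t5\t1000\t0.0\t900\n"
    "TCONS_1\tsubjB\t98.0\t400\t7\t1\t1\t400\t10\t409\t1e-180\t600\n"
)
for query, hit in best_hits(StringIO(example)).items():
    print(query, hit["sseqid"], hit["bitscore"])
```

With the default cutoff the gappy hit is filtered out and the shorter one is reported. Raising `max_gap_rate` to 1.0 gives back plain best-bitscore behaviour, so the two rankings can be compared and only the disagreements checked by hand.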
I forgot to check back here, sorry. Someone on SEQanswers pointed me to Blast2GO, which does what I want but is unfortunately quite slow and cumbersome. I am currently working on a (crappy) Python script to do this for me, for now with Cuffdiff output. My approach is:
1) Take only the significant entries in isoform_sig.diff and convert them to a BED file, treating each TCONS as a gene (one line per transcript).
2) Use fastaFromBed from BEDTools to create a FASTA file.
3) Run that through my script, which uses NCBIWWW.qblast in Biopython to slowly BLAST each sequence against the database of my choice at NCBI.
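Step 1 could look roughly like the sketch below. It assumes the usual Cuffdiff .diff layout: a tab-separated table with a header row containing `test_id`, `locus` (written as `chrom:start-end`) and `significant` (`yes`/`no`). The function name `sig_isoforms_to_bed` is hypothetical, and the `one_based` switch is an assumption about the locus coordinates that should be checked against your own data before trusting the BED starts.

```python
import csv
from io import StringIO


def sig_isoforms_to_bed(handle, one_based=True):
    """Yield one 4-column BED line per significant row of a Cuffdiff .diff table."""
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["significant"].strip().lower() != "yes":
            continue
        # The locus looks like "chr1:100-500"; split on the last colon only.
        chrom, span = row["locus"].rsplit(":", 1)
        start, end = (int(x) for x in span.split("-"))
        if one_based:
            # Assumption: the locus start is 1-based, while BED starts are 0-based.
            start -= 1
        yield f"{chrom}\t{start}\t{end}\t{row['test_id']}"


# Made-up two-row table in the assumed column order.
example = (
    "test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\t"
    "value_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n"
    "TCONS_00000001\tXLOC_000001\t-\tchr1:100-500\tq1\tq2\tOK\t1\t8\t3\t4.2\t"
    "0.0001\t0.001\tyes\n"
    "TCONS_00000002\tXLOC_000002\t-\tchr2:900-1500\tq1\tq2\tOK\t5\t5\t0\t0.1\t"
    "0.9\t0.95\tno\n"
)
for bed_line in sig_isoforms_to_bed(StringIO(example)):
    print(bed_line)
```

The lines can be written to a file and passed to fastaFromBed with the `-name` option, so that each FASTA header carries the TCONS id from the fourth column.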