Downloaded, unpacked, and built tRNAscan-SE-1.3.1.
I had some problems trying to run the underlying program, trnascan-1.4.
First issue: it segfaults on a header like ">scf7180000558994" because its FASTA-reading code looks only for a space to terminate the name, and there isn't one. I changed that code to look for a space OR a '\n', and then it runs.
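For anyone who would rather not patch the source, here is a rough sketch of a workaround: pad any header that has no space so the name is followed by one. The helper name pad_fasta_headers and the placeholder word "x" are made up for illustration, and I am only assuming (from the behavior described above) that an unpatched trnascan-1.4 would then read the name cleanly.

```shell
# Hypothetical helper: append a space and a placeholder word to any FASTA
# header line that contains no space; all other lines pass through unchanged.
# Reads the named files, or stdin if none are given.
pad_fasta_headers() {
    awk '/^>/ && !/ / { print $0 " x"; next } { print }' "$@"
}

# Example usage (file names are placeholders):
#   pad_fasta_headers seqfile > seqfile.padded
#   trnascan-1.4 -o Lv_tRNAs.fa seqfile.padded
```

Headers that already contain a space are left alone, so the original names survive in the output.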
Second issue: most of the hits in the output have "nnnnnnnn" on one end or the other. I ran it like this for a while, forced an exit with ^C, and then examined what it had found as follows:
trnascan-1.4 -o Lv_tRNAs.fa seqfile
grep "potential tRNA sequence" Lv_tRNAs.fa | wc
8782 35128 1334837
grep "potential tRNA sequence" Lv_tRNAs.fa | grep nnn$ | wc
4818 19272 737638
grep "potential tRNA sequence" Lv_tRNAs.fa | grep '\ nnn' | wc
3604 14416 546644
grep "potential tRNA sequence" Lv_tRNAs.fa | grep -v 'nnn' | wc
146 584 18243
The input sequence has runs of N because it is a genomic scaffold. As near as I can tell, the program does something wrong when it hits NNN, resulting in what looks like about 57X more false hits than potentially real hits.
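The 57X figure is just the two polyN counts over the clean count, (4818 + 3604) / 146. A small sketch that repeats the tallies in one pass and works out the ratio is below; the function names tally_hits and n_hit_ratio are made up here, and the patterns simply mirror the greps above.

```shell
# Hypothetical helper: count hits ending in nnn, hits with " nnn" (a run of n
# right after a space), and hits with no nnn at all, in a trnascan-1.4 output file.
tally_hits() {
    grep "potential tRNA sequence" "$1" | awk '
        /nnn$/ { end++ }
        / nnn/ { start++ }
        !/nnn/ { clean++ }
        END { printf "%d %d %d\n", end, start, clean }'
}

# Hypothetical helper: ratio of polyN hits to clean hits.
# $1 = hits ending in nnn, $2 = hits with " nnn", $3 = hits without nnn
n_hit_ratio() {
    awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN { printf "%.1f\n", (a + b) / c }'
}

n_hit_ratio 4818 3604 146   # prints 57.7
```

To get the three counts from the real output, run tally_hits Lv_tRNAs.fa.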
Final issue: after filtering out all the matches with polyN on one end, the remainder consists of many (6 or more) different "takes" on the same sequence region. In these minor variants, various positions move a base or so one way or the other.
Is this all as it should be?
I take it that one would normally run tRNAscan-SE to avoid these issues? Does that wrapper script modify the names so that the underlying program doesn't segfault?
Thanks