I am running a series of deep comparisons with lastz between a few distantly related species. I find quite a few conserved sequences, which I would like to classify as coding vs. non-coding. This is relatively straightforward using the gene predictions. Among the non-coding sequences, I can already see quite a lot of tRNAs, centromeric repeats, and some ancient retrotransposons. In other words, the sequences can be further divided into sub-classes.
The only approach I can think of is to compare these sequences against the NCBI nr database and hope to get some textual annotations. Is there a better way?
I am not sure how this will affect lastz's ability to detect conserved stretches of DNA, but what about running lastz on genomes that are already repeat-masked? That way you should still get the coding sequences, but the non-coding ones should be more interesting than common repeats. You could also cluster the sequences with UCLUST or CD-HIT before BLASTing them.
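If you would rather keep the unmasked alignments and sort out the repeats afterwards, here is a minimal sketch of that idea. It assumes the conserved sequences were extracted from a soft-masked genome (repeats in lowercase, for example from RepeatMasker run with `-xsmall`) or a hard-masked one (repeats as N). The function names, the example sequences, and the 0.5 cutoff are made up for illustration and would need tuning on real data.

```python
def masked_fraction(seq: str) -> float:
    """Fraction of bases that are soft-masked (lowercase) or hard-masked (N)."""
    if not seq:
        return 0.0
    masked = sum(1 for base in seq if base.islower() or base in "Nn")
    return masked / len(seq)


def split_by_masking(seqs: dict[str, str], max_masked: float = 0.5):
    """Separate sequences into mostly-unique and mostly-repeat groups.

    max_masked is an arbitrary cutoff chosen for this sketch.
    """
    unique, repeat = {}, {}
    for name, seq in seqs.items():
        target = repeat if masked_fraction(seq) > max_masked else unique
        target[name] = seq
    return unique, repeat


# Hypothetical conserved hits, keyed by an invented identifier.
hits = {
    "hit1": "ACGTACGTAC",    # no masked bases
    "hit2": "acgtacgtAC",    # 8 of 10 bases soft-masked
    "hit3": "ACGTNNNNNNNN",  # 8 of 12 bases hard-masked
}
unique, repeat = split_by_masking(hits)
print(sorted(unique), sorted(repeat))
```

The mostly-unique group is the one worth clustering and searching against a database, while the mostly-repeat group can be labelled directly from the repeat annotation.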