3 months ago
analyst
Dear all,
I have run standalone blastx using default options for my list of transcripts and got output like this:
MSTRG.285.1 KAG7595701.1 68.976 332 9 3 194 1189 26 263 5.59e-112 342
MSTRG.285.1 KAG7595701.1 90.698 43 3 1 30 158 1 42 7.53e-13 81.3
MSTRG.285.1 CAD5311629.1 68.675 332 10 3 194 1189 26 263 5.14e-111 339
MSTRG.285.1 CAD5311629.1 90.698 43 3 1 30 158 1 42 6.85e-13 81.3
MSTRG.285.1 XP_020868191.1 67.683 328 54 3 206 1189 18 293 3.48e-107 331
MSTRG.285.1 KAG7591185.1 68.339 319 49 3 230 1186 39 305 3.10e-104 324
MSTRG.285.1 CAH8251728.1 67.812 320 51 3 230 1189 39 306 3.22e-103 321
MSTRG.285.1 KAG7653704.1 68.125 320 50 3 230 1189 39 306 3.33e-103 321
MSTRG.285.1 EFH68782.1 67.812 320 51 3 230 1189 38 305 3.44e-103 321
MSTRG.285.1 OAP14613.1 86.387 191 0 1 617 1189 92 256 5.16e-70 234
Please advise which filters I should apply to identify novel transcripts.
Thanks
Define "novel". If your queries are showing good "hits" (like some above) to something in the database then they are not "novel" by definition of the word.
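To make that concrete, here is a minimal sketch (not a definitive method) that lists the queries with no hit at or below an e-value cutoff. It assumes the default 12-column tabular output (`-outfmt 6`), where column 11 is the e-value. The function name, the `1e-5` cutoff, and the ID `MSTRG.999.1` are hypothetical and only for illustration. Queries with no hit at all never appear in the BLAST table, so you need to supply the full list of query IDs yourself, for example from your FASTA headers.

```python
def queries_without_good_hit(blast_lines, all_query_ids, max_evalue=1e-5):
    """Return sorted query IDs that have no hit with e-value <= max_evalue.

    blast_lines: iterable of lines in 12-column BLAST tabular format.
    all_query_ids: every query ID that was searched (assumed to be known).
    max_evalue: illustrative cutoff, not a recommended value.
    """
    with_good_hit = set()
    for line in blast_lines:
        fields = line.split()
        # Skip blank lines, comment lines, and anything that is not a full row.
        if len(fields) < 12 or line.startswith("#"):
            continue
        qseqid, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            with_good_hit.add(qseqid)
    return sorted(set(all_query_ids) - with_good_hit)


# One row from the output in the question, plus a hypothetical query ID
# that produced no hits.
rows = [
    "MSTRG.285.1\tKAG7595701.1\t68.976\t332\t9\t3\t194\t1189\t26\t263\t5.59e-112\t342",
]
candidates = queries_without_good_hit(rows, ["MSTRG.285.1", "MSTRG.999.1"])
print(candidates)
```

Anything returned here is only a candidate: it has no good protein hit in this particular database, which is not the same as being novel.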
Good hits meaning the ones with the lowest e-value, right?
Should I also take query coverage into account? If so, please advise how I can calculate query coverage from the output file above.
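On the query coverage question: the default tabular output does not include the query length, so coverage cannot be computed from that file alone. The simplest route is probably to rerun with extra columns, e.g. `-outfmt "6 std qlen qcovs"`, and read the coverage directly. Otherwise, a rough sketch like the one below merges the query intervals (columns 7 and 8) of all HSPs for each query/subject pair and divides by the query length. It assumes you supply the query lengths yourself (e.g. from the transcript FASTA); the function name and the length of 1250 nt used here are made up for illustration.

```python
from collections import defaultdict


def query_coverage(blast_lines, query_lengths):
    """Percent of each query covered by HSPs, per (query, subject) pair.

    blast_lines: iterable of lines in 12-column BLAST tabular format.
    query_lengths: dict mapping query ID to length in nucleotides
                   (must be supplied separately; assumed to be known).
    """
    intervals = defaultdict(list)
    for line in blast_lines:
        fields = line.split()
        if len(fields) < 12 or line.startswith("#"):
            continue
        qseqid, sseqid = fields[0], fields[1]
        qstart, qend = int(fields[6]), int(fields[7])
        # For blastx, qstart > qend on reverse-frame hits, so normalise.
        intervals[(qseqid, sseqid)].append((min(qstart, qend), max(qstart, qend)))

    coverage = {}
    for (qseqid, sseqid), spans in intervals.items():
        spans.sort()
        covered = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:
                # Overlapping or adjacent HSPs: extend the current block.
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
        coverage[(qseqid, sseqid)] = 100.0 * covered / query_lengths[qseqid]
    return coverage


# The two HSPs for KAG7595701.1 from the output above.
rows = [
    "MSTRG.285.1\tKAG7595701.1\t68.976\t332\t9\t3\t194\t1189\t26\t263\t5.59e-112\t342",
    "MSTRG.285.1\tKAG7595701.1\t90.698\t43\t3\t1\t30\t158\t1\t42\t7.53e-13\t81.3",
]
# 1250 nt is a hypothetical length; use the real transcript length.
cov = query_coverage(rows, {"MSTRG.285.1": 1250})
print(cov)
```

The two HSPs cover positions 30-158 and 194-1189 of the query, i.e. 1125 nt in total, so the percentage you get depends entirely on the real length of MSTRG.285.1.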
This is the command that I used:
Is there any source that suggests which percentage identity threshold to use when calling a transcript novel? For example, would hits below 80% or 90% identity indicate novelty?
I encourage you to have a look at OrthoFinder: https://github.com/davidemms/OrthoFinder
Not only will you know which transcripts are new, you will also get a full classification for every transcript.
Thank you biofalconch!
I am looking for novel lncRNA transcripts from Arabidopsis thaliana data.
That's a little bit harder, since lncRNAs usually tend to evolve differently. Maybe something like what's proposed in this paper would be of use?
https://www.nature.com/articles/s41598-022-18254-0
They apply various filters to make sure the transcripts don't code for proteins, rather than relying much on finding lncRNAs through comparison with other species.