6.1 years ago
gtasource
I'm using StringTie on aligned RNA-seq data for a species without a GTF file. The problem is that StringTie is predicting far too many genes (~100K). What can I do to make it more stringent and reduce the number of genes being predicted?
Thanks!
First of all, StringTie assembles transcripts, not genes. Since a gene typically has multiple transcripts, the number of genes is probably lower than 100K. Why do you think 100K is too many? For comparison, the current human Gencode v28 release includes about 200k transcripts. Please give some details on the species.
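To check how many genes versus transcripts are actually in the assembly, something like the following might help. It is only a rough sketch: it assumes the StringTie output is a standard GTF whose `transcript` rows carry `gene_id` and `transcript_id` attributes, and the function name and the toy records are made up for illustration.

```python
import re
from collections import defaultdict

def count_genes_and_transcripts(gtf_lines):
    """Return (number of gene_ids, number of transcript_ids) seen on 'transcript' rows."""
    transcripts_by_gene = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):          # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue                      # ignore exon rows and malformed lines
        # Column 9 holds key "value"; pairs
        attrs = dict(re.findall(r'(\w+) "([^"]*)"', fields[8]))
        if "gene_id" in attrs and "transcript_id" in attrs:
            transcripts_by_gene[attrs["gene_id"]].add(attrs["transcript_id"])
    n_transcripts = sum(len(t) for t in transcripts_by_gene.values())
    return len(transcripts_by_gene), n_transcripts

# Toy records (invented), in the layout assumed above: two genes, three transcripts.
EXAMPLE = [
    "# toy header line\n",
    'chr1\tStringTie\ttranscript\t100\t900\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1";\n',
    'chr1\tStringTie\texon\t100\t300\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1";\n',
    'chr1\tStringTie\ttranscript\t100\t950\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.2";\n',
    'chr2\tStringTie\ttranscript\t50\t400\t1000\t-\t.\tgene_id "STRG.2"; transcript_id "STRG.2.1";\n',
]

print(count_genes_and_transcripts(EXAMPLE))
```

For a real assembly, open the GTF and pass the file handle instead of the toy list, e.g. `with open("stringtie_out.gtf") as fh:` (the filename here is a placeholder). If the gene count comes out much lower than the transcript count, the ~100K figure is transcripts rather than genes.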