I have mapped my data using STAR and am now trying to generate an assembly using StringTie. The reference annotation GTF file contains 38464 genes and 47387 transcripts. I tried to assemble just the known genes (38464) and transcripts (47387) using the following command:
stringtie -p 8 -e -B -G Genome_annotation/data.gtf -o Path_to_Assembly/GFP2/GFP2.gtf Path_to_Mapped_files/GFP2/GFP2Aligned.sortedByCoord.out.bam
StringTie assigns gene IDs to only around 23139 genes, instead of the 38464 genes and 47387 transcripts in the annotation. Next, I checked the gene type of the roughly 15000 gene IDs that StringTie missed, and the gene type of most of them is PSEUDO. Is it possible to quantify the expression of all 38464 genes?
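For anyone who wants to reproduce the comparison, here is a minimal sketch of how the missing gene IDs and their gene types can be listed. It assumes that both files are standard tab-separated GTF, that the StringTie output carries the reference IDs in the `gene_id` attribute, and that the gene type is stored under an attribute called `gene_biotype`. That attribute name is an assumption and may differ in your annotation; the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute key -> value."""
    return dict(ATTR_RE.findall(field))


def gene_types(gtf_lines, id_key="gene_id", type_key="gene_biotype"):
    """Map each gene ID to its gene type ("unknown" if type_key is absent).

    type_key="gene_biotype" is an assumption about the annotation;
    change it to whatever attribute your GTF uses for the gene type.
    """
    genes = {}
    for line in gtf_lines:
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get(id_key)
        if gene_id is None:
            continue
        # Keep the first informative type seen for this gene.
        if genes.get(gene_id, "unknown") == "unknown":
            genes[gene_id] = attrs.get(type_key, "unknown")
    return genes


def missing_by_type(ref_lines, out_lines, type_key="gene_biotype"):
    """Return reference genes absent from the output, plus counts per type."""
    ref = gene_types(ref_lines, type_key=type_key)
    found = set(gene_types(out_lines, type_key=type_key))
    missing = {g: t for g, t in ref.items() if g not in found}
    return missing, Counter(missing.values())
```

To use it, open `Genome_annotation/data.gtf` and `Path_to_Assembly/GFP2/GFP2.gtf`, pass the two file objects to `missing_by_type`, and print the returned counter to see how the missing genes are distributed across gene types.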
Any help will be highly appreciated.
I agree, and in that case their expression should be zero, but those gene IDs should still be present in the StringTie assembly.