StringTie skips some reference genes
2.3 years ago

I have mapped my data using STAR and am now trying to generate an assembly using StringTie. The reference annotation GTF file contains 38464 genes and 47387 transcripts. I tried to assemble just the known genes (38464) and transcripts (47387) using the following command:

stringtie -p 8 -e -B -G Genome_annotation/data.gtf -o Path_to_Assembly/GFP2/GFP2.gtf Path_to_Mapped_files/GFP2/GFP2Aligned.sortedByCoord.out.bam

StringTie assigns gene IDs to only around 23139 genes, instead of the 38464 genes and 47387 transcripts in the annotation. I then checked the gene type of the roughly 15000 genes that StringTie missed, and most of them have the gene type PSEUDO. Is it possible to quantify the expression of all 38464 genes?

Any help will be highly appreciated.

2.3 years ago

With RNA-seq data, I would not expect the complete set of genes to be expressed.


I agree, and in that case their expression should be zero, but those gene IDs should still be present in the StringTie assembly.
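
For anyone who wants to reproduce the comparison described in the question, here is a minimal Python sketch that lists the reference gene IDs absent from the StringTie output and tallies them by gene type. It rests on several assumptions that are not confirmed in this thread: that both files are standard tab-separated GTF, that the `gene_id` attribute in the output matches the reference, and that the gene type is stored under an attribute named `gene_biotype` (change `type_key` if your annotation uses another name). The function names are made up for illustration.

```python
import re
from collections import Counter

# Matches GTF attribute pairs of the form: key "value";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_ids(gtf_lines, id_key="gene_id"):
    """Map each gene ID to the attributes seen on its GTF records."""
    genes = {}
    for line in gtf_lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF record
        attrs = dict(_ATTR_RE.findall(cols[8]))
        gid = attrs.get(id_key)
        if gid is not None:
            # Merge attributes from every record that belongs to this gene.
            genes.setdefault(gid, {}).update(attrs)
    return genes


def missing_genes(reference_lines, assembled_lines, type_key="gene_biotype"):
    """Return reference genes absent from the assembly, plus counts per type.

    type_key is an assumed attribute name; genes without it are counted
    under "unknown".
    """
    ref = gene_ids(reference_lines)
    asm = gene_ids(assembled_lines)
    missing = {g: a for g, a in ref.items() if g not in asm}
    by_type = Counter(a.get(type_key, "unknown") for a in missing.values())
    return missing, by_type
```

Both arguments can be any iterable of lines, so open file handles work directly, for example `missing_genes(open("Genome_annotation/data.gtf"), open("Path_to_Assembly/GFP2/GFP2.gtf"))`. The second return value shows at a glance whether the skipped genes are concentrated in one gene type.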
