6.9 years ago
naveenkumarv40
Hello everyone, I am new to RNA-seq data analysis. I am following the protocol published in Nature Protocols (https://www.nature.com/articles/nprot.2016.095). I completed the steps given in the protocol, but I could not find the novel genes and novel transcripts for my samples. I ran the following command:
gffcompare -r chrX_data/genes/chrX.gtf -G -o merged stringtie_merged.gtf
and I got the following result file:
#= Summary for dataset: stringtie_merged.gtf
# Query mRNAs : 253181 in 70635 loci (216728 multi-exon transcripts)
# (23264 multi-transcript loci, ~3.6 transcripts per locus)
# Reference mRNAs : 216257 in 60158 loci (189357 multi-exon)
# Super-loci w/ reference transcripts: 51791
#-----------------| Sensitivity | Precision |
Base level: 100.0 | 93.0 |
Exon level: 99.9 | 93.6 |
Intron level: 99.4 | 94.0 |
Intron chain level: 99.7 | 87.1 |
Transcript level: 99.8 | 85.2 |
Locus level: 100.0 | 84.4 |
Matching intron chains: 188838
Matching transcripts: 215729
Matching loci: 60158
Missed exons: 0/623537 ( 0.0%)
Novel exons: 23741/674273 ( 3.5%)
Missed introns: 2160/383827 ( 0.6%)
Novel introns: 4747/405880 ( 1.2%)
Missed loci: 0/60158 ( 0.0%)
Novel loci: 10239/70635 ( 14.5%)
Total union super-loci across all input datasets: 70632
253181 out of 253181 consensus transcripts written in merged.annotated.gtf (0 discarded as redundant)
From this .stats file, where can I find the novel genes and transcripts?
Six files are generated after this step: GTF, LOCI, STATS, REFMAP, TMAP and TRACKING.
What you have posted is the content of the STATS file, which only gives summary counts. Go through the TMAP file carefully: it has one column of class codes, which describe how each assembled transcript compares to the reference annotation.
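As a rough illustration of reading the class codes, here is a minimal Python sketch. It assumes the TMAP file is tab-separated with a header row containing `class_code` and `qry_id` columns. It also assumes that codes `u` (intergenic, no overlap with the reference) and `j` (multi-exon, sharing at least one splice junction with a reference transcript) are the ones you want to treat as novel. The function name `novel_transcripts` and the sample rows are made up for illustration. Please check the gffcompare documentation for the full list of codes before relying on this.

```python
import csv
from collections import defaultdict

# Class codes treated as "novel" here (an assumption; adjust as needed):
#   u = intergenic, no overlap with any reference transcript
#   j = multi-exon, shares at least one splice junction with a reference transcript
NOVEL_CODES = {"u", "j"}


def novel_transcripts(tmap_lines, codes=NOVEL_CODES):
    """Group query transcript IDs by class code, keeping only `codes`.

    `tmap_lines` is any iterable of tab-separated lines whose first line is
    a header containing at least the columns `class_code` and `qry_id`.
    """
    reader = csv.DictReader(tmap_lines, delimiter="\t")
    hits = defaultdict(list)
    for row in reader:
        if row["class_code"] in codes:
            hits[row["class_code"]].append(row["qry_id"])
    return dict(hits)


# Made-up rows for illustration only (not taken from the output above).
sample = [
    "ref_gene_id\tref_id\tclass_code\tqry_gene_id\tqry_id\tnum_exons",
    "GENE1\tTX1\t=\tMSTRG.1\tMSTRG.1.1\t4",
    "GENE1\tTX1\tj\tMSTRG.1\tMSTRG.1.2\t5",
    "-\t-\tu\tMSTRG.2\tMSTRG.2.1\t2",
]
print(novel_transcripts(sample))
```

To run this on real data, pass an open file handle instead of `sample`. With `-o merged` and the input above, the file is probably named `merged.stringtie_merged.gtf.tmap`, but confirm the exact name in your output directory.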