Arabidopsis thaliana RNA-Seq analysis: Is 68% transcript annotation acceptable/expected with the Ensembl reference and the new Tuxedo pipeline?
5.0 years ago
arctic ▴ 40

Dear all, I am new to the field. I have recently been using the new Tuxedo pipeline (HISAT2 aligner and StringTie assembler, with "de novo" assembly) on RNA-Seq data from Arabidopsis thaliana (more details below). In my hands, the pipeline identified ~26K transcripts, of which ~15K were assigned a gene symbol from the reference GTF. Is this ratio (68% of transcripts being assigned gene symbols) within the expected range? If you have experience with Arabidopsis RNA-Seq data, your input would be appreciated.

Thank you in advance for your reply.

More details on the data (if needed):
- Samples: 18
- RNA prep: SMART-Seq® v4 Ultra® Low Input RNA Kit for Sequencing (Clontech)
- Library prep: Nextera® DNA Library Prep (Illumina)
- Sequencing: NextSeq 500
- Cycles: 75 (paired-end)
- Ensembl references used: Arabidopsis_thaliana.TAIR10.dna.toplevel.fa, Arabidopsis_thaliana.TAIR10.45.gtf
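For anyone wanting to reproduce the ratio asked about here, the following is a minimal sketch of one way to count it from an assembled GTF. It assumes that transcripts matched to the reference carry a `ref_gene_name` attribute on their `transcript` lines; that key is an assumption about the StringTie output and may differ (for example `gene_name` in a merged GTF), so it is left as a parameter. The function name `annotated_fraction` and the toy records are made up for illustration.

```python
import re
from typing import Iterable, Tuple

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def annotated_fraction(
    gtf_lines: Iterable[str], key: str = "ref_gene_name"
) -> Tuple[int, int, float]:
    """Count transcript records and how many of them carry attribute `key`.

    `key` is an assumption about how the assembler labels transcripts that
    matched the reference annotation; adjust it to your own GTF.
    Returns (annotated, total, fraction).
    """
    total = 0
    annotated = 0
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only transcript-level records are counted (not exons).
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        total += 1
        if key in attrs:
            annotated += 1
    fraction = annotated / total if total else 0.0
    return annotated, total, fraction


# Toy records (made up): one transcript with a reference gene name, one without.
example = [
    "# toy GTF",
    '1\tStringTie\ttranscript\t100\t900\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1"; ref_gene_name "NAC001"; cov "5.0";',
    '1\tStringTie\texon\t100\t900\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1";',
    '1\tStringTie\ttranscript\t2000\t2500\t1000\t-\t.\tgene_id "STRG.2"; transcript_id "STRG.2.1"; cov "2.0";',
]
print(annotated_fraction(example))
```

On a real file the same function can be fed an open file handle, e.g. `with open(path) as fh: annotated_fraction(fh)`, where `path` points to your own assembled GTF.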

new tuxedo stringtie RNA-Seq Arabidopsis Ensembl
5.0 years ago

Yes, I would say that is in line with expectations (about 70% "known" genes is roughly where we currently stand in Arabidopsis).

Great. Thank you for replying.
