Please advise me on differential expression analysis of STAR/StringTie output
7.8 years ago
seta ★ 1.9k

Hi all,

I am roughly following the “HISAT, StringTie and Ballgown” pipeline for RNA-seq analysis, but I used STAR (instead of HISAT) to map reads to the genome, followed by StringTie for genome-guided assembly. As you know, Ballgown takes FPKM values (here, from StringTie) for differential expression analysis, but DESeq and edgeR need raw counts. As far as I know, the popular programs for generating raw counts are HTSeq and RSEM; HTSeq is designed to work at the gene level (not the transcript level), and RSEM accepts alignments to the transcriptome, not the genome. Could you please let me know how I should generate raw counts from the BAM files produced by STAR for further analysis with edgeR at the transcript level?

Thanks

differential expression STAR stringtie count edgeR • 5.3k views

7.7 years ago
dunhamcg ▴ 20

StringTie ships with a script that addresses this issue, 'prepDE.py'. It is easy to overlook, so here is a direct link to the instructions: http://www.ccb.jhu.edu/software/stringtie/index.shtml?t=manual#deseq
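
For anyone wondering what the script does with the StringTie output: the linked manual describes the conversion as reads_per_transcript = coverage * transcript_len / read_len. Below is a minimal Python sketch of that formula only, not the script itself. The function name is hypothetical, the default read length of 75 mirrors what I understand the script's -l default to be, and rounding up to an integer is an assumption on my part.

```python
import math


def coverage_to_count(coverage: float, transcript_len: int, read_len: int = 75) -> int:
    """Estimate a read count from StringTie's average per-base coverage.

    Illustrative helper (hypothetical name). It applies the formula from the
    StringTie manual: coverage * transcript_len / read_len.
    Rounding up is an assumption; DESeq2 and edgeR expect integer counts.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    return int(math.ceil(coverage * transcript_len / read_len))


# A 1500 bp transcript covered 10x on average by 75 bp reads.
print(coverage_to_count(10.0, 1500, read_len=75))
```

Because the count is derived from the coverage estimate and an average read length, it is an approximation of the raw count, which is worth keeping in mind if your reads were trimmed to varying lengths.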

Is 'prepDE.py' reliable? I still haven't come across any paper that uses it.

7.8 years ago
Sej Modha 5.3k

You can use featureCounts from the Subread package to calculate raw counts from STAR alignments.

Thanks, just one thing: does featureCounts give counts per gene as well as per transcript?

You can define the feature type of interest using the -t parameter.

 -t <string>         Specify feature type in GTF annotation. `exon' by
                      default. Features used for read counting will be
                      extracted from annotation using the provided value.

For more info: http://bioinf.wehi.edu.au/featureCounts/
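
To make the option concrete, here is a small Python sketch that only assembles featureCounts command lines (it does not run them). The helper name and all file names are placeholders I made up. The flags used (-T, -t, -g, -a, -o) are standard featureCounts options. Grouping exons by transcript_id is shown as one possible way to get per-transcript counts, not as a recommendation.

```python
import shlex


def featurecounts_cmd(bam_files, gtf, out, feature_type="exon",
                      attribute="gene_id", threads=1):
    """Build a featureCounts argument list (hypothetical helper).

    -t selects the GTF feature type to count over,
    -g selects the GTF attribute used to group features into meta-features.
    """
    cmd = ["featureCounts", "-T", str(threads),
           "-t", feature_type, "-g", attribute,
           "-a", gtf, "-o", out]
    cmd.extend(bam_files)
    return cmd


# Placeholder file names; replace with your own STAR BAM files and annotation.
bams = ["sample1.bam", "sample2.bam"]

# Default behaviour: exons grouped by gene_id, i.e. counts per gene.
gene_cmd = featurecounts_cmd(bams, "annotation.gtf", "gene_counts.txt")

# Exons grouped by transcript_id, i.e. counts per transcript.
tx_cmd = featurecounts_cmd(bams, "annotation.gtf", "transcript_counts.txt",
                           attribute="transcript_id")

print(shlex.join(gene_cmd))
print(shlex.join(tx_cmd))
```

Note that by default featureCounts leaves reads that overlap more than one meta-feature unassigned, which happens often when transcripts of the same gene share exons; the -O option changes that behaviour, so check the summary file it writes before trusting transcript-level numbers.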

Hi Sej,

Thank you. Just to make sure: read counts per transcript are what is needed for differential expression analysis at the transcript level, right? According to the manual, featureCounts by default gives counts per gene (-t exon -g gene_id), so to count per transcript I would just use -t transcript -g transcript_id. Is that right?
