Hi there. I used HISAT2 for the alignment, StringTie for the assembly, and the R package Ballgown for the differential expression (DE) analysis (protocol published here: http://www.nature.com/nprot/journal/v11/n9/full/nprot.2016.095.html ). I used the following commands:
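library(ballgown)
library(genefilter)  # provides rowVars()

# pheno_data is assumed to be a data frame of sample phenotypes loaded beforehand
data_directory = system.file('extdata', package = 'ballgown')
bg_chrX = ballgown(dataDir = data_directory, samplePattern = "G", meas = 'all', pData = pheno_data)

# Keep only transcripts whose expression variance across samples exceeds 1
bg_chrX_filt = subset(bg_chrX, "rowVars(texpr(bg_chrX)) > 1", genomesubset = TRUE)
bg_chrX_filt

# Differential expression tests at the transcript level and at the gene level
results_transcripts = stattest(bg_chrX_filt, feature = "transcript", covariate = "variety",
                               getFC = TRUE, meas = "FPKM", timecourse = TRUE)
results_genes = stattest(bg_chrX_filt, feature = "gene", covariate = "variety",
                         getFC = TRUE, meas = "FPKM", timecourse = TRUE)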
When I print my gene list, it appears as follows:
> (results_genes)
  feature          id         fc         pval      qval
1    gene MSTRG.28632 0.34122015 1.947303e-05 0.1767612
2    gene  MSTRG.3615 5.15572720 2.210357e-05 0.1767612
3    gene  MSTRG.7507 0.25190672 2.219452e-05 0.1767612
4    gene MSTRG.70532 0.31864709 2.421819e-05 0.1767612
5    gene MSTRG.49954 0.30219146 2.569801e-05 0.1767612
The list continues with the same kind of MSTRG gene IDs for up to 200 rows. I need actual gene IDs, rather than the MSTRG identifiers assigned by StringTie, for my Gene Ontology step. I would be really grateful if someone could suggest a suitable command for this.
It would be really helpful. Thanks!
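
For anyone hitting the same issue, here is a minimal sketch of one possible approach, not a definitive solution. It assumes that the transcript table returned by texpr(bg_chrX, 'all') has gene_id and gene_name columns, and that StringTie was run with a reference annotation so that at least some transcripts carry a real gene name (transcripts without one are assumed to be marked "."). To keep the example runnable on its own, transcript_table and results_genes_toy are made-up stand-ins for the real Ballgown objects, and their values are purely illustrative.

```r
# Toy stand-in for the transcript table. With a real ballgown object, use:
#   transcript_table <- texpr(bg_chrX, 'all')
transcript_table <- data.frame(
  gene_id   = c("MSTRG.1", "MSTRG.1", "MSTRG.2", "MSTRG.3"),
  gene_name = c("GENEA", ".", "GENEB", "."),   # "." = no reference name (assumed)
  stringsAsFactors = FALSE
)

# Toy stand-in for the gene-level stattest() output (values are invented)
results_genes_toy <- data.frame(
  feature = "gene",
  id      = c("MSTRG.2", "MSTRG.1", "MSTRG.3"),
  fc      = c(0.5, 2.0, 1.1),
  pval    = c(0.001, 0.002, 0.300),
  qval    = c(0.010, 0.010, 0.300),
  stringsAsFactors = FALSE
)

# Keep one named transcript per MSTRG gene ID, ignoring unnamed ones
known <- transcript_table[transcript_table$gene_name != ".",
                          c("gene_id", "gene_name")]
known <- known[!duplicated(known$gene_id), ]

# Look up a gene name for every row of the results; genes with no
# named transcript (MSTRG.3 in this toy data) get NA
results_genes_toy$gene_name <-
  known$gene_name[match(results_genes_toy$id, known$gene_id)]

print(results_genes_toy)
```

With real data, the same match() step would be applied to results_genes$id. Rows that end up as NA would correspond to loci with no named transcript, which may need to be annotated separately before a Gene Ontology analysis.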