Hi there. I used HISAT2 for the alignment, StringTie for the assembly, and the R package Ballgown for the differential expression (DE) analysis (protocol published here: http://www.nature.com/nprot/journal/v11/n9/full/nprot.2016.095.html ). Now I want to do a gene ontology analysis with AgriGO, which only accepts gene IDs. I used the following commands to produce a file that also includes gene names (required for AgriGO), but many rows carry MSTRG tags and many others carry IDs starting with "TraesCS". Even if I skip the MSTRG tags, the IDs starting with "TraesCS" are still a problem, since these are not gene IDs and AgriGO requires gene IDs only. Is there any way to get gene IDs for the "TraesCS" entries so I can complete my gene ontology analysis? The commands are as follows.
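library(ballgown)
library(dplyr)  # provides arrange()

# Transcript-level test with "variety" as the covariate
results_transcripts = stattest(bg_chrX_filt, feature="transcript",
                               covariate="variety", getFC=TRUE,
                               meas="FPKM", timecourse=TRUE)
# Attach gene names and gene IDs to each transcript
results_transcripts = data.frame(geneNames=ballgown::geneNames(bg_chrX_filt),
                                 geneIDs=ballgown::geneIDs(bg_chrX_filt),
                                 results_transcripts)
# Sort by p-value (smallest first) and save
results_transcripts = arrange(results_transcripts, pval)
write.csv(results_transcripts, "transcripts_final.csv")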
The output file "transcripts_final.csv" contains rows such as:
"","geneNames","geneIDs","feature","id","fc","pval","qval"
"1",".","MSTRG.54886","transcript","137261",0.304863087619068,2.21245244347301e-11,1.01982995381888e-06
"6010",".","TraesCS3D02G499300","transcript","83445",0.500712464045665,0.0923136728887992,0.70792978190558
"31088","snoR44_J54","ENSRNA050017166","transcript","161562",0.793742805321971,0.574621010428205,0.851930483149002
The first row shows an MSTRG tag, the second shows an ID starting with TraesCS, and the third has a gene name, which is the only thing I have for gene ontology.
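To make the goal concrete, here is a minimal sketch of the filtering step I have in mind: drop the MSTRG rows and write the remaining IDs out one per line. It uses a toy copy of the three rows shown above rather than the real table, and the output file name "gene_ids_for_agrigo.txt" is just a placeholder I made up. Whether AgriGO will accept the IDs that remain is exactly what I am unsure about.

```r
# Toy copy of the three example rows (only the columns needed here).
results_transcripts <- data.frame(
  geneNames = c(".", ".", "snoR44_J54"),
  geneIDs   = c("MSTRG.54886", "TraesCS3D02G499300", "ENSRNA050017166"),
  pval      = c(2.21245244347301e-11, 0.0923136728887992, 0.574621010428205),
  stringsAsFactors = FALSE
)

# Flag rows whose gene ID is an MSTRG tag.
is_mstrg <- grepl("^MSTRG\\.", results_transcripts$geneIDs)

# Keep the remaining IDs, de-duplicated, in their original order.
reference_ids <- unique(results_transcripts$geneIDs[!is_mstrg])

# Write one ID per line (placeholder file name).
writeLines(reference_ids, "gene_ids_for_agrigo.txt")
```

On the real data the same steps would be applied to the full results_transcripts data frame instead of the toy one.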
Any help and guidance would mean a lot to me. Thank you in advance.