Hello,
I am working with RNA-seq data and trying to load the StringTie output from "prepDE.py" for all 9 of my samples into DESeq2, so that I can perform differential expression analysis across my three conditions. Here is how my data is set up:
cell line 1:
sample1 (control)
sample2 (knockdown)
sample3 (overexpression)
cell line 2:
sample4 (control)
sample5 (knockdown)
sample6 (overexpression)
cell line 3:
sample7 (control)
sample8 (knockdown)
sample9 (overexpression)
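
For reference, the layout above can be written out as a sample table with one row per sample. This is only a sketch: the column names (sample, cell_line, condition) and the labels line1 to line3 are placeholders chosen for illustration, not something StringTie or DESeq2 requires. A table of this shape is the kind of thing DESeq2 takes as its colData, and a design such as ~ cell_line + condition would be one way to account for the three cell lines.

```python
import csv
import io

# Hypothetical labels; the layout mirrors the design listed above.
CELL_LINES = ["line1", "line2", "line3"]
CONDITIONS = ["control", "knockdown", "overexpression"]


def build_sample_table():
    """Return one dict per sample with its cell line and condition."""
    rows = []
    sample_number = 1
    for cell_line in CELL_LINES:
        for condition in CONDITIONS:
            rows.append(
                {
                    "sample": f"sample{sample_number}",
                    "cell_line": cell_line,
                    "condition": condition,
                }
            )
            sample_number += 1
    return rows


def to_csv(rows):
    """Serialize the sample table as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["sample", "cell_line", "condition"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample_table = build_sample_table()
print(to_csv(sample_table))
```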
I have a "transcript_count_matrix.csv" file generated by prepDE.py and a merged_transcripts.gtf file from stringtie --merge for all 9 samples, with FPKM values and Ensembl IDs.
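
One thing that might be worth checking before handing the matrix to DESeq2 is that the sample columns of the count matrix are in the same order as the rows of the sample table, since DESeq2 expects the two to match. The sketch below assumes the CSV has an ID column first (called transcript_id here) followed by one column per sample; the toy IDs and counts are invented placeholders, not real data.

```python
import csv
import io

# Toy stand-in for transcript_count_matrix.csv; IDs and counts are invented.
TOY_MATRIX = """transcript_id,sample1,sample2,sample3
TX_A,10,0,25
TX_B,3,7,1
"""


def read_count_matrix(handle):
    """Read a count matrix CSV into (sample names, {id: [counts]})."""
    reader = csv.reader(handle)
    header = next(reader)
    samples = header[1:]  # first column is assumed to hold the feature ID
    counts = {row[0]: [int(v) for v in row[1:]] for row in reader if row}
    return samples, counts


def check_sample_order(matrix_samples, table_samples):
    """Raise if the matrix columns and sample table rows are not aligned."""
    if list(matrix_samples) != list(table_samples):
        raise ValueError(
            f"column order {list(matrix_samples)} does not match "
            f"sample table order {list(table_samples)}"
        )
    return True


matrix_samples, counts = read_count_matrix(io.StringIO(TOY_MATRIX))
order_ok = check_sample_order(matrix_samples, ["sample1", "sample2", "sample3"])
```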
I also have the output for each sample from stringtie -e -B:
sample1.gtf e2t.ctab e_data.ctab i2t.ctab i_data.ctab t_data.ctab
How can I perform differential expression analysis with DESeq2 using this StringTie output? I would like to compare all 3 control samples against all 3 knockdown samples and all 3 overexpression samples, and get the expression values in a format that I can use as a .gct input file for Gene Set Enrichment Analysis.
Cuffdiff outputs fpkm_tracking files with gene symbols and FPKM values, and I would like something similar from this pipeline.
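
To make the target format concrete, here is a minimal sketch of the .gct layout (version line, a line with the row and column counts, a header starting with NAME and Description, then one tab-separated row per gene). The function name format_gct, the gene names, and the numbers are all hypothetical placeholders; in practice the values would come from whatever expression table the pipeline produces.

```python
def format_gct(sample_names, rows):
    """Build GCT v1.2 text.

    rows is a list of (name, description, values) tuples, where values
    holds one number per entry in sample_names.
    """
    lines = ["#1.2", f"{len(rows)}\t{len(sample_names)}"]
    lines.append("\t".join(["NAME", "Description"] + list(sample_names)))
    for name, description, values in rows:
        if len(values) != len(sample_names):
            raise ValueError(f"{name}: expected {len(sample_names)} values")
        lines.append("\t".join([name, description] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"


# Invented toy values purely to show the layout.
toy_samples = ["sample1", "sample2", "sample3"]
toy_rows = [
    ("GENE_A", "na", [12.5, 3.0, 40.1]),
    ("GENE_B", "na", [0.0, 7.2, 1.9]),
]

gct_text = format_gct(toy_samples, toy_rows)
print(gct_text)
```

The Description column is filled with "na" here only because the toy data has no annotations.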
Any suggestions on how to proceed would be greatly appreciated!
Thanks so much,
Bryce