Hi,
I am currently trying to do a differential expression analysis using the tuxedo protocol (bowtie, tophat, and cufflinks) to quantify the difference in gene expression between an organism that has been treated with bacteria and one that has not. I have 10 biological replicates for each state, and every sample has a technical replicate. In total I have 40 Illumina read sets (20 bacteria supplemented and 20 bacteria free). I have mapped each of these read sets successfully using tophat and assembled each one individually using cufflinks, but I am having trouble understanding how to move forward from this point. I know that I would need to compare the difference in gene expression between biological replicates to the difference in gene expression between bacteria supplemented and bacteria free organisms, but how exactly would this be done? Should I use cuffmerge to combine all of the bacteria free assemblies and all of the bacteria supplemented assemblies and then compare them using cuffdiff? How would this handle technical replicates? Any help or guidance would be greatly appreciated.
Thanks,
Evan
You should have merged technical replicates before running cufflinks. Run cuffmerge on all transcripts.gtf files generated by cufflinks and use cuffdiff with merged.gtf.
If your organism of interest has a standard GTF file, use the htseq-count and DESeq/edgeR pipeline instead.
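To make the cuffmerge and cuffdiff step concrete, here is a rough Python sketch that only builds the two command lines and does not run anything. The directory names (free_01, suppl_01, ...), the assemblies.txt file name, and the condition labels are made up for illustration. It assumes each directory holds the tophat accepted_hits.bam and cufflinks transcripts.gtf for one biological replicate, with technical replicates already merged. Check the options against the manual for your Cufflinks version before running.

```python
import shlex

def assembly_list_text(gtf_paths):
    """cuffmerge reads a text file with one assembly GTF path per line."""
    return "\n".join(gtf_paths) + "\n"

def build_cuffmerge_cmd(assembly_list, out_dir="merged_asm", ref_gtf=None):
    """Build the cuffmerge call; ref_gtf is an optional reference annotation."""
    cmd = ["cuffmerge", "-o", out_dir]
    if ref_gtf:
        cmd += ["-g", ref_gtf]
    cmd.append(assembly_list)
    return cmd

def build_cuffdiff_cmd(merged_gtf, groups, out_dir="diff_out"):
    """Build the cuffdiff call.

    groups maps a condition label to its list of BAM files. Replicates of
    one condition are comma-separated; conditions are space-separated.
    """
    cmd = ["cuffdiff", "-o", out_dir, "-L", ",".join(groups), merged_gtf]
    for bams in groups.values():
        cmd.append(",".join(bams))
    return cmd

# Hypothetical layout: 10 biological replicates per condition.
free = [f"free_{i:02d}" for i in range(1, 11)]
suppl = [f"suppl_{i:02d}" for i in range(1, 11)]

# Content for assemblies.txt (all 20 assemblies go into ONE cuffmerge run).
list_text = assembly_list_text([f"{d}/transcripts.gtf" for d in free + suppl])

merge_cmd = build_cuffmerge_cmd("assemblies.txt")
diff_cmd = build_cuffdiff_cmd(
    "merged_asm/merged.gtf",
    {
        "bacteria_free": [f"{d}/accepted_hits.bam" for d in free],
        "bacteria_supplemented": [f"{d}/accepted_hits.bam" for d in suppl],
    },
)

# Print the commands so they can be inspected before being run by hand.
for cmd in (merge_cmd, diff_cmd):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Note that cuffmerge is run once over all 20 assemblies, not once per condition. The split into bacteria free and bacteria supplemented only happens in cuffdiff, through the two comma-separated BAM lists.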
How would you merge the tophat alignments of technical replicates?
Also, I have a gff file for my organism, but it is still being worked on. The genome has not been completely annotated yet. When I view my alignment in IGV with the gff file, there are highly expressed transcripts that do not have genes associated with them yet. Should I still use htseq-count over cufflinks?