I have Illumina transcriptome data for two normal breast cell samples and two breast cancer samples. I want to see the differential gene expression between the normal and cancer samples. The analysis I have done so far is listed below. Could you please help me with how to proceed with the Cuffdiff tool?
Normal breast cell 1 - performed QC on the raw data, removed duplicates using Picard, aligned with TopHat against human reference genome 38; the TopHat result is 3 BED files and 1 BAM file
Normal breast cell 2 - performed QC on the raw data, removed duplicates using Picard, aligned with TopHat against human reference genome 38; the TopHat result is 3 BED files and 1 BAM file
Breast cancer 1 - performed QC on the raw data, removed duplicates using Picard, aligned with TopHat against human reference genome 38; the TopHat result is 3 BED files and 1 BAM file
Breast cancer 2 - performed QC on the raw data, removed duplicates using Picard, aligned with TopHat against human reference genome 38; the TopHat result is 3 BED files and 1 BAM file
Could you please tell me which files I should use for the Cufflinks analysis, and then how to proceed with Cuffdiff?
If you used TopHat for mapping and alignment, then from my personal experience I recommend using Cufflinks rather than DESeq or edgeR for differential expression. If you are going to use Cuffdiff, the inputs are the
accepted_hits.bam
file from each TopHat run and your transcripts.gtf, which is the product of your Cufflinks assembly. You will also need to combine the per-sample transcripts.gtf files into a single merged.gtf; the Cufflinks package includes a tool for this too (Cuffmerge), and that merged file is what you pass to Cuffdiff.
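To make the order of the steps concrete, here is a rough sketch in Python that only builds and prints the three kinds of commands (cufflinks per sample, then cuffmerge, then cuffdiff); it does not run anything. The directory names, the thread count, the output folder names and the assemblies.txt file name are all my assumptions, so replace them with your own paths. You may also want to give cuffmerge a reference annotation and genome (its -g and -s options), which I left out here.

```python
import shlex

# Hypothetical TopHat output directories, one per sample (assumed names).
# Each is expected to contain an accepted_hits.bam file.
SAMPLES = {
    "normal": ["normal1_tophat", "normal2_tophat"],
    "cancer": ["cancer1_tophat", "cancer2_tophat"],
}


def assembly_list(samples):
    """Paths of the per-sample Cufflinks assemblies, one per line in assemblies.txt."""
    return [f"{d}_clout/transcripts.gtf" for dirs in samples.values() for d in dirs]


def build_commands(samples, threads=4, assemblies="assemblies.txt",
                   merged_dir="merged_asm", diff_dir="diff_out"):
    """Return the command lines for cufflinks, cuffmerge and cuffdiff, in run order."""
    commands = []

    # Step 1: assemble transcripts for every sample from its accepted_hits.bam.
    for dirs in samples.values():
        for d in dirs:
            commands.append(["cufflinks", "-p", str(threads),
                             "-o", f"{d}_clout", f"{d}/accepted_hits.bam"])

    # Step 2: merge all transcripts.gtf files (listed in assemblies.txt) into merged.gtf.
    commands.append(["cuffmerge", "-p", str(threads), "-o", merged_dir, assemblies])

    # Step 3: cuffdiff takes the merged GTF, then one comma-separated group of
    # replicate BAM files per condition, in the same order as the -L labels.
    cuffdiff = ["cuffdiff", "-p", str(threads), "-o", diff_dir,
                "-L", ",".join(samples), f"{merged_dir}/merged.gtf"]
    for dirs in samples.values():
        cuffdiff.append(",".join(f"{d}/accepted_hits.bam" for d in dirs))
    commands.append(cuffdiff)
    return commands


commands = build_commands(SAMPLES)

print("# lines to put in assemblies.txt:")
for path in assembly_list(SAMPLES):
    print(path)

print("# commands, in order:")
for cmd in commands:
    print(shlex.join(cmd))
```

The point to notice is that the replicates of one condition go into Cuffdiff as a single comma-separated argument, with the conditions separated by a space, so the two normal BAM files form one group and the two cancer BAM files form the other.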