Dear All,
I have a question about the analysis of RNA-seq data. The only file that I have for my analysis is a GTF file. I want to identify differentially expressed and regulated genes. With Galaxy I can run the Cuffcompare tool, which produces 4 output files, named:
- combined_transcripts
- refmap (is empty)
- tmap
- transcript accuracy
My question is: how can I identify differentially expressed genes based on these output files and the GTF file?
Here is an overview of the GTF file:
Thank you! Lisanne
Hi DK, To run cuffdiff I need some BAM or SAM files, right?
Yes, you would need the SAM/BAM mapping files. There really isn't any other way, unless you parse the FPKM expression values out of the GTF file and use some other software to do the differential expression. However, FPKM values are not ideal for that.
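If you do want to try the parsing route, here is a minimal Python sketch of pulling FPKM values out of a GTF. It assumes a Cufflinks-style file in which `transcript` rows carry an `FPKM "..."` attribute in the ninth column; the function names and the two example lines are made up for illustration, so check them against your own file first.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTRIBUTE_RE = re.compile(r'(\w+) "([^"]*)"')

def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTRIBUTE_RE.findall(field))

def transcript_fpkm(gtf_lines):
    """Collect FPKM per transcript_id from 'transcript' rows (assumed layout)."""
    result = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue  # only transcript-level rows are used here
        attrs = parse_attributes(cols[8])
        if "FPKM" in attrs and "transcript_id" in attrs:
            result[attrs["transcript_id"]] = float(attrs["FPKM"])
    return result

# Hypothetical example lines, not taken from the poster's file.
example = [
    'chr1\tCufflinks\ttranscript\t100\t900\t1000\t+\t.\t'
    'gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "12.5"; frac "1.0";',
    'chr1\tCufflinks\texon\t100\t400\t1000\t+\t.\t'
    'gene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1"; FPKM "12.5";',
]
fpkm = transcript_fpkm(example)
print(fpkm)
```

For a real file you would pass an open file handle instead of the example list. This only extracts the values; it does not make FPKM any more suitable for differential expression testing.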
Hi DK, I have *.sam files which I generated with bowtie. I then sorted them with the same to get *.sorted.sam. After that I ran the command cufflinks *.sorted.bam, which takes quite a long time for a ~1 GB input file. I expect to get 3 output files, one of which should be a *.gtf. Now my question is: which command should I run to quantify differentially expressed genes? The cufflinks manual page says something like this:
cuffdiff [options]* <transcripts.gtf> <sample1_replicate1.sam[,...,sample1_replicateM]> <sample2_replicate1.sam[,...,sample2_replicateM.sam]>...
So am I supposed to give all the *.sam files to this command in one go? (I have 3 control SAM files and 3 test SAM files.) And for the remaining *.sorted.bam files, do I have to repeat the same process? Thanks in advance.
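Reading the usage line quoted above literally, each condition is one argument, and the replicates of that condition are joined with commas inside it. A small Python sketch of how such a command line could be assembled for 3 control and 3 test files follows. The file names and the helper function are hypothetical, no extra options are added, and the command is only printed, not run.

```python
def build_cuffdiff_command(gtf, conditions):
    """Build a cuffdiff argument list: one comma-joined argument per condition."""
    cmd = ["cuffdiff", gtf]
    for replicates in conditions:
        cmd.append(",".join(replicates))  # replicates of one condition
    return cmd

# Hypothetical file names; substitute your own GTF and alignment files.
control = ["control1.sam", "control2.sam", "control3.sam"]
test = ["test1.sam", "test2.sam", "test3.sam"]

cmd = build_cuffdiff_command("transcripts.gtf", [control, test])
print(" ".join(cmd))
```

Under that reading, all six files go into a single cuffdiff call, grouped by condition, rather than into one call per file.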