What is the difference between merged.gtf (from cuffmerge) and combined.gtf (from cuffcompare)? Does the output of cuffdiff differ depending on which of the two files is used?
In cuffmerge, your GTF annotations are actually converted to SAM and then assembled together with cufflinks to output a single merged GTF annotation. Cuffmerge makes no assumptions about whether transcripts from separate assemblies are actually the same transcript or not.
In cuffcompare, the software tries to guess whether transcripts from different annotation files are the same transcript, so that an expression comparison can be made. It does so by looking at the coordinates and the intron order (not much more detail is given on how it does it). It then reports what it thinks is the same transcript among all the compared files as a single transcript in combined.gtf. So combined.gtf should only contain transcripts that cuffcompare guessed to be present in all the input files.
I'm pretty sure that the cufflinks developers intended to mirror the SQL terms JOIN and UNION when building cuffmerge and cuffcompare, respectively. From the cufflinks manual, regarding merging: "cuffmerge produces a GTF file that contains an assembly that merges together the input assemblies", and regarding comparing: "Cuffcompare reports a GTF file containing the "union" of all transfrags in each sample. If a transfrag is present in both samples, it is thus reported once in the combined gtf". So cuffmerge should be used just to JOIN the entries of the cufflinks GTF output files without filtering them, and cuffcompare should be used to get the UNION of those entries by removing duplicates.
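To make the "union, reported once" idea concrete, here is a toy Python sketch. It is not cuffcompare's actual algorithm, which the manual does not spell out in detail. It assumes two transfrags count as the same transcript when they share chromosome, strand and an identical intron chain, and it ignores differences at the outer exon ends. The function names (intron_chain, transcript_key, union_of_transfrags) and the coordinates are made up for illustration.

```python
def intron_chain(exons):
    """Return the introns implied by a list of (start, end) exons (1-based, inclusive)."""
    exons = sorted(exons)
    return tuple((left_end + 1, right_start - 1)
                 for (_, left_end), (right_start, _) in zip(exons, exons[1:]))


def transcript_key(chrom, strand, exons):
    """Build the identity used to decide that two transfrags are 'the same'.

    Assumption for this sketch: multi-exon transfrags are matched on their
    intron chain; single-exon transfrags have no introns, so they are matched
    on exact exon coordinates instead.
    """
    chain = intron_chain(exons)
    if not chain:
        return (chrom, strand, "single_exon", tuple(sorted(exons)))
    return (chrom, strand, "intron_chain", chain)


def union_of_transfrags(samples):
    """Collapse transfrags from all samples; each distinct key is kept once.

    samples maps a sample name to a list of (chrom, strand, exons) tuples.
    The result maps each key to the list of samples it was seen in.
    """
    combined = {}
    for sample, transfrags in samples.items():
        for chrom, strand, exons in transfrags:
            key = transcript_key(chrom, strand, exons)
            combined.setdefault(key, []).append(sample)
    return combined


# Hypothetical input: sample2 has the same introns as the first sample1
# transfrag, but slightly different transcript ends.
samples = {
    "sample1": [("chr1", "+", [(100, 200), (300, 400)]),
                ("chr1", "+", [(1000, 1100), (1200, 1300)])],
    "sample2": [("chr1", "+", [(90, 200), (300, 420)])],
}

combined = union_of_transfrags(samples)
for key, seen_in in combined.items():
    print(key[3], "seen in", seen_in)
```

Three transfrags go in and two entries come out: the one shared by both samples is reported once, and the one unique to sample1 is still kept. That is the "union" behaviour the manual describes, as opposed to keeping only what is present in every input.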
What have you found out so far?