Large Data Set Analysis With Cuffdiff
10.9 years ago
newDNASeqer ▴ 790

Hi,

I have 28 samples to analyze with cuffdiff. Since cuffdiff performs permutations for all the samples, I am afraid that 28 samples will take a long time to finish. I was wondering whether I can split the 28 samples into smaller data sets, run cuffdiff on each one, and then merge the results. I am not sure how to do this. Do I need to include a common sample in each subset of the cuffdiff analysis?

cuffdiff rnaseq • 2.7k views
10.9 years ago
Johan ▴ 890

Does each sample represent a unique condition? If the samples are really representatives of a few different groups, you should use the --labels flag and compare the conditions rather than comparing every sample against every other sample.

Your command line should end up looking something like this:

cuffdiff --labels CondA,CondB [all other options] <transcripts.gtf> sample1_from_condA.bam,sample2_from_condA.bam sample3_from_condB.bam,sample4_from_condB.bam

This should reduce the number of comparisons that have to be performed, which should also reduce the run time. Because the number of comparisons grows quadratically with the number of conditions being compared, it pays to design your study with this in mind.
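
To make the quadratic growth concrete, here is a minimal bash sketch. It assumes cuffdiff tests every pair of conditions, as described above. The helper names, the grouping of 28 samples into 4 conditions, and the file names are all hypothetical and only for illustration. The second helper joins replicate BAM files with commas and no spaces, which is how replicates of one condition are passed on the command line.

```shell
#!/usr/bin/env bash

# Number of pairwise comparisons among n conditions: n * (n - 1) / 2
pairwise_comparisons() {
  local n=$1
  echo $(( n * (n - 1) / 2 ))
}

# Join the replicate files of one condition with commas (no spaces)
join_replicates() {
  local IFS=,
  echo "$*"
}

# 28 samples, each treated as its own condition
pairwise_comparisons 28   # prints 378

# Hypothetical design: the same 28 samples grouped into 4 conditions
pairwise_comparisons 4    # prints 6

# Build (but do not run) a command line; file names are placeholders
cond_a=$(join_replicates sample1_from_condA.bam sample2_from_condA.bam)
cond_b=$(join_replicates sample3_from_condB.bam sample4_from_condB.bam)
echo cuffdiff --labels CondA,CondB transcripts.gtf "$cond_a" "$cond_b"
```

Under these assumptions, grouping replicates into conditions takes the run from 378 pairwise tests down to 6, which is where most of the time saving would come from.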


Hi Johan,

I am doing a similar kind of analysis. Does the transcripts.gtf file consist of all the transcript GTFs from both conditions (which can be produced with cuffmerge)?

Or can I use my annotation.gtf file there instead and compare the samples from the two conditions?
