Dear all Biostars users,
I updated the question
we try to understand the way how Cuffdiff calculate the p-values from Jensen–Shannon divergence between isoforms. Therefore I attached the 'critical' part of the cuffdiff code.
As far as I understand the the code, they sample FPKM values from all FPKM values of the first condition ('PREV')
For example, we have 2 isoforms with 3 replicates each
Prev1 Prev2 Prev3 Curr1 Curr2 Curr3
Iso1 p11 p12 p13 c11 c12 c13
Iso2 p21 p22 p23 c21 c22 c23
and if I got the comment
// Draw k values, from prev's fpkm_samples to make the first half of the null, where k is the number of replicates in curr, and compute their average
correct. Then they sample k=3 times one of the columns Prev1, Prev2 or Prev3 . Is this correct ?
Further, I'm not sure that they are doing in this row :
curr_set_sample += prev_abundance.member_fpkm_samples()[next_sample_idx] / (double)curr_abundance.rg_props().size();
Afterwards, they just calculate the mean of sampled data. And out of mean they calculate the relative amount of the isoform at the total gene expression.
Thanks a lot for your help.
Github Code of JSD method of Cuffdiff [ https://github.com/cole-trapnell-lab/cufflinks/blob/753c109e31818dcf7aa8a2c8ecdac4fa43d2ab9b/src/differential.cpp#L558 ] Line 619 to 638
vector<double> null_js_samples;
static const size_t num_null_js_samples = 10000;
boost::random::mt19937 rng(random_seed);
boost::random::uniform_int_distribution<> prev_sampler(0, prev_abundance.member_fpkm_samples().size()-1);
boost::random::uniform_int_distribution<> curr_sampler(0, curr_abundance.member_fpkm_samples().size()-1);
for (size_t i = 0; i < num_null_js_samples; ++i){
// Draw k values, from prev's fpkm_samples to make the first half of the null, where k is the number of replicates in *curr*, and compute their average
Eigen::VectorXd curr_set_sample = Eigen::VectorXd::Zero(curr_abundance.abundances().size());
for (size_t k = 0; k < curr_abundance.rg_props().size(); ++k){
int next_sample_idx = prev_sampler(rng);
if (next_sample_idx >= 0 && next_sample_idx < prev_abundance.member_fpkm_samples().size())
curr_set_sample += prev_abundance.member_fpkm_samples()[next_sample_idx] / (double)curr_abundance.rg_props().size();
}
double total_curr_set_sample_fpkm = curr_set_sample.sum();
if (total_curr_set_sample_fpkm > 0.0)
curr_set_sample /= total_curr_set_sample_fpkm;