Question

LogFC calculation in multiple comparisons

1


6.0 years ago

elb ▴ 260

Hi all, suppose we are in the following situation:

    SampleA1  SampleA2  SampleA3  Ctrl1  Ctrl2  SampleB1  SampleB2  SampleB3
    234       1         32        5      2      0         21        12344
    2434      134       0         2      0      0         0         0
    1         0         0         1      1      1234      456       345
    ...

Rows are genes and columns are samples. The values are counts from an RNA-seq experiment.

Suppose you want to run a differential gene expression analysis comparing the Ctrl* condition against each Sample* condition. First you filter the raw count matrix, keeping genes with (cpm>1) > n (where n is a number of samples you choose), for example with edgeR. After filtering you have the data matrix shown above. Then you set up the design, run glmQLFTest, and obtain the logFC values.

Now my point is: suppose your boss does not want you to apply a more stringent (cpm>1) > n filter. How can you avoid high logFC values for genes that are poorly expressed, as in row 3 of the table for SampleA* vs Ctrl*? Their logFC will be "comparable" in magnitude to the logFC of genes that are highly expressed versus 0 (row 1, for example).

Moreover, suppose that gene is highly expressed in SampleB*, so you cannot remove it without losing that information when you compare SampleB* vs Ctrl*. The logFC for SampleA* vs Ctrl* will be as high as the logFC for SampleB* vs Ctrl*, but the two refer to genes expressed at very different magnitudes. How should I deal with this situation? I thought of treating the comparisons independently, i.e. using a different set of genes for SampleA* vs Ctrl* than for SampleB* vs Ctrl*, but I'm not sure that is correct.
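To make the per-comparison filtering idea concrete, here is a minimal Python sketch using the three rows of the table. It is only an illustration of the idea, not edgeR code, and it does not settle whether filtering each comparison separately is statistically appropriate. Two things are assumptions and not part of the original data: the library sizes (set to a hypothetical 1e6 reads per sample, so CPM equals the raw count) and the rule "CPM > 1 in at least `min_samples` samples".

```python
# Toy counts copied from the table (rows = genes, columns = samples).
SAMPLES = ["SampleA1", "SampleA2", "SampleA3", "Ctrl1", "Ctrl2",
           "SampleB1", "SampleB2", "SampleB3"]
COUNTS = [
    [234, 1, 32, 5, 2, 0, 21, 12344],
    [2434, 134, 0, 2, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1234, 456, 345],
]

# ASSUMPTION: hypothetical library sizes of 1e6 reads per sample,
# chosen only so that CPM equals the raw count in this toy example.
LIB_SIZES = [1_000_000] * len(SAMPLES)


def cpm(count, lib_size):
    """Counts per million for a single gene in a single sample."""
    return count * 1e6 / lib_size


def keep_genes(counts, lib_sizes, columns, min_samples, cpm_cutoff=1.0):
    """Return indices of genes with CPM > cpm_cutoff in at least
    min_samples of the given columns (hypothetical filtering rule)."""
    kept = []
    for gene_index, row in enumerate(counts):
        n_expressed = sum(
            cpm(row[j], lib_sizes[j]) > cpm_cutoff for j in columns
        )
        if n_expressed >= min_samples:
            kept.append(gene_index)
    return kept


def columns_for(prefixes):
    """Column indices of samples whose name starts with any prefix."""
    return [j for j, name in enumerate(SAMPLES)
            if name.startswith(tuple(prefixes))]


all_columns = list(range(len(SAMPLES)))
kept_all = keep_genes(COUNTS, LIB_SIZES, all_columns, min_samples=2)
kept_a_vs_ctrl = keep_genes(COUNTS, LIB_SIZES,
                            columns_for(["SampleA", "Ctrl"]), min_samples=2)
kept_b_vs_ctrl = keep_genes(COUNTS, LIB_SIZES,
                            columns_for(["SampleB", "Ctrl"]), min_samples=2)

print("all samples:", kept_all)
print("SampleA vs Ctrl:", kept_a_vs_ctrl)
print("SampleB vs Ctrl:", kept_b_vs_ctrl)
```

Under these assumptions the gene in row 3 (index 2) survives a filter applied across all samples because of its SampleB* counts, but it drops out when only the SampleA* and Ctrl* columns are considered.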

Can anyone help me please?

rna-seq edgeR deseq • 1.6k views

Updated 6.0 years ago by Ram 45k • written 6.0 years ago by elb ▴ 260

score 3 · Answer 1 · 2019-04-19

3


6.0 years ago

swbarnes2 14k

Well, don't just look at the fold changes, look at the p-values too! Also, see the lfcShrink function in DESeq2.
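As a rough illustration of why shrinkage helps, here is a toy Python sketch. It is not what lfcShrink does internally (DESeq2 uses proper statistical shrinkage estimators); it only adds a prior count to each group mean before taking the log ratio. The counts, the prior-count values, and the assumption of equal library sizes are all hypothetical and chosen for illustration.

```python
import math


def log2_fold_change(treated, control, prior_count):
    """log2 ratio of group means after adding a prior count to each mean.

    ASSUMPTION: equal library sizes, so raw counts are compared directly.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + prior_count)
                     / (mean_control + prior_count))


# Hypothetical counts: a poorly expressed gene and a highly expressed gene
# whose unshrunken fold changes are of similar size.
low_treated, low_control = [4, 3, 5], [0, 0]
high_treated, high_control = [4000, 3000, 5000], [100, 150]

# A small prior count leaves both logFC values large and comparable.
low_small = log2_fold_change(low_treated, low_control, prior_count=0.125)
high_small = log2_fold_change(high_treated, high_control, prior_count=0.125)

# A larger prior count pulls the low-count gene towards 0
# while barely changing the high-count gene.
low_large = log2_fold_change(low_treated, low_control, prior_count=5)
high_large = log2_fold_change(high_treated, high_control, prior_count=5)

print(f"low-count gene:  {low_small:.2f} -> {low_large:.2f}")
print(f"high-count gene: {high_small:.2f} -> {high_large:.2f}")
```

The point of the sketch is that a fold change built on a handful of reads is fragile and collapses under even mild shrinkage, whereas one built on thousands of reads is stable. The p-values reflect the same difference in evidence.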
