Hi there,
As far as I understand, sequencing technical replicates on different flow cells/sequencing lanes introduces minimal variation.
I am concerned about my experimental design which has three biological replicates each for control and treated. The three biological replicates for 'control' were grouped together and sequenced on one lane, whilst the three 'treated' biological replicates were sequenced on the second lane.
So does this mean that I have a batch effect, and that the difference between control and treatment cannot be trusted because there may be underlying variation due to sequencing lane? Or does the sequencing lane/pooling have little effect on variability?
Many thanks
Since lane is perfectly confounded with treatment group (lane = treatment group), you cannot distinguish the lane effect from the treatment effect. There is no way to account for it, so you have to accept the data as-is. But I agree that lane effects are usually negligible, so ignoring them is probably safe. I would not expect any negative effect based on lane alone.
Thank you @rpolicastro
I was thinking along similar lines. Biological variation is usually greater than technical variation (i.e. sample prep, library prep, lanes etc.).
My plan is to move forward with PCA and hope to see separation based on the control/treatment variable.
I guess that spreading my samples across different lanes would increase the false discovery of differentially expressed genes between control and treated?
Sample and library prep can sometimes introduce a noticeable batch effect, and sometimes that effect can be greater than the one between conditions. Sequencing lane, on the other hand, tends not to have much of an effect. As ATpoint mentioned, since the conditions are perfectly confounded with lane you can't actually correct for it anyway, so the discussion is more of a hypothetical at this point.
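To make the "perfectly confounded" point concrete, here is a minimal sketch in plain Python. The 0/1 encoding of treatment and lane and the small `matrix_rank` helper are illustrative assumptions, not taken from any particular analysis package. It builds a design matrix with an intercept, a treatment column and a lane column for the layout described in the question, and shows that the matrix has rank 2 rather than 3, which is why no model can estimate the lane and treatment effects separately.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column among the unprocessed rows.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Assumed encoding of the design in the question:
# three controls on lane 1, three treated samples on lane 2.
treated = [0, 0, 0, 1, 1, 1]
lane2 = [0, 0, 0, 1, 1, 1]

# Columns: intercept, treatment, lane.
design = [[1, t, l] for t, l in zip(treated, lane2)]

confounded_rank = matrix_rank(design)
print(f"columns: {len(design[0])}, rank: {confounded_rank}")  # columns: 3, rank: 2
```

Because the treatment and lane columns are identical, dropping the lane column leaves exactly the same fitted model, so any lane effect is silently absorbed into the treatment estimate.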
Thanks @rpolicastro and @ATpoint, you have cleared things up for me
For the next experiment, try to balance samples between lanes, e.g. make a single pool of all libraries and simply sequence it over two lanes. This is equivalent to making batches and sequencing each batch separately, but it allows you to detect and correct for a lane effect in the unlikely case there is one. That basically goes for all potential technical batch effects: one should never confound the group condition with any technical variable.
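As a rough sketch of how one might sanity-check such a layout before sequencing, the snippet below cross-tabulates group by lane for a single pool run on two lanes. The sample names, group labels and lane names are hypothetical placeholders. If every group appears on every lane, the group condition is not confounded with lane.

```python
from collections import Counter
from itertools import product

# Hypothetical sample names; the prefix encodes the group.
samples = ["ctrl_1", "ctrl_2", "ctrl_3", "trt_1", "trt_2", "trt_3"]
group = {s: s.split("_")[0] for s in samples}
lanes = ["lane1", "lane2"]
groups = ["ctrl", "trt"]

# A single pool of all libraries sequenced on both lanes:
# every sample contributes reads to every lane.
runs = list(product(samples, lanes))

# Cross-tabulate group by lane.
counts = Counter((group[s], lane) for s, lane in runs)
for g in groups:
    print(g, [counts[(g, lane)] for lane in lanes])
# ctrl [3, 3]
# trt [3, 3]

# The design is usable for lane correction only if no group/lane cell is empty.
balanced = all(counts[(g, lane)] > 0 for g in groups for lane in lanes)
print("every group on every lane:", balanced)  # every group on every lane: True
```

Running the same cross-tabulation on the original design (controls only on lane 1, treated only on lane 2) would give two empty cells, which is the signature of a confounded layout.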