Hi everyone, I am new to RNA-seq analysis and I'm trying to understand what to do with my data.
I have 60 biological samples (30 control and 30 diseased, paired samples). Each biological sample has one or two technical replicates. I plan to concatenate the .fastq files of the technical replicates to yield one file per biological replicate. My concern is that biological replicates with only one technical replicate will have fewer reads than those with two. Is this correct? Is there any way to overcome this? Is this an acceptable method?
Alternatively, I could process the technical replicates separately and then average the read counts for biological samples that have two technical replicates, so that their totals are comparable to those of biological replicates with only one technical replicate.
With either method (concatenating or averaging), differential expression analysis will compare relative expression values.
Reading the below Biostars discussion was useful: Technical replicates in RNAseq
Cheers
Maybe concatenate the technical replicates and treat whether a sample was concatenated as a batch variable? For example, samples with technical replicates are coded 1 and samples without are coded 0.
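A minimal sketch of the 1/0 coding suggested above. The sample names and replicate counts are made up for illustration, and whether such a covariate is actually needed in the model is a separate question.

```python
# Hypothetical sample sheet: number of technical replicates per biological sample.
n_tech_reps = {"ctrl_01": 2, "ctrl_02": 1, "dis_01": 2, "dis_02": 1}

# Code each sample as 1 if its reads were concatenated from more than one run,
# and 0 if it only had a single run.
concatenated_flag = {sample: int(n > 1) for sample, n in n_tech_reps.items()}

for sample, flag in concatenated_flag.items():
    print(sample, flag)
```

The resulting column could then be added to the sample metadata and used as a covariate in the design of whichever differential expression tool is used.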
Thank you, this was confusing me.
If finding DEGs is the goal of this analysis, then one of the first steps in the process is correction for library size, so whether a sample has one, two or more technical replicates is not an issue. Much more critical is that you have biological replicates for each condition!
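To illustrate why library size correction makes the number of technical replicates irrelevant, here is a rough sketch with made-up counts. It sums the counts of technical replicates (equivalent in effect to concatenating the reads before counting) and then scales to counts per million (CPM). The gene names, sample names and numbers are hypothetical, and CPM is used only as the simplest form of library size scaling; differential expression tools typically apply their own normalization to the raw counts.

```python
# Hypothetical raw counts per gene. sample_A was sequenced twice
# (two technical replicates), sample_B only once.
tech_counts = {
    "sample_A": [{"gene1": 100, "gene2": 300}, {"gene1": 110, "gene2": 290}],
    "sample_B": [{"gene1": 50, "gene2": 150}],
}


def sum_technical_replicates(runs):
    """Add up the per-gene counts of all runs belonging to one biological sample."""
    total = {}
    for run in runs:
        for gene, n in run.items():
            total[gene] = total.get(gene, 0) + n
    return total


def counts_per_million(counts):
    """Scale counts by the sample's library size (total counts) to CPM."""
    library_size = sum(counts.values())
    return {gene: n * 1e6 / library_size for gene, n in counts.items()}


pooled = {s: sum_technical_replicates(runs) for s, runs in tech_counts.items()}
cpm = {s: counts_per_million(c) for s, c in pooled.items()}

for sample in pooled:
    print(sample, "library size:", sum(pooled[sample].values()), "CPM:", cpm[sample])
```

In this toy example sample_A ends up with four times as many reads as sample_B, but after scaling both samples sum to one million, so the relative expression of each gene is directly comparable.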
Thank you, I'll go ahead and concatenate the files.