Hello. Since I'm not from an English-speaking country, please excuse my lack of proficiency in English.
I'm currently conducting an RNA-seq analysis. I used kallisto for mapping and quantification and obtained counts. The experiment has biological triplicates for both wild-type and mutant samples. I normalized with DESeq2 and obtained a list of differentially expressed genes. However, for some of these genes the raw counts differ by more than 200-fold between biological replicates (probably because of problems with environmental control during sampling). In fact, when I made a PCA plot, the groups did not separate clearly.
When facing this kind of issue, could additional normalization beyond DESeq2 help? I have TPM values available, and I know that DESeq2 requires raw counts as input. However, even after DESeq2 normalization, the counts still differ by more than 200-fold. Would running DESeq2 on the TPM values solve my problem? The TPM values also differ by more than 200-fold, though. Since DEG analysis relies on raw counts, I feel that TPM is not useful here. What can TPM be used for?
Or would it be better to simply exclude the WT3 data and reanalyze?
Here is a picture showing some of these values (TPM).
Since you are comparing mutant vs wild-type, using DESeq2 is the right thing to do. Rather than showing a small table of numbers, you should make a volcano plot and an MA plot so you can visualize the fold changes; this is a good way to check whether everything looks reasonable. I'm not sure why you are focusing so much on the counts; it's the fold changes and p-values (both returned by DESeq2) that you should be interested in.
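To make the plotting suggestion concrete, here is a minimal sketch of how the MA and volcano coordinates can be derived from a DESeq2 results table. The column names `baseMean`, `log2FoldChange` and `padj` are the ones DESeq2 reports; the three rows, the gene names and the cutoffs (0.05 and 1.0) are made up for illustration, and in real output `padj` can be NA, which this sketch does not handle.

```python
import math

# Hypothetical rows copied from a DESeq2 results table:
# (gene, baseMean, log2FoldChange, padj)
results = [
    ("geneA", 1500.0, 2.5, 1e-8),
    ("geneB", 20.0, -3.0, 0.20),
    ("geneC", 300.0, 0.1, 0.90),
]

def plot_coordinates(rows, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return MA and volcano coordinates plus a significance flag per gene."""
    out = []
    for gene, base_mean, lfc, padj in rows:
        out.append({
            "gene": gene,
            # MA plot: mean expression (log scale) on x, log2 fold change on y
            "ma": (math.log10(base_mean), lfc),
            # Volcano plot: log2 fold change on x, -log10(adjusted p) on y
            "volcano": (lfc, -math.log10(padj)),
            # Illustrative cutoffs, not a recommendation
            "significant": padj < padj_cutoff and abs(lfc) >= lfc_cutoff,
        })
    return out

coords = plot_coordinates(results)
for c in coords:
    print(c["gene"], c["ma"], c["volcano"], c["significant"])
```

A gene like the hypothetical geneB (large fold change, low mean, high adjusted p-value) is the kind of point that stands out on these plots when replicates disagree.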
And no, don't use TPM values. TPM values are only good when comparing genes within a single sample. You are comparing between six different samples, so clearly TPM does not apply.
Have you run FastQC and checked the mapping efficiency of WT3?
No, DESeq2 should only be run on raw (un-normalized) counts.
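For intuition about what the normalization step does with those raw counts, here is a rough pure-Python sketch of the median-of-ratios idea behind DESeq2's size factors. It is a simplified illustration, not DESeq2's actual code; the sample names and counts are invented.

```python
import math
from statistics import median

def size_factors(counts):
    """counts: dict of sample -> list of raw counts (same gene order)."""
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    log_ratios = {s: [] for s in samples}
    for g in range(n_genes):
        row = [counts[s][g] for s in samples]
        if any(c == 0 for c in row):
            continue  # geometric mean would be zero, so the gene is skipped
        log_geo_mean = sum(math.log(c) for c in row) / len(row)
        for s, c in zip(samples, row):
            log_ratios[s].append(math.log(c) - log_geo_mean)
    # One factor per sample: median ratio to the per-gene geometric mean
    return {s: math.exp(median(log_ratios[s])) for s in samples}

# Invented example: WT2 was sequenced twice as deeply as WT1
raw = {
    "WT1": [100, 200, 50, 0],
    "WT2": [200, 400, 100, 10],
}
sf = size_factors(raw)
normalized = {s: [c / sf[s] for c in raw[s]] for s in raw}
print(sf)
print(normalized)
```

Because each sample is divided by a single factor, this corrects for sequencing depth only. A gene that differs 200-fold between two replicates for biological or handling reasons will still differ by roughly that much afterwards, so extra normalization is unlikely to remove that kind of variability.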
TPM should only be used to compare genes within the same sample. For example, from your data, you can use TPM to say that gene1 has four times the expression of gene2 in WT3. But you cannot compare the TPM values of gene1 between samples WT2 and WT3. HBC Training published a guide on normalization that covers the various units quite well.
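A small sketch may help show why TPM is a within-sample unit. The counts and lengths below are invented, and plain transcript lengths are used for simplicity (kallisto reports effective lengths, so treat that as an assumption of this example).

```python
def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize, then rescale to sum to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

lengths = [1000, 2000, 1000]      # gene1, gene2, gene3 (invented)
wt2_counts = [100, 200, 100]
wt3_counts = [100, 200, 10000]    # only gene3 differs between the samples

tpm_wt2 = tpm(wt2_counts, lengths)
tpm_wt3 = tpm(wt3_counts, lengths)
print(tpm_wt2)
print(tpm_wt3)
```

In both samples gene1 and gene2 have equal TPM, so the within-sample comparison holds. Between samples, however, gene1 has the same raw count but a much lower TPM in the invented WT3, purely because gene3 takes up a larger share of the fixed total of one million. That is the reason TPM values of one gene should not be compared across WT2 and WT3.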