5.5 years ago
Cece
Hi All,
I have a counts matrix from Rsubread/featureCounts that I've used in DESeq2 for differential gene expression analysis. I've noticed that the HB (haemoglobin) genes are skewing my analysis, so I'd like to remove them. From reading around, it appears that I can remove these from the counts matrix before normalization and continue with my analysis. However, I'd like to recalculate the read depth for the remaining genes before I continue. Can anyone suggest a tool to do this in R?
Thanks in advance.
Can you define what you mean by "recalculate"? The counts that you currently have should be raw counts. You may mean running featureCounts again with or without multi-mapping reads, on the assumption that the haemoglobin genes are 'stealing' some reads from other transcripts?
Yes, my counts are raw counts. Excuse me if I'm using the wrong language; I guess what I'm asking is if there is a way to summarize read depth for my counts in R after removing the HB genes. Alternatively, is the best way to do this to somehow remove the HB genes during mapping so that when I summarize counts, I'm already missing these? I want to know what impact removing these genes has on my mean read depth.
Why not just subtract whatever maps to HB from the total? What kind of summarization are you going for? With DESeq2 you can just exclude it from the DESeqDataSet.
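A minimal base R sketch of the subtraction suggested above. The toy matrix, its values, and the regular expression used to pick out haemoglobin genes are assumptions for illustration (it presumes human gene symbols as row names); with Ensembl IDs or another species, an explicit list of gene IDs would be needed instead.

```r
# Toy raw-count matrix: genes in rows, samples in columns (values are made up).
counts <- matrix(
  c(500, 800,   # HBB
    300, 450,   # HBA1
    120,  90,   # ACTB
     60,  75,   # GAPDH
     10,  20),  # HBEGF (not a haemoglobin gene, so it should be kept)
  ncol = 2, byrow = TRUE,
  dimnames = list(c("HBB", "HBA1", "ACTB", "GAPDH", "HBEGF"),
                  c("sample1", "sample2"))
)

# Assumed pattern for human haemoglobin gene symbols (HBA1, HBB, HBG2, ...).
is_hb <- grepl("^HB[ABDEGMQZ][0-9]*$", rownames(counts))

total_all    <- colSums(counts)                         # assigned reads per sample
total_hb     <- colSums(counts[is_hb, , drop = FALSE])  # reads assigned to HB genes
total_non_hb <- total_all - total_hb                    # depth left after removal
hb_fraction  <- total_hb / total_all                    # share of reads removed

# Filtered matrix that could then be used to build the DESeqDataSet.
counts_no_hb <- counts[!is_hb, , drop = FALSE]
```

Here `total_non_hb` gives the per-sample depth after removal, and `hb_fraction` shows how much of each library the HB genes were taking up.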
For summarization, I want to check whether my mean read depth is still at least 30 million. I'm not sure how to do that from the counts matrix. Previously, I used the FastQC/MultiQC summaries to check read depth.
You'll need to lower that threshold to ~20 million if you start from a counts matrix, since it only contains reads that were assigned to genes; that would approximately match 30 million sequenced reads.
Hi Devon,
Thanks for the response, but I still don't understand. I've removed the HB genes from my counts matrix but still have no idea how to go about checking the read depth of my matrix.
Sum each sample.
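In R, summing each sample amounts to taking column sums of the counts matrix. A small sketch, assuming genes are in rows and samples in columns; the matrix name `counts_no_hb`, the gene and sample names, and the values are made up for illustration, and the 20 million cutoff is the approximate figure mentioned earlier in the thread.

```r
# Toy filtered counts (HB genes already removed); values are made up.
counts_no_hb <- matrix(
  c(12e6, 9e6, 15e6,
    10e6, 8e6, 11e6),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

depth          <- colSums(counts_no_hb)  # assigned reads per sample
depth_millions <- depth / 1e6            # same values, in millions
mean_depth     <- mean(depth)            # mean depth across samples

# Samples falling under the approximate 20 million assigned-read threshold.
below <- names(depth)[depth < 20e6]
```

Printing `depth_millions` gives one number per sample, and `below` lists any samples that drop under the threshold once the HB genes are gone.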
Oh, so simple. Now I feel somewhat simple! Thanks so much for clearing this up for me :)