Remove HB genes and recalculate read depth

5.9 years ago

Cece ▴ 30

Hi All,

I have a counts matrix from Rsubread/featureCounts which I've used in DESeq2 for differential gene expression. I've noticed that the HB genes are skewing my analysis, so I'd like to remove them. From reading around, it appears that I can remove these from the counts matrix before normalization and then carry on. However, I'd like to recalculate read depth for the remaining genes before I do. Can anyone suggest a tool to perform this in R?

Thanks in advance.

rnaseq hemoglobin read depth • 1.7k views

ADD COMMENT • link 5.9 years ago by Cece ▴ 30

Can you define what you mean by "recalculate"? The counts that you currently have should be raw counts. You may mean running featureCounts again with / without multi-mapping, on the assumption that the [haemo]-globin genes are 'stealing' some reads from other transcripts (?)

ADD REPLY • link 5.9 years ago by Kevin Blighe 89k

Yes, my counts are raw counts. Excuse me if I'm using the wrong language; I guess what I'm asking is if there is a way to summarize read depth for my counts in R after removing the HB genes. Alternatively, is the best way to do this to somehow remove the HB genes during mapping so that when I summarize counts, I'm already missing these? I want to know what impact removing these genes has on my mean read depth.

ADD REPLY • link 5.9 years ago by Cece ▴ 30

Why not just subtract whatever maps to HB from the total? What kind of summarization are you going for? With DESeq2 you can just exclude it from the DESeqDataSet.

ADD REPLY • link 5.9 years ago by Devon Ryan 105k
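As a rough illustration of "subtract whatever maps to HB from the total", here is a minimal base R sketch. The toy matrix, the sample names and the `hb_genes` vector are assumptions made for the example: it presumes gene symbols as row names and a human annotation, so the list would need adapting for Ensembl IDs or another species.

```r
# Toy raw-count matrix: genes in rows, samples in columns (made-up values).
counts <- matrix(
  c(500, 800,
    300, 100,
    120, 150,
     80,  50),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("HBA1", "HBB", "ACTB", "GAPDH"),
                  c("sampleA", "sampleB"))
)

# Assumed list of human haemoglobin gene symbols; adjust to your annotation.
hb_genes <- c("HBA1", "HBA2", "HBB", "HBD", "HBE1",
              "HBG1", "HBG2", "HBM", "HBQ1", "HBZ")

# Logical index of the rows that are HB genes.
is_hb <- rownames(counts) %in% hb_genes

# Reads assigned to HB genes per sample, and their share of all assigned reads.
hb_reads    <- colSums(counts[is_hb, , drop = FALSE])
hb_fraction <- hb_reads / colSums(counts)

# Counts matrix without the HB rows.
counts_no_hb <- counts[!is_hb, , drop = FALSE]
```

The same kind of row index should also work on an existing DESeqDataSet (for example `dds[!rownames(dds) %in% hb_genes, ]`, where `dds` is a hypothetical object name), since it supports row subsetting like other SummarizedExperiment objects.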

For summarization, I want to check whether my mean read depth is still at least 30 million. I'm not sure how to do that from the counts matrix. Previously, I used the FastQC/MultiQC summaries to check read depth.

ADD REPLY • link 5.9 years ago by Cece ▴ 30

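You'll need to decrease that threshold to ~20 million if you start with a counts matrix, since not every sequenced read ends up assigned to a gene; ~20 million counted reads would approximately match 30 million raw reads.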

ADD REPLY • link 5.9 years ago by Devon Ryan 105k

Hi Devon,

Thanks for the response but I still don't understand. I've removed the HB genes from my counts matrix but still have no idea how I go about checking what the read depth of my matrix is.

ADD REPLY • link 5.9 years ago by Cece ▴ 30

Sum each sample.

ADD REPLY • link 5.9 years ago by Devon Ryan 105k
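In R, "sum each sample" comes down to `colSums()` on the counts matrix (genes in rows, samples in columns). The sketch below uses made-up numbers and hypothetical sample names; `min_depth` is set to the ~20 million figure mentioned above for a counts matrix and is only an illustrative threshold.

```r
# Toy matrix standing in for the HB-filtered raw counts (made-up values).
counts_no_hb <- matrix(
  c(12e6, 9e6,
    10e6, 8e6,
     3e6, 1e6),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("sampleA", "sampleB"))
)

# Per-sample depth: total reads assigned to the remaining genes.
depth <- colSums(counts_no_hb)

# Mean depth across samples.
mean_depth <- mean(depth)

# Illustrative threshold for counted reads (assumption, see lead-in).
min_depth <- 20e6

# Names of samples that fall below the threshold.
below <- names(depth)[depth < min_depth]
```

Running the same `colSums()` on the matrix before and after dropping the HB rows shows how much depth each sample loses to the HB genes.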

Oh, so simple. Now I feel somewhat simple! Thanks so much for clearing this up for me :)

ADD REPLY • link 5.9 years ago by Cece ▴ 30