Dealing with NAs in microbiome transcriptome count data for differential expression


6.2 years ago

MAPK ★ 2.1k

I am analyzing microbiome data from human gut samples and want to run a DESeq2 analysis, but my count matrix contains a lot of NAs. The NAs arise because one sample may contain a particular group of microorganisms that is absent from, or only partially present in, the other samples. I don't normally get NAs when analyzing RNA-seq data from a single species, but for this microbiome data I get many. How should I deal with them? Should I remove every row with one or more NAs from the count matrix, or replace the NAs with zero (since those organisms may simply not be part of that sample's microbiome)? If I remove the rows with NAs, I will be left with only 5 rows, because very few loci are shared across all samples. Any suggestions would be appreciated. Thanks.

RNA-Seq Deseq2 microbiome • 1.5k views



What do you mean by NAs? Isn't this count data? Shouldn't these be zeros from the beginning? If something is not present in a sample, then mapping and counting should give you zero counts for it, not NAs.
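To make the two options from the question concrete, here is a minimal Python sketch (not DESeq2 code) that compares dropping incomplete rows with replacing NAs by zero. The contig names and counts are invented for illustration, `None` stands in for NA, and the zero-fill is only appropriate under the assumption that an NA really means "no reads were counted for this contig in this sample".

```python
# Hypothetical toy count matrix: rows are contigs, columns are samples.
# None stands in for NA (no count was reported for that contig in that sample).
counts = {
    "contig_1": [12, None, 7],
    "contig_2": [None, None, 3],
    "contig_3": [5, 9, 4],
}


def drop_incomplete_rows(matrix):
    """Option 1: keep only the contigs that have a count in every sample."""
    return {
        contig: row
        for contig, row in matrix.items()
        if all(value is not None for value in row)
    }


def fill_missing_with_zero(matrix):
    """Option 2: treat a missing count as zero reads (an assumption)."""
    return {
        contig: [0 if value is None else value for value in row]
        for contig, row in matrix.items()
    }


complete_only = drop_incomplete_rows(counts)
zero_filled = fill_missing_with_zero(counts)

print(len(complete_only), "contigs survive row removal")
print(len(zero_filled), "contigs survive zero-filling")
```

In this toy case, row removal keeps only the one contig shared by every sample, while zero-filling keeps all three, which mirrors the "only 5 rows left" problem described in the question.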

6.2 years ago by h.mon 35k


I mapped the reads to Trinity-assembled contigs built from all of the metatranscriptome data, so each sample maps to a different subset of the assembled contigs. In other words, one sample has genes that are not present in the other samples.
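One plausible way such NAs appear is when per-sample count tables, each listing only the contigs that sample hit, are merged into a single matrix. The sketch below is an assumption about that workflow, not a description of the actual pipeline used here: it builds the matrix over the union of all contigs and records a zero wherever a sample has no entry. Sample names, contig names, and counts are hypothetical.

```python
# Hypothetical per-sample count tables: each sample lists only the contigs
# that received at least one read, so the sets of contigs differ by sample.
per_sample_counts = {
    "sample_A": {"contig_1": 12, "contig_3": 5},
    "sample_B": {"contig_3": 9},
    "sample_C": {"contig_1": 7, "contig_2": 3, "contig_3": 4},
}


def build_count_matrix(per_sample):
    """Merge per-sample tables over the union of contigs, zero-filling gaps.

    Returns the ordered sample names and a dict mapping each contig to its
    list of counts, in the same order as the sample names.
    """
    samples = sorted(per_sample)
    contigs = sorted(
        {contig for table in per_sample.values() for contig in table}
    )
    matrix = {
        contig: [per_sample[sample].get(contig, 0) for sample in samples]
        for contig in contigs
    }
    return samples, matrix


samples, matrix = build_count_matrix(per_sample_counts)

print("contig", *samples, sep="\t")
for contig, row in matrix.items():
    print(contig, *row, sep="\t")
```

Because every contig in the shared assembly gets a row and every sample gets a column, the resulting matrix contains integers only, with zeros where a sample had no reads assigned to a contig.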

6.2 years ago by MAPK ★ 2.1k
