4.7 years ago
Assa Yeroslaviz
I know I can filter my counts matrix using this command:
filtered.counts <- counts[rowSums(counts==0)<3, ]
when I want to keep genes that have zero counts in fewer than three samples.
But is there a way to do the same and remove rows from the matrix only when these three zeros all fall within a single condition?
I have two conditions with four replicates each. I would like to keep genes with counts in at least two of the replicates.
Would this kind of filtering make sense? Or would it introduce a bias into the expression matrix?
Thanks, Assa
You could simply use something like filterByExpr from edgeR. I would keep the rows (genes) where one condition has all zeros while the rest have non-zero values. Depending on the sequencing depth across the different samples/conditions, such a gene might simply be under- or over-represented in one condition compared with the others. And yes, sample-specific filtering might introduce biases in the downstream steps.
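To make the condition-aware filter concrete, here is a minimal sketch in base R. The toy matrix, the `group` factor, and the `min.reps` threshold are assumptions made up for illustration and are not from the original post. The rule shown (keep a gene if at least one condition has non-zero counts in at least `min.reps` replicates) is one possible reading of the question, so adjust it to your design. The optional last step calls edgeR's filterByExpr with its default thresholds and only runs if edgeR is installed.

```r
# Toy counts matrix: 4 genes x 8 samples (values are invented for illustration)
counts <- matrix(
  c(5, 3, 0, 7,   0, 0, 0, 0,    # geneA: detected only in condition A
    0, 0, 0, 2,   0, 1, 0, 0,    # geneB: one non-zero replicate per condition
    4, 6, 2, 9,   3, 8, 5, 1,    # geneC: detected in every sample
    0, 0, 0, 0,   0, 0, 0, 0),   # geneD: never detected
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]),
                  c(paste0("A", 1:4), paste0("B", 1:4)))
)

# Assumed design: two conditions with four replicates each
group <- factor(rep(c("A", "B"), each = 4))

# Assumed threshold: minimum number of non-zero replicates within a condition
min.reps <- 2

# For each condition, count the replicates with a non-zero count per gene
nonzero.by.group <- sapply(levels(group), function(g) {
  rowSums(counts[, group == g, drop = FALSE] > 0)
})

# Keep a gene if at least one condition reaches the threshold
keep <- rowSums(nonzero.by.group >= min.reps) >= 1
filtered.counts <- counts[keep, , drop = FALSE]

# Optional: edgeR's built-in filter (default thresholds), if edgeR is available
if (requireNamespace("edgeR", quietly = TRUE)) {
  keep.edgeR <- edgeR::filterByExpr(counts, group = group)
}
```

With this rule, a gene that is silent in one condition but detected in the other (geneA in the toy data) is retained, which matches the advice above, while genes with only scattered single counts are dropped.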