filtering most variable genes
3.0 years ago
synat.keam ▴ 100

Dear All,

I am trying to filter the most variable genes for my specific analysis. I have normalized counts from DESeq2 (attached here), with genes in rows and sample IDs in columns.


I found the following code chunk on Biostars:

data$variance <- apply(data, 1, var)
data2 <- data[data$variance >= quantile(data$variance, c(.50)), ] # 50% most variable genes
data2$variance <- NULL
summary(data2)

This code chunk worked perfectly when I excluded the gene column from the data frame before filtering. However, the new count matrix after filtering then has no gene IDs, because I excluded the gene column. Once I add the gene column back before filtering, the code no longer works, possibly because the gene column is a factor.

Does anyone know how I could modify this code chunk so that the gene ID column is retained in the new count matrix after filtering? If anyone could suggest a better way and provide a code chunk, I would highly appreciate it. I understand this is a coding problem, but I could not make it work for the whole day! Sorry for the many posts recently; I'm still learning RNA-seq data analysis.

Looking forward to your responses, and thanks in advance for your help.

Kind Regards,

synat

RNAseq • 2.0k views

Are your gene names not the rownames of data?
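A minimal sketch of what this comment points at, assuming the gene IDs can be moved into the row names of a numeric matrix. The toy gene names, sample names, and counts below are made up for illustration and are not the poster's data.

```r
# Hypothetical toy table: one ID column plus numeric sample columns
data <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  s1 = c(10, 5, 100, 7),
  s2 = c(12, 5, 300, 11),
  s3 = c(11, 5, 200, 9)
)

# Keep only the numeric columns and carry the IDs as row names
counts <- as.matrix(data[, -1])
rownames(counts) <- as.character(data$gene)

# Row-wise variance, then keep genes at or above the median variance
vars <- apply(counts, 1, var)
top <- counts[vars >= quantile(vars, 0.5), , drop = FALSE]

rownames(top)  # gene IDs survive the filtering as row names
```

With the IDs stored as row names, the matrix stays purely numeric, so no column has to be excluded before computing the variances.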

3.0 years ago
Papyrus ★ 3.0k

If you really want to keep the first column with the gene IDs and stick with your current approach, you can do:

vars <- apply(data[,-1], 1, var)
data2 <- data[vars >= quantile(vars,0.5),]

The trick is to omit the first column, which is not numeric, when computing the row-wise variances (that is what data[,-1] does). The filtering itself is then applied to the full data frame, so the gene ID column is kept.
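To see why the ID column gets in the way, here is a small self-contained sketch on made-up data (the gene names and counts are hypothetical). It relies on the fact that apply() converts a data frame with as.matrix(), and a single non-numeric column makes the whole matrix character.

```r
# Hypothetical toy data: a factor ID column plus numeric sample columns
data <- data.frame(
  gene = factor(c("geneA", "geneB", "geneC", "geneD")),
  s1 = c(10, 5, 100, 7),
  s2 = c(12, 5, 300, 11),
  s3 = c(11, 5, 200, 9)
)

# With the ID column included, the matrix that apply() sees is character
with_ids <- is.character(as.matrix(data))
# With the ID column dropped, it is numeric
without_ids <- is.numeric(as.matrix(data[, -1]))

# Variances from the numeric columns only, filter applied to the full table
vars <- apply(data[, -1], 1, var)
data2 <- data[vars >= quantile(vars, 0.5), ]

data2  # still contains the gene column
```

Because the variances live in a separate vector instead of a new column, there is also no need to remove a variance column afterwards.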


Dear Papyrus,

Your code worked perfectly. Thank you for your great help and prompt response. You are a legend!

Kind Regards,

Synat
