[R] Removing columns from big.matrix which have only one value
6.6 years ago

I have a very large binary matrix (5 million columns and 100 rows), stored as a big.matrix to conserve memory. As an ordinary matrix it takes over 2 GB. A small reproducible example:

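library(bigmemory)

n_row <- 100
n_col <- 10000  # small stand-in for the real 5 million columns
m4 <- matrix(sample(0:1, n_row * n_col, replace = TRUE), n_row, n_col)
m4 <- cbind(m4, 1)  # append one constant column that should be removed
m4 <- bigmemory::as.big.matrix(m4)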

I need to remove every column which has only one unique value (in this case, only 0s or only 1s). Because of the number of columns, I want to be able to do this in parallel.
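One possible way to do the per-column check without loading the whole matrix is to scan it in chunks of columns. The sketch below is an illustration only: the helper name `varying_cols`, the chunk size, and the toy data are assumptions, and it assumes the matrix contains no NA values. It relies on `bigmemory::deepcopy()` and its `cols` argument to build the filtered big.matrix.

```r
library(bigmemory)

# Toy data: 6 rows, 5 columns; columns 2 and 5 are constant.
x <- cbind(c(0, 1, 0, 1, 0, 1), 0, c(1, 1, 0, 0, 1, 0), c(0, 0, 0, 0, 0, 1), 1)
bm <- as.big.matrix(x)

# Hypothetical helper: returns the indices of columns with more than one value.
varying_cols <- function(bm, chunk_size = 1000L) {
  starts <- seq(1L, ncol(bm), by = chunk_size)
  keep <- lapply(starts, function(s) {
    idx <- s:min(s + chunk_size - 1L, ncol(bm))
    block <- bm[, idx, drop = FALSE]  # ordinary matrix holding one chunk only
    # A column varies if any entry differs from the entry in its first row.
    differs <- sweep(block, 2, block[1, ], FUN = "!=")
    idx[colSums(differs) > 0]
  })
  unlist(keep)
}

keep <- varying_cols(bm, chunk_size = 2L)
filtered <- deepcopy(bm, cols = keep)  # new big.matrix with only the kept columns
```

Only one chunk is held in RAM as a regular matrix at a time, so peak memory is controlled by `chunk_size` rather than by the total number of columns.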

How can I accomplish this while keeping the data stored as a big.matrix? I can convert it into a data frame and loop over the columns, counting the unique values in each, but this takes too much RAM.
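For the parallel part, a common pattern with bigmemory is to pass a descriptor to each worker and re-attach the same matrix there, so the data itself is not copied. The following is a rough sketch under several assumptions: the big.matrix is shared (the bigmemory default), there are no NA values, and the two-worker cluster, chunk size, and toy data are chosen purely for illustration.

```r
library(bigmemory)
library(parallel)

# Toy data: 6 rows, 5 columns; columns 2 and 5 are constant.
x <- cbind(c(0, 1, 0, 1, 0, 1), 0, c(1, 1, 0, 0, 1, 0), c(0, 0, 0, 0, 0, 1), 1)
bm <- as.big.matrix(x)
desc <- describe(bm)  # lightweight descriptor that workers can attach to

# Split the column indices into chunks of (at most) 2 columns each.
cols <- seq_len(ncol(bm))
chunks <- split(cols, ceiling(cols / 2))

cl <- makeCluster(2)
keep_list <- parLapply(cl, chunks, function(idx, desc) {
  bm <- bigmemory::attach.big.matrix(desc)  # attach to shared data, no copy
  block <- bm[, idx, drop = FALSE]
  # Keep columns where any entry differs from the first row's entry.
  idx[colSums(sweep(block, 2, block[1, ], FUN = "!=")) > 0]
}, desc = desc)
stopCluster(cl)

keep_par <- unlist(keep_list, use.names = FALSE)
filtered_par <- deepcopy(bm, cols = keep_par)
```

Each worker reads only its own chunk into an ordinary matrix, and the final `deepcopy()` runs once in the main session.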

Thanks!

EDIT: This is bioinformatics because each column is actually a protein subsequence. I am running Fisher's exact test to select important features, but before that I must remove features that are present in all samples (or in none).
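To illustrate the downstream step mentioned in the edit, here is a minimal sketch of running Fisher's exact test per feature on an ordinary matrix chunk. The `group` labels and the toy data are hypothetical, since the post does not say how samples are grouped.

```r
set.seed(1)

# Toy chunk: 20 samples (rows) by 4 binary features (columns).
features <- matrix(sample(0:1, 20 * 4, replace = TRUE), nrow = 20)

# Hypothetical sample labels; the real grouping is not given in the post.
group <- factor(rep(c("case", "control"), each = 10))

# One 2x2 table (feature absent/present by group) and one test per feature.
p_values <- apply(features, 2, function(col) {
  tab <- table(factor(col, levels = 0:1), group)
  fisher.test(tab)$p.value
})
```

Fixing the factor levels to `0:1` keeps every table 2x2 even when a feature is rare in a given chunk.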


This is purely an R question. How is it bioinformatics?


Hello jackarnestad!

We believe that this post does not fit the main topic of this site.

Please tell us how this is related to bioinformatics and we will reopen the question.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree, please tell us why in a reply below. We'll be happy to talk about it.

Cheers!


I addressed the bioinformatics aspect in my edit. Thanks!


Thanks for clarifying. This is indeed a question applied to bioinformatics, but R questions like this might get a quicker answer on the Bioconductor support site or Stack Overflow. Someone here may still be able to help, so let's wait a bit before cross-posting.


Could you include in your code the package where big.matrix is defined?


Added it to the code: the package is bigmemory.
