How to select values based on specific condition from the matrix in R
9.5 years ago
MAPK ★ 2.1k

Hi Guys,

I have a large matrix, mymatrix, shown below. Is there a way to get, for each position, only the nucleotides that have values (i.e. the ones that are not NA), sorted in decreasing order, as either a list or a matrix? For example, I would like the result in one of these formats:

In the form of matrix:

pos 1611111     T(17)  C(1)
pos 99022222        G(24)      A(3)

or in the form of list

pos 1611111
T                    C
17                   1

pos 99022222
G                    A
24                   3

and so forth. Thank you.

mymatrix

pos        A   C   G   T   N
1611111    NA  1   NA  17  NA
99022222   3   NA  24  NA  NA
99092333   NA  5   NA  91  NA
233232333  2   22  NA  NA  NA

How large of a matrix are we talking here? An efficient solution might be needed if it is too large. Otherwise, this problem is relatively easy and I will answer it when you reply.


It's a fairly large matrix. Thank you!


What dimensions? dim(mymatrix)


Right now my matrix is 6023 by 8.


OK, that's not too bad. I will write a quick answer for it in a sec.


Thank you, I would really appreciate that!

9.5 years ago
Steven Lakin ★ 1.8k

This is not pretty code by any means, but it should work for you: it writes the result in tab-delimited format to the file you name (relative to your working directory). You can then read that back into R using read.table() with sep="\t". I didn't include the decreasing order since it is late here, but with a little imagination you could probably add it.

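transformMyMatrix <- function(mymatrix, outputFile) {
        # Note: append=TRUE adds to outputFile, so delete the file before re-running
        for(i in seq_len(nrow(mymatrix))) {
                # First field: the position, formatted as pos(<position>)
                temp <- paste0("pos(", mymatrix[i, "pos"], ")")
                # Assumes "pos" is the first column and the rest hold counts
                for(j in 2:ncol(mymatrix)) {
                        if(!is.na(mymatrix[i,j])) {
                                # colnames() works for both matrices and data frames
                                temp <- c(temp, paste0(colnames(mymatrix)[j], "(", mymatrix[i,j], ")"))
                        }
                }
                # Write this row as a single tab-separated line
                write.table(t(as.matrix(temp)), file=outputFile, sep="\t", append=TRUE, quote=FALSE, row.names=FALSE, col.names=FALSE)
        }
}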

Then call the function:

transformMyMatrix(mymatrix, "outputFile.txt")

For example, here is what I get with that:

mymatrix
        pos  A  C  G  T  N
1   1611111 NA  1 NA 17 NA
2  99022222  3 NA 24 NA NA
3  99092333 NA  5 NA 91 NA
4 233232333  2 22 NA NA NA
transformMyMatrix(mymatrix, outputFile="newMatrix.txt")
newMatrix <- read.table(file="newMatrix.txt", sep="\t")
newMatrix
              V1   V2    V3
1   pos(1611111) C(1) T(17)
2  pos(99022222) A(3) G(24)
3  pos(99092333) C(5) T(91)
4 pos(233232333) A(2) C(22)
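
The function above leaves out the decreasing order. Here is a minimal sketch of one way to add it, working in memory and returning a list instead of writing a file. The names sortedCounts and posCol are hypothetical (they are not from the code above), and the sketch assumes the position column is called pos and every other column holds counts:

```r
# Small example data frame mirroring the one shown in the question.
mymatrix <- data.frame(
  pos = c(1611111L, 99022222L, 99092333L, 233232333L),
  A = c(NA, 3, NA, 2),
  C = c(1, NA, 5, 22),
  G = c(NA, 24, NA, NA),
  T = c(17, NA, 91, NA),
  N = rep(NA_real_, 4)
)

# Hypothetical helper: for each row, keep the non-NA counts and sort them
# in decreasing order. Returns a list named by position.
sortedCounts <- function(x, posCol = "pos") {
  # Everything except the position column is treated as a count column.
  counts <- as.matrix(x[, setdiff(colnames(x), posCol), drop = FALSE])
  result <- lapply(seq_len(nrow(counts)), function(i) {
    row <- counts[i, ]
    sort(row[!is.na(row)], decreasing = TRUE)
  })
  names(result) <- format(x[, posCol], scientific = FALSE, trim = TRUE)
  result
}

res <- sortedCounts(mymatrix)

# Optional: build the "T(17)" style labels for one position.
v <- res[["1611111"]]
labels <- paste0(names(v), "(", v, ")")
```

Each element of res is a named numeric vector, so for position 1611111 the T count comes before the C count. The labels line shows how the same vector could be turned into the T(17) C(1) style if the matrix-like layout is preferred.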

Be aware that rows can end up with different numbers of fields. If you read the file back into R as a data frame, read.table() may need fill=TRUE to pad the shorter rows, and you will then have padded cells in those rows, so you might have to address that.
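
If a data frame is not required, one way to sidestep the padding is to read the file back as a list with one character vector per row. This is only a sketch, using a temporary file and made-up lines in the same format as the output above:

```r
# Write two example lines with different numbers of tab-separated fields
# to a temporary file (made-up data, same format as the output above).
path <- tempfile(fileext = ".txt")
writeLines(c("pos(1611111)\tC(1)\tT(17)", "pos(12345)\tA(3)"), path)

# Read the lines back and split each one on tabs.
# Every row keeps its own length, so nothing is padded.
rows <- strsplit(readLines(path), "\t", fixed = TRUE)

# Number of fields per row.
fieldCounts <- lengths(rows)
```

Here rows[[1]] holds three fields and rows[[2]] holds two, so uneven rows stay uneven instead of being filled out.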


Thank you so much!
