Remove inf and NA from data frame
7.9 years ago
1769mkc ★ 1.2k

I have a data frame that contains log2 fold-change values, but it also contains inf and NA values. I searched all over Stack Exchange and tried the solutions there; most of them do not seem to work, and the few that run do not give me the desired output. I need help; it shouldn't be that difficult, I suppose.

My data frame:

GENE_NAME  HSC_VS_CMP   HSC_VS_GMP  HSC_VS_Monocytes
ACTL6A     -0.20084399   0.297430   -0.350876000
ACTR8      -0.2925280   -0.158551    1.10747
AICDA       inf          NA          inf
ANP32B     -0.6549      -0.615725   -0.35858

I get an error like this: "default method not implemented for type 'list'"

So please suggest how I can remove all the inf and NA values from the data frame.

R • 37k views

What have you tried (with code)? Presuming you just want to remove the rows, it's a simple matter of (A) apply(d, 1, function(x) any(is.na(x) | is.infinite(x))) and then (B) subsetting accordingly.


Alternative without using apply:

x[rowSums(is.na(x) | is.infinite(x)) == 0, ]

Note: either way, you may need to take special care of the GENE_NAME column. I suspect you are having trouble with that in your attempts so far (but as mentioned by Devon, please show us the code).
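
A minimal sketch of the row-removal idea that sidesteps the GENE_NAME column by testing only the numeric columns. The data frame below is retyped from the question (with inf written as R's Inf), and the names df, num_cols, keep and clean are made up for illustration:

```r
# Example data retyped from the question; "inf" is written as R's Inf.
df <- data.frame(
  GENE_NAME        = c("ACTL6A", "ACTR8", "AICDA", "ANP32B"),
  HSC_VS_CMP       = c(-0.20084399, -0.2925280, Inf, -0.6549),
  HSC_VS_GMP       = c(0.297430, -0.158551, NA, -0.615725),
  HSC_VS_Monocytes = c(-0.350876, 1.10747, Inf, -0.35858),
  stringsAsFactors = FALSE
)

# Only the numeric columns are tested, so GENE_NAME is left alone.
num_cols <- vapply(df, is.numeric, logical(1))

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so one test covers them all.
# Reduce(`&`, ...) combines the per-column results into one flag per row.
keep <- Reduce(`&`, lapply(df[num_cols], is.finite))

# Keep only the rows in which every numeric value is finite.
clean <- df[keep, ]
print(clean)
```

Because is.finite() is applied column by column, the data frame is never passed to is.infinite() as a whole.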


You can't call is.infinite() on a data frame, which is probably what caused the originally reported error.
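
A small sketch of that difference, using a made-up toy data frame d. It assumes all columns are numeric, so that as.matrix() yields a numeric matrix:

```r
# Toy data frame (hypothetical values) with one NA and one Inf.
d <- data.frame(a = c(1, Inf, 3), b = c(NA, 2, 4))

# is.na() has a data.frame method and returns a logical matrix.
na_mask <- is.na(d)

# is.infinite() has no data.frame method; capture the error message
# instead of stopping the script.
msg <- tryCatch({
  is.infinite(d)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Converting to a matrix first works when every column is numeric.
inf_mask <- is.infinite(as.matrix(d))

# Rows that contain at least one NA or infinite value.
bad_rows <- rowSums(na_mask | inf_mask) > 0
print(d[!bad_rows, ])
```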


You are right! (I didn't do enough testing...) I find this a bit inconsistent, since is.na() works fine. Further searching points to this SO post, where a viable solution is to implement a data.frame method such as is.finite.data.frame, e.g.:

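# S3 method so that is.finite() accepts a data frame.
# Returns one logical per column: TRUE if every value in that column is finite.
is.finite.data.frame <- function(obj) {
    sapply(obj, FUN = function(x) all(is.finite(x)))
}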
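
The method above returns one value per column. For filtering rows, an element-wise mask is usually more convenient. A possible sketch follows; finite_mask is a hypothetical helper name, not something from the SO post, and non-numeric columns are assumed to count as fine:

```r
# Hypothetical helper: element-wise "is this value usable?" mask.
# Non-numeric columns (e.g. gene names) are treated as always OK.
finite_mask <- function(df) {
  cols <- lapply(df, function(col) {
    if (is.numeric(col)) is.finite(col) else rep(TRUE, length(col))
  })
  # Bind the per-column logical vectors into a rows-by-columns matrix.
  do.call(cbind, cols)
}

# Made-up example data.
d <- data.frame(gene = c("A", "B", "C"),
                x = c(1, Inf, 3),
                y = c(NA, 2, 4),
                stringsAsFactors = FALSE)

m <- finite_mask(d)

# Keep rows where no cell is flagged as non-finite.
clean <- d[rowSums(!m) == 0, ]
print(clean)
```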

What is the way to do it, the way you suggested? I know there are multiple ways, but since I'm learning and at the same time have to use them on my data sets, I just have to read and see which one works.

So can you show me how to remove NA, inf and 0, if any, from my data frame in concise code?


I can take out the GENE_NAME column and apply the same


I tried this:

na.zero <- function(x) {
  x[is.na(x)] <- 0
  return(x)
}


What is the desired output?


Well, I want to replace inf and NA with 0.
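
One way this replacement could look on the example data from the question, as a sketch only. The data frame is retyped by hand with inf written as Inf, and df and num_cols are illustrative names:

```r
# Example data retyped from the question; "inf" is written as R's Inf.
df <- data.frame(
  GENE_NAME        = c("ACTL6A", "ACTR8", "AICDA", "ANP32B"),
  HSC_VS_CMP       = c(-0.20084399, -0.2925280, Inf, -0.6549),
  HSC_VS_GMP       = c(0.297430, -0.158551, NA, -0.615725),
  HSC_VS_Monocytes = c(-0.350876, 1.10747, Inf, -0.35858),
  stringsAsFactors = FALSE
)

# Restrict the replacement to numeric columns so GENE_NAME is untouched.
num_cols <- vapply(df, is.numeric, logical(1))

# !is.finite() is TRUE for NA, NaN, Inf and -Inf; replace those with 0.
# Assigning into df[num_cols] keeps the data frame and its row names intact.
df[num_cols] <- lapply(df[num_cols], function(x) replace(x, !is.finite(x), 0))
print(df)
```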

7.9 years ago
ddiez ★ 2.0k

Alternative (borrowed from this SO answer):

# test data
d <- mtcars

# add NAs and Inf.
d[1,1] <- NA
d[2,2] <- NA
d[2,1] <- Inf
d[1,2] <- Inf

head(d)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           NA Inf  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag      Inf  NA  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

# the magic:
d <- do.call(data.frame, lapply(d, function(x) {
  replace(x, is.infinite(x) | is.na(x), 0)
  })
)

# note that you lose the row names.
head(d)
   mpg cyl disp  hp drat    wt  qsec vs am gear carb
1  0.0   0  160 110 3.90 2.620 16.46  0  1    4    4
2  0.0   0  160 110 3.90 2.875 17.02  0  1    4    4
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

How do I keep the row names? Or can I simply make a subset of the original data set and use cbind.data.frame()?


That is inconvenient, isn't it? I would just assign the modified data.frame to a different variable (in the do.call part) and then copy the row names from the original data to the modified one.
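
A sketch of that suggestion, reusing the mtcars test data from the answer; d2 is simply the name chosen here for the modified copy:

```r
# Same test data as in the answer: mtcars with a few NA and Inf values.
d <- mtcars
d[1, 1] <- NA
d[2, 2] <- NA
d[2, 1] <- Inf
d[1, 2] <- Inf

# Store the cleaned result under a new name so the original d is still available.
d2 <- do.call(data.frame, lapply(d, function(x) {
  replace(x, is.infinite(x) | is.na(x), 0)
}))

# Copy the row names over from the original data frame.
rownames(d2) <- rownames(d)
head(d2)
```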


Okay, that's a new concept to me. I will try it and let you know if I am able to do it.


You have half the answer in Devon's answer to your question. He assigns the result from lapply() to d2 instead of d. Also take a look at ?rownames.


To create a pipeable row-name setter, you could replicate the setNames function:

setRownames <- function(x, rn) {
  row.names(x) <- rn
  x
}

Then do d <- do.call(blah blah blah) %>% setRownames(rownames(d))

7.9 years ago

is.na() works on dataframes. For the Infs, lapply() a function:

d2 = lapply(d, function(x) {
    if(any(is.infinite(x))) {
         x[is.infinite(x)] = 0
    }
    return(x)
})
d2 = as.data.frame(d2)
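
Putting the two halves together, a sketch on mtcars with a few NA and Inf values injected. The test data is borrowed from the other answer, and the final is.na() step is an addition here for the case where NA should also become 0:

```r
# Test data: mtcars with a few NA and Inf values injected.
d <- mtcars
d[1, 1] <- NA
d[2, 2] <- NA
d[2, 1] <- Inf
d[1, 2] <- Inf

# Step 1: replace infinite values column by column.
d2 <- lapply(d, function(x) {
  x[is.infinite(x)] <- 0
  x
})
d2 <- as.data.frame(d2)

# Step 2: is.na() works on the whole data frame, so NA can be replaced directly.
d2[is.na(d2)] <- 0
head(d2)
```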

I still see inf in my data frame


When I do it I don't see Inf, so show a reproducible example.


This works for me, no problem (for example, using the dataset in my answer). Note that in the example data you included, infinity is written as "inf", not Inf (R's way). So maybe it is related?


OK, even if your original data had "inf" instead of "Inf", it seems read.table (and probably friends) ignores case (see ?read.table). So it is read properly and shouldn't be a problem (e.g. read.table(text = "inf 0 10")).
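
A quick way to check this on your own installation (a sketch; x is just a throwaway name):

```r
# Read a one-line table in which infinity is spelled in lower case.
x <- read.table(text = "inf 0 10")

# Inspect the column types that read.table chose.
str(x)

# If the lower-case spelling was recognised, V1 is numeric and infinite.
print(is.numeric(x$V1))
print(is.infinite(x$V1))
```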


Some guesswork: I would also check for "inf" and "NA" stored as character strings, assign NA to them, then use complete.cases().
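
A sketch of that check, assuming the value columns were read in as character strings. The data frame raw and the column list value_cols are made up for illustration:

```r
# Hypothetical case: the value columns arrived as character strings.
raw <- data.frame(
  GENE_NAME  = c("ACTL6A", "AICDA"),
  HSC_VS_CMP = c("-0.20084399", "inf"),
  HSC_VS_GMP = c("0.297430", "NA"),
  stringsAsFactors = FALSE
)

# Every column except the gene names is treated as a value column.
value_cols <- setdiff(names(raw), "GENE_NAME")

raw[value_cols] <- lapply(raw[value_cols], function(x) {
  # Convert to numeric; unparseable strings become NA (warnings silenced).
  x <- suppressWarnings(as.numeric(x))
  # Turn anything non-finite (Inf, -Inf, NaN) into NA as well.
  x[!is.finite(x)] <- NA
  x
})

# complete.cases() keeps only the rows without any NA.
clean <- raw[complete.cases(raw), ]
print(clean)
```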
