ttest on a list in R
9.4 years ago
yasjas ▴ 70

Hi everyone,

I have a list (data), and I want to run a t-test on each of its components:

[[1]]
         rep_name rep_family   gene distance Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2 T_score
14732 HERVK22-int        LTR THSD7A        0         5.3682           4.74   4.5634   4.0869 0.72895

[[2]]
         rep_name rep_family    gene distance Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2 T_score
2565  HERVL40-int        LTR  ANKIB1     3238         7.2268         7.3056   7.2132   7.5750 -0.1279
2646  HERVL40-int        LTR  ANKIB1     2879         7.2268         7.3056   7.2132   7.5750 -0.1279
2673  HERVL40-int        LTR  ANKIB1     2355         7.2268         7.3056   7.2132   7.5750 -0.1279
2693  HERVL40-int        LTR  ANKIB1     2051         7.2268         7.3056   7.2132   7.5750 -0.1279
16782 HERVL40-int        LTR PRKAR2B        0         6.4382         2.2347   7.6774   6.6859 -2.8452

[[3]]
      rep_name rep_family   gene distance Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2  T_score
990  L1PA15-16       LINE    CFH        0        11.3796        13.2806   6.9655   7.1840  5.25535
1035 L1PA15-16       LINE    CFH        0        11.3796        13.2806   6.9655   7.1840  5.25535
7078 L1PA15-16       LINE KLHL13   152805         5.3824         4.7007   7.0668   6.8522 -1.91795

[[4]]
      rep_name rep_family     gene distance Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2  T_score
11062    LTR13        LTR SLC22A16        0         1.8176         2.4619   3.0775   2.0936 -0.44580
14322    LTR13        LTR   POLR2J      929         7.6368         8.2278   9.5044   9.8754 -1.75760
22284    LTR13        LTR   CACNG3    11594         2.6007         2.4943   1.8240   1.2075  1.03175

[[5]]
      rep_name rep_family   gene distance Hepatocytes_B1 Hepatocytes_B3 Huh.7_B1 Huh.7_B2 T_score
9719    LTR22A        LTR   CD38        0         5.8011         6.9873   5.7066   2.5126 2.28460
15293   LTR22A        LTR THSD7A        0         5.3682         4.7400   4.5634   4.0869 0.72895

What I tried was:

ttest <- for(i in 1:length(data)){
    var1 <- data[[i]][5:6]
    var2 <- data[[i]][7:8]
    t.test(var1,var2)
}

but it gives me an error. However, when I run the test step by step (for example, on the second element of the list):

var1 <- data[[2]][5:6]
var2 <- data[[2]][7:8]
t.test(var1,var2)

it works perfectly. Does anyone have an idea where I'm going wrong?

Thanks

R • 5.6k views

Firstly:

What error is thrown?

What is the value of i at the point when the error is thrown?

Are the input values all the same for data[[i]]?


It's just giving me NULL as the answer. As for the input, I don't know if it's logical, but I thought i represents each element of the list, like list[[1]], list[[2]], etc.

I don't know if I am making sense


It is giving you NULL because a for loop does not return a value: the loop expression itself evaluates to NULL, so that is what gets assigned to ttest. You would have to use lapply() for that, or manually build your results list inside the for loop.
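For the "build the list manually" route, here is a minimal sketch. The two data frames are a toy stand-in for `data`, with values copied from elements [[2]] and [[3]] of the question, and the `unlist()` calls are my own addition (an assumption, not something from the original code) so that t.test() always receives plain numeric vectors.

```r
# Toy stand-in for `data`: same column layout as in the question
# (columns 5:6 = Hepatocytes, columns 7:8 = Huh.7).
data <- list(
  data.frame(rep_name = "HERVL40-int", rep_family = "LTR",
             gene = c("ANKIB1", "PRKAR2B"), distance = c(3238, 0),
             Hepatocytes_B1 = c(7.2268, 6.4382),
             Hepatocytes_B3 = c(7.3056, 2.2347),
             Huh.7_B1 = c(7.2132, 7.6774),
             Huh.7_B2 = c(7.5750, 6.6859)),
  data.frame(rep_name = "L1PA15-16", rep_family = "LINE",
             gene = c("CFH", "KLHL13"), distance = c(0, 152805),
             Hepatocytes_B1 = c(11.3796, 5.3824),
             Hepatocytes_B3 = c(13.2806, 4.7007),
             Huh.7_B1 = c(6.9655, 7.0668),
             Huh.7_B2 = c(7.1840, 6.8522))
)

# Assigning the loop itself only captures NULL.
loop_value <- for (i in seq_along(data)) i

# Instead, pre-allocate a list and fill one slot per iteration.
results <- vector("list", length(data))
for (i in seq_along(data)) {
  var1 <- unlist(data[[i]][5:6])  # flatten both Hepatocytes columns
  var2 <- unlist(data[[i]][7:8])  # flatten both Huh.7 columns
  results[[i]] <- t.test(var1, var2)
}
```

After the loop, results[[i]] holds the test object for the i-th list element, for example results[[2]]$p.value.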


Thanks, I will try

9.4 years ago
alolex ▴ 960

Try using the lapply() function instead of a for loop. If it still gives an error we would need to know on which list element it gives the error (i.e. which value of i) and what the error is.

lapply(data, function(x){t.test(x[5:6], x[7:8])})
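A slightly expanded sketch of the same idea, in case it helps. It selects the columns by name rather than by position and flattens them with `unlist()`; both choices are assumptions on my part, and `hep_cols`, `huh_cols`, `tests` and `p_values` are just illustrative names. The toy `data` list reuses numbers from the question.

```r
# Toy stand-in for `data`, keeping only the four expression columns.
data <- list(
  data.frame(Hepatocytes_B1 = c(7.2268, 6.4382),
             Hepatocytes_B3 = c(7.3056, 2.2347),
             Huh.7_B1 = c(7.2132, 7.6774),
             Huh.7_B2 = c(7.5750, 6.6859)),
  data.frame(Hepatocytes_B1 = c(11.3796, 5.3824),
             Hepatocytes_B3 = c(13.2806, 4.7007),
             Huh.7_B1 = c(6.9655, 7.0668),
             Huh.7_B2 = c(7.1840, 6.8522))
)

hep_cols <- c("Hepatocytes_B1", "Hepatocytes_B3")
huh_cols <- c("Huh.7_B1", "Huh.7_B2")

# lapply() returns a list with one t.test result per list element.
tests <- lapply(data, function(x) {
  t.test(unlist(x[hep_cols]), unlist(x[huh_cols]))
})

# Pull one number out of each result, here the p-value.
p_values <- sapply(tests, function(res) res$p.value)
```

Selecting by name means the call keeps working if the column order of the data frames changes.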

Well, I tried this earlier, and apparently an error is generated when some of the values are constant (compared to the larger of the two means), so it doesn't work either :-(


That error is expected: if the values are identical, the variance is 0 and the t-statistic cannot be calculated. These tests need to be skipped, because the variance estimate is unreliable.
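One way to skip those tests without stopping the whole run is to catch the error and record NA instead. This is only a sketch: `safe_p_value` is a hypothetical helper name, and the "constant" data frame is made-up toy data, used here just to trigger the "data are essentially constant" error.

```r
# Toy list: one element with varying values (from the question) and one
# made-up element where every value is identical.
data <- list(
  varying  = data.frame(Hepatocytes_B1 = c(11.3796, 5.3824),
                        Hepatocytes_B3 = c(13.2806, 4.7007),
                        Huh.7_B1 = c(6.9655, 7.0668),
                        Huh.7_B2 = c(7.1840, 6.8522)),
  constant = data.frame(Hepatocytes_B1 = 5, Hepatocytes_B3 = 5,
                        Huh.7_B1 = 5, Huh.7_B2 = 5)
)

# Hypothetical helper: return the p-value, or NA if t.test() stops
# with an error (such as "data are essentially constant").
safe_p_value <- function(x) {
  tryCatch(
    t.test(unlist(x[1:2]), unlist(x[3:4]))$p.value,
    error = function(e) NA_real_
  )
}

p_values <- sapply(data, safe_p_value)
```

The NA entries mark the elements that were skipped, so they can be filtered out or inspected afterwards.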


This is correct. I found this post on how to possibly resolve this issue, but it may not be applicable to your situation, so make sure you understand what you are trying to test before following the suggestion.

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA

That's the error, but I think it's because some of the values are constant in some rows.
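A note on the two warnings: they come from mean(), which returns NA when its argument is not numeric or logical. One possible cause (an assumption, the thread does not confirm it) is that mean() ended up with a data-frame slice or a list rather than a numeric vector. A quick way to check what a slice actually is, sketched on a one-row toy slice with values from element [[1]]:

```r
# One-row slice like data[[1]][5:6] from the question.
slice <- data.frame(Hepatocytes_B1 = 5.3682, Hepatocytes_B3 = 4.74)

slice_is_numeric <- is.numeric(slice)          # a data frame is a list of columns
flat_is_numeric  <- is.numeric(unlist(slice))  # flattened to a numeric vector

# mean() on the data frame itself gives NA, with the same warning as above.
mean_of_slice <- suppressWarnings(mean(slice))

# Column classes reveal stray character or factor columns.
column_classes <- sapply(slice, class)
```

If `column_classes` shows anything other than "numeric" for the expression columns, that would be worth fixing before running the tests.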
