Fold change UP and Down in dplyr calculation
7.3 years ago
1769mkc ★ 1.2k

So, this is my data and code:

    gene      HSC           LSC           fc
    ACTR6     0.6032459633  2.24405896    3.7199734379
    APEX1     6.0443930667  5.18402206    0.8576579985
    APOBEC3C  3.5626328667  4.61072292    1.2941897446
    APOBEC3H  -2.952164     -0.398966118  0.135143616
    ARID1A    3.8456664333  2.75774774    0.7171052892
    ARID1B    3.4017756     2.33561938    0.6865883158



    file1$fc <- (file1$LSC / file1$HSC)
    HSC_LSC_fold_Change <- file1 %>%
      mutate(FOLD_UP_DOWN = ifelse(fc > 2.5, "up", "down"))

I'm doing something like this. Am I doing it correctly? Is this the way one would calculate fold change?

Please tell me if I'm doing something wrong.


I don't think your fold change values are correct. Could you explain why you have negative values in your HSC and LSC columns, please? Have those values been log-transformed, and if so, shouldn't the fc column be LSC - HSC?

Note that log(B / A) = log(B) - log(A)
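
A quick way to convince yourself of that identity in R. This is only a sketch: the values A and B are made-up linear-scale numbers, not taken from the posted table, and log2 is assumed as the transform.

```r
# Hypothetical linear-scale expression values (illustrative only)
A <- 2    # e.g. control
B <- 16   # e.g. test

# Fold change computed on the linear scale, then log2-transformed
log_fc_from_ratio <- log2(B / A)

# The same quantity computed from already log2-transformed values
log_fc_from_diff <- log2(B) - log2(A)

# Dividing one log value by the other is a different number entirely
ratio_of_logs <- log2(B) / log2(A)

c(from_ratio = log_fc_from_ratio,
  from_diff = log_fc_from_diff,
  ratio_of_logs = ratio_of_logs)
```

The first two entries agree (both are 3), while the ratio of the logs is 4, which is why dividing LSC by HSC does not give a fold change when the columns are already logged.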


Yes, these are log-transformed values, and yes, I get this:

"Note that log(B / A) = log(B) - log(A)". So I just have to subtract the log-transformed values? Is that what you are suggesting?


Yes, subtract one from the other, and label the column logFC so you remember that the data is transformed.

You shouldn't treat things that "aren't upregulated" as being downregulated. I think it would be better to split the values as follows (if you consider it appropriate to define up/down-regulation based on fold change alone, and if you genuinely do want a cut-off of 2^2.5, roughly 5.7-fold assuming log2 values):

cut(logFC, breaks = c(-Inf, -2.5, 2.5, Inf), include.lowest = TRUE, labels = c("DOWN", "NC", "UP"))


Where do I include your code in my code?

cut(logFC, breaks = c(-Inf, -2.5, 2.5, Inf), include.lowest = TRUE, labels = c("DOWN", "NC", "UP"))

Can you tell me?


Within mutate:

library(dplyr)

# Bin log fold changes: <= -2.5 is DOWN, > 2.5 is UP, everything in between is NC
cutter <- function(x) cut(x, breaks = c(-Inf, -2.5, 2.5, Inf),
                          include.lowest = TRUE,
                          labels = c("DOWN", "NC", "UP"))
file1 %>%
  mutate(logFC = LSC - HSC) %>%
  mutate(FOLD_UP_DOWN = cutter(logFC))

Thank you for the code. Since I have only pretty basic experience with R, I don't know how to do these simple yet effective things... but thank you again.

    c(-Inf, -2.5, 2.5, Inf),
    include.lowest = TRUE,
    labels = c("DOWN", "NC", "UP"))

Can you explain this? So if I get -2.5 then it is DOWN, and 2.5 ... but when do I get NC?


Run some experiments:

cutter(c(-5:5, -2.501, -2.5, -2.499, 2.499, 2.5, 2.5001))

Read the help page for cut:

?cut
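
For anyone who wants to see what such an experiment shows, here is a small sketch. The cutter function is repeated from the earlier reply so the snippet runs on its own; the test values and the names x and bins are just my choice of points around the two breaks.

```r
# Same binning function as earlier in the thread, repeated so this runs standalone
cutter <- function(x) cut(x, breaks = c(-Inf, -2.5, 2.5, Inf),
                          include.lowest = TRUE,
                          labels = c("DOWN", "NC", "UP"))

# Values just either side of, and exactly on, the two breaks
x <- c(-2.501, -2.5, -2.499, 0, 2.499, 2.5, 2.5001)
bins <- cutter(x)

data.frame(logFC = x, bin = bins)
# cut() builds right-closed intervals by default, so:
#   logFC <= -2.5        -> DOWN
#   -2.5 < logFC <= 2.5  -> NC
#   logFC > 2.5          -> UP
```

So a value of exactly -2.5 lands in DOWN, a value of exactly 2.5 lands in NC, and NC is everything strictly above -2.5 up to and including 2.5. If you would rather have the boundaries fall the other way, the right argument of cut is the thing to look at in the help page.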

Please post example data here, including fc groups.


I've edited my post; please have a look at it again.


If expression values are logged, then you should simply be subtracting between the groups. Your FC column is incorrect, considering the values for LSC and HSC. Since your comparison is between LSC and HSC (in that order), subtract HSC from LSC (assuming that expression values are in log2). That should give you the log2 fold change.
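
A minimal sketch of that subtraction, assuming the posted values really are log2 expression. The table from the question is typed in by hand as a data frame, and the linearFC column name is my own addition, not something from the thread.

```r
# Values copied from the table in the question (assumed to be log2 expression)
file1 <- data.frame(
  gene = c("ACTR6", "APEX1", "APOBEC3C", "APOBEC3H", "ARID1A", "ARID1B"),
  HSC  = c(0.6032459633, 6.0443930667, 3.5626328667, -2.952164, 3.8456664333, 3.4017756),
  LSC  = c(2.24405896, 5.18402206, 4.61072292, -0.398966118, 2.75774774, 2.33561938)
)

# log2 fold change for the LSC vs HSC comparison: LSC minus HSC
file1$logFC <- file1$LSC - file1$HSC

# Back-transform to a linear-scale fold change, if one is wanted
file1$linearFC <- 2^file1$logFC

file1
```

Unlike the ratio of two logged values, the back-transformed fold change is always positive, and a logFC of 0 corresponds to a linear fold change of 1 (no change).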


The FC cutoff used in your code is 2.5, and 2^2.5 is 5.656854. Is that what you are looking for? Your code, modified, is below:

library(dplyr)
# Read the whitespace-separated expression table
test <- read.csv("test.exp", sep = "", stringsAsFactors = FALSE)
# Values are logged, so the fold change is a difference, not a ratio
test$fc <- test$LSC - test$HSC
HSC_LSC_fold_Change <- test %>% mutate(FOLD_UP_DOWN = ifelse(fc > 2.5, "up", "down"))

output:

> HSC_LSC_fold_Change
      gene       HSC        LSC        fc FOLD_UP_DOWN
1    ACTR6  0.603246  2.2440590  1.640813         down
2    APEX1  6.044393  5.1840221 -0.860371         down
3 APOBEC3C  3.562633  4.6107229  1.048090         down
4 APOBEC3H -2.952164 -0.3989661  2.553198           up
5   ARID1A  3.845666  2.7577477 -1.087919         down
6   ARID1B  3.401776  2.3356194 -1.066156         down

From the above table, let us say any gene with a log2 fold change above 1.08 is upregulated, any gene with a log2 fold change below -1.08 is downregulated, and any gene between -1.08 and 1.08 is NC. Then the code would be:

test %>% mutate(reg = ifelse(fc > 1.08, "up", ifelse(fc < -1.08, "down", "NC")))

output:

      gene       HSC        LSC        fc  reg
1    ACTR6  0.603246  2.2440590  1.640813   up
2    APEX1  6.044393  5.1840221 -0.860371   NC
3 APOBEC3C  3.562633  4.6107229  1.048090   NC
4 APOBEC3H -2.952164 -0.3989661  2.553198   up
5   ARID1A  3.845666  2.7577477 -1.087919 down
6   ARID1B  3.401776  2.3356194 -1.066156   NC

NC = no change, up = upregulated, and down = downregulated. The FC cutoff of 1.08 is arbitrary.
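
Since the cutoff is arbitrary, it can help to make it a parameter. The following is only a sketch in base R: classify_fc is a hypothetical helper name, the fc values are copied from the output above, and the 2-fold cutoff is just an example of expressing the threshold on the linear scale.

```r
# Hypothetical helper: label log2 fold changes using a symmetric cutoff
classify_fc <- function(logfc, cutoff = 1) {
  ifelse(logfc > cutoff, "up",
         ifelse(logfc < -cutoff, "down", "NC"))
}

# log2 fold changes copied from the output above
fc <- c(1.640813, -0.860371, 1.048090, 2.553198, -1.087919, -1.066156)

# The arbitrary cutoff used in this reply
reg_108 <- classify_fc(fc, cutoff = 1.08)

# A cutoff stated as a linear fold change (2-fold), converted to log2
reg_2x <- classify_fc(fc, cutoff = log2(2))

data.frame(fc = fc, reg_108 = reg_108, reg_2x = reg_2x)
```

With the 2-fold cutoff, APOBEC3C moves from NC to up and ARID1B moves from NC to down, which shows how much the labels depend on the threshold. The same function can be called inside mutate, for example mutate(reg = classify_fc(fc, 1.08)).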


Thank you for making it clear.


I understand the complete picture now. To make it clear: I'm taking HSC as my control, and LSC are the disease cells, so those would be my test. So I have to subtract LSC from HSC, isn't it?


You'd usually subtract the control from the non-control value (i.e., LSC - HSC).


Yes, from your description. Did you do statistical test before you chose genes of interest?


These are the differentially expressed genes. I got this data from SeqMonk; it ran a DESeq2 analysis, and I'm using that data.
