Run Methylkit Analysis Sequentially On Several Comparisons And Contexts
1
1
Entering edit mode
11.6 years ago
essell22 ▴ 10

I'm realatively new to R (having only used it interactively in R studio before) and have never written an R script before. I'm using the MethylKit package to analyze many files and would like to write a script that would run through my comparisons sequentially. My current script looks like:

library("methylKit")
getFilenames <- function(FileNames) {
  File.list <- list()
  x <- readline("Filename: ")
  while (x != "") {
    File.list <- append(File.list, x)
    x <- readline("Filename: ")
  }
  File.list
}
getFilenames(FileNames)
Context <- readline("Context: ")
SampleIDs <- readline("List Sample IDs separated by a space: ")
SampleIDs <- lapply(SampleIDs, list, sep = " ")
Treatment <- readline("enter treatment types (0 or 1) separated by commas: ")

myobj <- read(File.list, sample.id=SampleIDs, assembly="mm9", treatment = c(Treatment), context= Context)
filt <- filterByCoverage(myobj, lo.count=20, lo.perc=NULL, hi.count=NULL, hi.perc=99.9)
meth <- unite(filt)
rm(Files)
rm(File.list)
rm(myobj)
rm(filt)
gc(TRUE)
diff <- calculateDiffMeth(meth, num.cores = 8)
diff20.01 <- get.methylDiff(diff, difference =20, qvalue=0.01)
MainFileName <- readline("MainFileName: ")
write.table(data.frame(meth), file=MainFileName+"_"+Context+".methylbase.txt", sep="\t", quote=FALSE)
write.table(data.frame(diff), file=MainFileName+"_"+Context+".methyldiff.txt", sep="\t", quote=FALSE)
write.table(data.frame(diff20.01), file=MainFileName+"_"+Context+"_20.01.methyldiff.txt", sep="\t", quote=FALSE)

This would (hopefully) run through one comparison using a single context. For each comparison I have 3 different contexts (CpG, CHG and CHH) coming from different input files. However, I would like to enter all the information for multiple comparisons and contexts that will then run one after another without my having to enter all the information for the next run after the previous run has finished.

r • 4.1k views
ADD COMMENT
0
Entering edit mode
11.3 years ago

I'm not sure how many files you have to read, but the analysis structure is defined by "myobj". In other words, you only have to create your file list once, and then you set up "myobj" for the relevant contrasts.

You may have already seen the relevant part of the code, but this is all it takes to set up your files:

file.list=list(  "test1.myCpG.txt",  "test2.myCpG.txt", "control1.myCpG.txt",  "control2.myCpG.txt" )

If you have different sets of treatments and controls among the same set of samples, you only have to do this once. If you need to look at different subsets, you should do this once for each subset.

It looks like your code above is problematic for at least a couple reasons:

1) It looks like you only have one value in the treatment array (you must have at least one control and one treatment)

2) I think you have to represent treatments and controls numerically (1=treatment, 0=control)

FYI, I think the example code for running methylKit are pretty good. I remember getting feedback pretty quickly when e-mailing the methylKit contact. That is probably a better source for specific technically questions.

http://code.google.com/p/methylkit/

ADD COMMENT
0
Entering edit mode

Oh, I see that you are trying to defined the treatments above myobj. However, I can't tell how this is supposed to create the desired result (and, if you already have an array defined, you don't need the additional c()).

Maybe more details about the early part of the script will be helpful?

ADD REPLY

Login before adding your answer.

Traffic: 2832 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6