Can anyone suggest preprocessing steps or tools for analysing SNP 6.0 .cel files from Affymetrix for genotyping studies? Ideally the output could be integrated into BRB-ArrayTools (CGH) or a similar tool.
Use CRLMM. Accurate and dead simple to operate.
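# Install CRLMM and its SNP 6.0 annotation package from Bioconductor.
# biocLite() has been retired; current Bioconductor releases install via BiocManager.
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("crlmm", "genomewidesnp6Crlmm"))
library(crlmm)
library(genomewidesnp6Crlmm)
# Directory that holds the CEL files
path.n <- 'DIR_FOR_MY_CELS'
cel.normals <- list.celfiles(path.n, full.names = TRUE)
# Character vector of sample names: one per CEL file, in the same order
sample.names <- VECTOR.OF.SAMPLE.NAMES
# Genotype all samples together
crlmm.result <- crlmm(cel.normals, verbose = TRUE, sns = sample.names)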
Examine the signal-to-noise ratio in crlmm.result[["SNR"]]; remove samples with SNR < 5 unless you have a good reason to keep them. This package does both genotyping and copy number estimation if you are doing tumor vs. normal.
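A minimal sketch of the SNR filter described above. The helper name filter_by_snr and the mock values are made up for illustration; with a real result you would pass crlmm.result[["SNR"]] and your own sample names instead. The cutoff of 5 is the one suggested above.

```r
# Hypothetical helper: split sample names by whether their SNR meets the cutoff.
filter_by_snr <- function(snr, sample_names, cutoff = 5) {
  snr <- as.numeric(snr)
  stopifnot(length(snr) == length(sample_names))
  keep <- !is.na(snr) & snr >= cutoff
  list(index = keep,
       keep  = sample_names[keep],
       drop  = sample_names[!keep])
}

# Mock data for illustration only. With CRLMM output this would be:
#   snr <- crlmm.result[["SNR"]]
snr <- c(7.2, 4.1, 6.5, 3.9)
sample_names <- c("s1", "s2", "s3", "s4")

qc <- filter_by_snr(snr, sample_names)
print(qc$keep)  # samples that pass the cutoff
print(qc$drop)  # samples to review or remove
```

The logical vector in qc$index could then be used to subset the list of CEL files before rerunning crlmm() without the low-quality samples.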
Also consider aroma.affymetrix as a worthy alternative, though that package is a bit more work to operate because of its file structure requirements and because it writes many intermediate steps to disk. Also, it has a fundamental difference in that aroma works on one file at a time, while CRLMM uses all of the files at once and can adjust for batch effects that are known a priori.
Thank you; I will give it a try.
@David: I am getting the error message below. "list" contains the sample file names without .cel, one per line. I followed the steps above.
Calling 906600 SNPs... Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent
I don't think you are passing a vector of characters into the sns parameter that has the same number of elements as the length of cel.normals (presumably 176). Calling read.table and then as.vector won't actually work to make a vector if you're reading a text file with one line per CEL file; you'll want as.vector(cf$V1) or whatever the column name is for your table elements.
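To illustrate the point, here is a small sketch that uses a temporary file in place of the real list. The file contents and the stand-in CEL file names are made up. read.table returns a data frame, so the column has to be pulled out to get a character vector, and its length should match the number of CEL files.

```r
# Stand-in for the real text file: one sample name per line (made-up names).
list_file <- tempfile(fileext = ".txt")
writeLines(c("sampleA", "sampleB", "sampleC"), list_file)

# Stand-in for list.celfiles(path.n, full.names = TRUE)
cel.normals <- c("sampleA.CEL", "sampleB.CEL", "sampleC.CEL")

cf <- read.table(list_file, header = FALSE, stringsAsFactors = FALSE)

# as.vector() on the whole table still gives a one-column list,
# not a character vector with one element per sample
wrong <- as.vector(cf)

# Extracting the column gives the character vector that sns expects
sample.names <- as.character(cf$V1)

# Check before calling crlmm(): exactly one name per CEL file
stopifnot(is.character(sample.names),
          length(sample.names) == length(cel.normals))
```

readLines(list_file) would give the same character vector directly, without going through a data frame.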
Oh, I forgot to add $V1; thanks for the correction. Yes, I do have 176 samples.
I am new to R, so here is another very simple question: how do I extract the data from crlmm.result?
Try the CRLMM manual. Type ?crlmm or ?calls after loading CRLMM to get started.
Thanks, I looked into the manual. Maybe I need to run with the "save.it" option included. This is what crlmm.result looks like:
SnpSet (storageMode: lockedEnvironment)
assayData: 906600 features, 176 samples
  element names: call, callProbability
protocolData: none
phenoData
  sampleNames: ATF_198-1226 BXF_281-1236 ... UZF_1281KX-8421 (176 total)
  varLabels: SNR gender batchQC
  varMetadata: labelDescription
featureData
  featureNames: SNP_A-2131660 SNP_A-1967418 ... SNP_A-8574011 (906600 total)
  fvarLabels: SNPQC spAA spAB spBB
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: genomewidesnp6Crlmm
@david: Can you help me write crlmm.result out as a table? I need to export the data in a format similar to a ".cnt" file. I tried write.table with no success:
write.table(data.frame(crlmm.result), "Output.tsv", row.names=T, col.names=T, sep="t")
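Two things may be going wrong in that call. As written, sep="t" uses the letter t as the separator; a tab is "\t". Also, data.frame() on the whole SnpSet is probably not what you want; the ?calls help page mentioned earlier covers the calls() accessor, which returns the genotype matrix. Below is a hedged sketch that uses a small mock matrix in place of calls(crlmm.result). The helper name write_calls_tsv and the mock values are made up, the 1/2/3 coding is an assumption to check against the CRLMM documentation, and the sketch does not try to reproduce the .cnt layout.

```r
# Hypothetical helper: write a SNP-by-sample matrix as a tab-delimited file.
write_calls_tsv <- function(call_matrix, path) {
  stopifnot(is.matrix(call_matrix))
  # col.names = NA leaves an empty header cell above the row-name column
  write.table(call_matrix, file = path, sep = "\t",
              quote = FALSE, row.names = TRUE, col.names = NA)
  invisible(path)
}

# Mock genotype calls (assumed coding: 1 = AA, 2 = AB, 3 = BB), illustration only.
# With a real CRLMM result this would be: call_matrix <- calls(crlmm.result)
call_matrix <- matrix(c(1L, 2L, 3L, 2L, 2L, 1L), nrow = 3,
                      dimnames = list(c("SNP_1", "SNP_2", "SNP_3"),
                                      c("sampleA", "sampleB")))

out_file <- write_calls_tsv(call_matrix, tempfile(fileext = ".tsv"))

# Read the file back to confirm the layout: SNPs in rows, samples in columns
readback <- read.delim(out_file, row.names = 1, check.names = FALSE)
print(readback)
```

For the full 906600 x 176 matrix the same call should apply, though the output file will be large.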