Question

Reading CGH Array Data Into R

13.0 years ago

Assa Yeroslaviz ★ 1.9k

Hi,

We have a set of text files from the Agilent Feature Extraction software for a CGH array experiment. In the experiment we used Agilent 4x44 CGH arrays with a two-color hybridization protocol. The files are tab-delimited. The top two parts contain input parameters and statistical calculations; the parts are separated by '*'.

The third part is the one that interests me, as it contains the results of my analysis. The problem is that I am not sure which column I need for downstream analysis. We would like to identify regions of chromosomal loss and gain in a comparison between control and treated probes.

I have appended the top three rows of the results part at the bottom of this post, as they are long and would be distracting here.

I would like to work on this in R. Is there a package to read this kind of file into R? I would also like to know about different methods for analyzing such data (normalization, differential expression, annotation, etc.).

I would be happy for any suggestions or ideas on how to analyze such data.

Thanks, A.

TYPE    integer    integer    integer    text    text    text    text    text    text    text    text    text    integer    text    integer    text    integer    integer    text    text    text    text    float    float    float    float    float    float    float    boolean    boolean    float    float    float    float    integer    integer    integer    integer    integer    integer    float    float    float    float    float    float    float    float    integer    integer    float    float    float    float    float    float    float    float    integer    integer    boolean    boolean    float    float    boolean    boolean    boolean    boolean    boolean    boolean    boolean    boolean    boolean    float    float    float    float    float    boolean    boolean    float    float    integer    integer    boolean    boolean    float    float    float    float    boolean    float    float    float    float    float    integer    float    boolean    boolean    float    float    float    float    float    float    float    float    float    float    float    float    boolean    float    float    boolean    boolean    boolean    boolean
FEATURES    FeatureNum    Row    Col    accessions    probe_mappings    tm    NumPMHits    IsHomFiltered    GCPercent    PerformanceScore    CpGCnt100    CpGCnt200    SubTypeMask    SubTypeName    Start    Sequence    ProbeUID    ControlType    ProbeName    GeneName    SystematicName    Description    PositionX    PositionY    LogRatio    LogRatioError    PValueLogRatio    gSurrogateUsed    rSurrogateUsed    gIsFound    rIsFound    gProcessedSignal    rProcessedSignal    gProcessedSigError    rProcessedSigError    gNumPixOLHi    rNumPixOLHi    gNumPixOLLo    rNumPixOLLo    gNumPix    rNumPix    gMeanSignal    rMeanSignal    gMedianSignal    rMedianSignal    gPixSDev    rPixSDev    gPixNormIQR    rPixNormIQR    gBGNumPix    rBGNumPix    gBGMeanSignal    rBGMeanSignal    gBGMedianSignal    rBGMedianSignal    gBGPixSDev    rBGPixSDev    gBGPixNormIQR    rBGPixNormIQR    gNumSatPix    rNumSatPix    gIsSaturated    rIsSaturated    PixCorrelation    BGPixCorrelation    gIsFeatNonUnifOL    rIsFeatNonUnifOL    gIsBGNonUnifOL    rIsBGNonUnifOL    gIsFeatPopnOL    rIsFeatPopnOL    gIsBGPopnOL    rIsBGPopnOL    IsManualFlag    gBGSubSignal    rBGSubSignal    gBGSubSigError    rBGSubSigError    BGSubSigCorrelation    gIsPosAndSignif    rIsPosAndSignif    gPValFeatEqBG    rPValFeatEqBG    gNumBGUsed    rNumBGUsed    gIsWellAboveBG    rIsWellAboveBG    gBGUsed    rBGUsed    gBGSDUsed    rBGSDUsed    IsNormalization    gDyeNormSignal    rDyeNormSignal    gDyeNormError    rDyeNormError    DyeNormCorrelation    ErrorModel    xDev    gSpatialDetrendIsInFilteredSet    rSpatialDetrendIsInFilteredSet    gSpatialDetrendSurfaceValue    rSpatialDetrendSurfaceValue    SpotExtentX    SpotExtentY    gNetSignal    rNetSignal    gMultDetrendSignal    rMultDetrendSignal    gProcessedBackground    rProcessedBackground    gProcessedBkngError    rProcessedBkngError    IsUsedBGAdjust    gInterpolatedNegCtrlSub    rInterpolatedNegCtrlSub    gIsInNegCtrlRange    rIsInNegCtrlRange    gIsUsedInMD    rIsUsedInMD
 DATA    1    1    1    null                                            0        0        0    1    HsCGHBrightCorner    HsCGHBrightCorner    HsCGHBrightCorner    null    265.978    241.933    3.13E-02    6.17E-02    6.12E-01    0    0    1    1    3.80E+03    4.08E+03    3.81E+02    4.10E+02    0    0    0    3    66    66    5.03E+02    5.84E+02    505    586.5    6.69E+01    6.82E+01    6.38E+01    6.34E+01    701    701    2.28E+01    3.74E+01    22    36    7.06E+00    1.13E+01    7.41E+00    1.11E+01    0    0    0    0    0.276065    -0.00661349    0    0    0    0    0    0    1    0    0    469.282    539.161    47.109    54.1613    0    1    1    0    0    1    1    1    1    33.3845    44.4295    7.06454    11.3114    0    3797.71    4081.29    381.234    409.985    0    1    5.07E-01    0    0    33.3845    44.4295    51.7088    51.7088    495.636    576.926    0.720307    0.606607    22.8417    37.3609    7.06454    11.3114    0    470.703    533.049    0    0    0    0
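
For readers who only want the third block as a data frame, here is a minimal base-R sketch. It assumes the file is tab-delimited and that the feature table consists of a `FEATURES` header row followed by `DATA` rows, as in the excerpt above. The function name `read_fe_features` and the toy file contents are made up for illustration. In the sample row, `LogRatio` (3.13E-02) is approximately log10(`rProcessedSignal` / `gProcessedSignal`), so it appears to be a base-10 red/green log ratio; the sketch also derives a log2 ratio from the processed signals.

```r
# Minimal sketch (hypothetical helper, base R only): extract the feature table
# from an Agilent Feature Extraction text file.
# Assumes: tab-delimited file; one FEATURES header row followed by DATA rows.
read_fe_features <- function(path) {
  lines <- readLines(path)
  # First tab-separated field of every line, trimmed of stray whitespace
  first_field <- trimws(sub("\t.*$", "", lines))
  header_idx <- which(first_field == "FEATURES")[1]
  if (is.na(header_idx)) stop("no FEATURES row found")
  # DATA rows also occur in the earlier blocks, so keep only those after the header
  data_idx <- which(first_field == "DATA")
  data_idx <- data_idx[data_idx > header_idx]
  con <- textConnection(lines[c(header_idx, data_idx)])
  on.exit(close(con))
  tab <- read.delim(con, header = TRUE, sep = "\t", quote = "",
                    stringsAsFactors = FALSE, check.names = FALSE)
  tab[, -1, drop = FALSE]  # drop the FEATURES/DATA marker column
}

# Toy file with the same three-block layout (values invented for the example)
toy <- tempfile(fileext = ".txt")
writeLines(c(
  "TYPE\ttext\tinteger",
  "FEPARAMS\tProtocol_Name\tScan_NumChannels",
  "DATA\tCGH_toy\t2",
  "*",
  "TYPE\tfloat",
  "STATS\tgDarkOffsetAverage",
  "DATA\t20.5",
  "*",
  "TYPE\tinteger\ttext\tfloat\tfloat\tfloat",
  "FEATURES\tFeatureNum\tProbeName\tLogRatio\tgProcessedSignal\trProcessedSignal",
  "DATA\t1\tHsCGHBrightCorner\t3.13E-02\t3797.71\t4081.29",
  "DATA\t2\tprobe_B\t-2.50E-01\t1200\t675"
), toy)

features <- read_fe_features(toy)
# log2 red/green ratio computed from the processed signals
features$log2ratio <- log2(features$rProcessedSignal / features$gProcessedSignal)
```

On a real file the same call would return all the columns listed in the `FEATURES` row; control features (for example the bright-corner spots) would still need to be filtered out before any analysis.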

r agilent • 4.0k views

ADD COMMENT • link updated 13.0 years ago by Leonor Palmeira 3.9k • written 13.0 years ago by Assa Yeroslaviz ★ 1.9k

score 3 · Answer 1 · 2012-07-30

12.8 years ago

Leonor Palmeira 3.9k

You can use the limma package in R. It contains all the functions you need to read your data (read.maimages, ...), transform it (normalizeWithinArrays, normalizeBetweenArrays, ...), analyse it (plotDensities, lmFit, eBayes, topTable...). It's a very complete and useful package. I've used it recently on two-color Agilent data and it worked like a charm for my complete analysis.
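
As a rough illustration of how the functions named above fit together, here is a sketch of a two-colour workflow. It is not taken from the answer: the wrapper name `run_limma_two_colour`, the choice of `method = "loess"`, and the `design` argument are assumptions, and the right normalization and design depend on the experiment. It requires the Bioconductor package limma and real Feature Extraction files to run.

```r
# Sketch only (hypothetical wrapper): two-colour Agilent data through limma.
# limma functions are called with the limma:: prefix, so the package must be
# installed before the function is called.
run_limma_two_colour <- function(files, design = NULL, path = NULL) {
  # Read red/green foreground and background intensities from the FE files
  RG <- limma::read.maimages(files, source = "agilent", path = path)
  # Within-array normalization; "loess" is set explicitly because the default
  # print-tip method needs a printer layout that these files do not provide
  MA <- limma::normalizeWithinArrays(RG, method = "loess")
  # Probe-wise linear model and moderated statistics on the log-ratios
  fit <- limma::lmFit(MA, design = design)
  fit <- limma::eBayes(fit)
  list(MA = MA, top = limma::topTable(fit, coef = 1, number = 20))
}
```

A call might look like `res <- run_limma_two_colour(c("array1.txt", "array2.txt"))`, where the file names are placeholders. Note that this gives probe-level statistics only; calling regions of gain and loss would need a separate segmentation step that is not shown here.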

ADD COMMENT • link 12.8 years ago by Leonor Palmeira 3.9k

My data set has a common reference. Does it make sense to do a within-array normalization? I have read that it is not such a good idea to run it.

Is it possible to run the single-channel protocol on only the green channel with these arrays?

ADD REPLY • link 12.8 years ago by Assa Yeroslaviz ★ 1.9k

Agilent is, per se, a two-color platform, but it can also be used as a single-color platform; you can find a lot of help on this in the limma manual. The same goes for normalization: it depends on your experimental design, and many details and pointers to other papers can be found in the limma manuals, tutorials, and papers.
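
To make the single-channel option concrete, here is a hedged sketch of reading only the green channel with limma. The wrapper name `read_green_channel` and the choice of `normexp` background correction and quantile normalization are assumptions, not a recommendation for this data set; whether discarding the reference channel is sensible depends on the design.

```r
# Sketch only (hypothetical wrapper): treat two-colour Agilent files as
# single-channel data by reading the green channel alone.
# Requires the Bioconductor package limma at call time.
read_green_channel <- function(files, path = NULL) {
  # green.only = TRUE reads just the green foreground/background columns
  E <- limma::read.maimages(files, source = "agilent", path = path,
                            green.only = TRUE)
  E <- limma::backgroundCorrect(E, method = "normexp")
  # Between-array normalization; returns log2 intensities
  limma::normalizeBetweenArrays(E, method = "quantile")
}
```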

ADD REPLY • link 12.8 years ago by Leonor Palmeira 3.9k

score 0 · Answer 2 · 2012-05-03

13.0 years ago

Neilfws 49k

There are multiple packages for array CGH data in R/Bioconductor. It sounds as though Agi4x44PreProcess is what you need to get started with reading AFE files and pre-processing.

ADD COMMENT • link 13.0 years ago by Neilfws 49k

Agi4x44PreProcess is only for single-channel arrays (as far as I understand it). I have Agilent two-color arrays here.

Is there a way to do it?

ADD REPLY • link 12.8 years ago by Assa Yeroslaviz ★ 1.9k

I'm not very familiar with the platform, but the manual refers to both red and green channels.

ADD REPLY • link 12.8 years ago by Neilfws 49k