Question

visualize tiling array file to IGB

10.5 years ago

catherine ▴ 250

Hi everyone,

I want to compare my ChIP-seq data with GEO tiling array data in IGB. This is my first time dealing with tiling arrays, so sorry if I'm asking simple questions.

I downloaded a SOFT-formatted file, a GSE file that contains a GPL file (I know this is its annotation file) and 10 GSM files (I only need 7 of them). As I understand it, I need to load (and combine?) the GPL file and the GSM files, then convert (?) them into an IGB-readable file (such as a bar file) in order to load them into IGB.

Could anyone tell me how to do it?

P.S. I loaded this SOFT file using GEOquery in R, but I don't know what to do next....

Thank you very much

microarray genome igb • 2.6k views

Ram · Accepted Answer · 2014-11-06

Hi,

Here's a first stab at answering your question:

To start, you'd need to find a way to map the tiling array probes onto the same version of the genome you aligned your ChIP-Seq data to.

From there, you'd then need to convert the probe intensity values to a "wig" or "bedgraph" file.
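
Before that conversion, the intensity values have to come out of the GEO file. Here is a minimal Python sketch of one way to do that, assuming the SOFT family file follows the usual layout: each sample starts with a "^SAMPLE = ..." line and its data table sits between "!sample_table_begin" and "!sample_table_end", with ID_REF and VALUE columns. The function name read_sample_values and the GSM accessions below are made up for illustration, and this is not what GEOquery does internally.

```python
def read_sample_values(lines, sample_id):
    """Return {probe_id: intensity} for one sample of a SOFT family file.

    lines: any iterable of text lines (an open file works).
    sample_id: the GSM accession to extract (hypothetical in the demo below).
    """
    values = {}
    in_sample = False   # are we inside the requested ^SAMPLE entity?
    in_table = False    # are we inside its data table?
    id_col = val_col = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("^"):
            # Every entity (platform, sample, series) starts with a ^ line.
            name = line.split("=", 1)[-1].strip()
            in_sample = line.startswith("^SAMPLE") and name == sample_id
            in_table = False
            continue
        if not in_sample:
            continue
        if line.startswith("!sample_table_begin"):
            in_table = True
            id_col = val_col = None
            continue
        if line.startswith("!sample_table_end"):
            in_table = False
            continue
        if not in_table:
            continue
        fields = line.split("\t")
        if id_col is None:
            # First table row is the header.
            id_col = fields.index("ID_REF")
            val_col = fields.index("VALUE")
            continue
        try:
            values[fields[id_col]] = float(fields[val_col])
        except (ValueError, IndexError):
            continue  # skip rows with missing or non-numeric values
    return values


# Made-up miniature SOFT content, for illustration only.
soft_lines = [
    "^SAMPLE = GSM0000001",
    "!Sample_title = example A",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "p1\t1.5",
    "p2\tnull",
    "!sample_table_end",
    "^SAMPLE = GSM0000002",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "p1\t9.0",
    "!sample_table_end",
]
sample_values = read_sample_values(soft_lines, "GSM0000001")
print(sample_values)
```

The probe IDs are the join key to the GPL platform table, which is where the probe coordinates (or sequences to re-map) would come from. Which GPL columns hold them varies by platform, so that part is left out here.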

I would pick "bedgraph," because you can sort and index it using sort, bgzip, and tabix - IGB can load and display tabix'd files very efficiently. Also, you'll be able to see the full extent of the probe. (Bar ... I think? ... assumes probes are one base in length.)

The format for your bedgraph file could look something like this:

tab-delimited:

column 1: chromosome name

column 2: start position of probe (interbase coordinates)

column 3: end position of probe (probe length + start position)

column 4: probe intensity value
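
To make the four columns concrete, here is a rough Python sketch that writes them out. It assumes you already have each probe's chromosome, interbase (0-based) start, and length on the right genome version, plus one intensity per probe. The function name write_bedgraph, the probe IDs, and all the numbers are made up for illustration.

```python
import io


def write_bedgraph(probes, intensities, out):
    """Write one bedgraph line per probe that has an intensity value.

    probes: {probe_id: (chrom, start, length)}, start in interbase coordinates
    intensities: {probe_id: float}
    out: a writable text file object
    Returns the number of lines written.
    """
    rows = []
    for probe_id, (chrom, start, length) in probes.items():
        if probe_id not in intensities:
            continue  # probe on the platform but absent from this sample
        # Column 3 is probe length + start position.
        rows.append((chrom, start, start + length, intensities[probe_id]))
    # Group by chromosome, then sort by start position.
    rows.sort(key=lambda r: (r[0], r[1]))
    for chrom, start, end, value in rows:
        out.write(f"{chrom}\t{start}\t{end}\t{value}\n")
    return len(rows)


# Made-up probes and intensities, for illustration only.
probes = {
    "p1": ("chr2", 500, 25),
    "p2": ("chr1", 1000, 25),
    "p3": ("chr1", 100, 25),
    "p4": ("chr1", 2000, 25),  # no intensity below, so it is skipped
}
intensities = {"p1": 1.5, "p2": 0.25, "p3": 2.0}

buf = io.StringIO()
n_written = write_bedgraph(probes, intensities, buf)
bedgraph_text = buf.getvalue()
print(bedgraph_text, end="")
```

For a real file you would pass an open file handle instead of the StringIO buffer. The rows come out grouped by chromosome and sorted by start, which is the order tabix indexing needs after bgzip compression.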

Some history: IGB was originally developed at Affymetrix to handle tiling array data because our previous generation genome browser - Neomorphic Annotation Station - couldn't handle such large data sets. For us, tiling array data was the original "big data" in molecular biology. We developed the "bar" format to enable partial data loading into IGB. If I recall correctly, bar stands for "binary array" - it's a binary format that stores data in one file per chromosome, with some meta-data at the top of the file.

I wrote some Python code many years ago that could make "bar" files. I will try to find it as it might be useful. But these days, I would try to work with bedgraph files because they are more standard.

More to come...

-Ann