Question

binding close to gene

10.6 years ago

Sheila ▴ 280

Given the ChIP-seq dataset GSE46992, how can I find out whether the transcription factor considered in the study has binding sites close to a specific gene of interest (COX-1 in this case)?

Please suggest an easy-to-use tool or web service.

ChIP-Seq • 3.2k views

Ram · Answer 1 · 2014-05-07

10.6 years ago

Ming Tommy Tang ★ 4.5k

I think the easiest way is to upload the peak file (BED file) to the UCSC Genome Browser and manually check whether there are any peaks near COX-1. Other tools such as PAVIS, annotateGenomicRanges, HOMER, bedtools, BEDOPS, and the R packages ChIPpeakAnno and ChIPseeker can annotate the whole peak file.
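If you would rather script the manual check than eyeball it in a browser, something like the following rough Python sketch could work. The gene coordinates, the window size, and the toy peaks are all made up for illustration; you would substitute the real coordinates of your gene (from the same genome build as the peaks) and read your own BED file instead.

```python
import io

# Hypothetical values for illustration only: substitute the real coordinates
# of your gene (same genome build as the peaks) and your own peak file.
GENE_CHROM, GENE_START, GENE_END = "chr1", 10_000, 12_000
WINDOW = 5_000  # how far from the gene still counts as "near" (an assumption)

def peaks_near_gene(bed_lines, chrom, start, end, window):
    """Return (chrom, start, end) for peaks within `window` bp of the gene."""
    hits = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        fields = line.split()
        p_chrom, p_start, p_end = fields[0], int(fields[1]), int(fields[2])
        # BED intervals are 0-based, half-open; test overlap with the padded gene.
        if p_chrom == chrom and p_start < end + window and p_end > start - window:
            hits.append((p_chrom, p_start, p_end))
    return hits

# Toy peaks standing in for the real BED file.
toy_bed = io.StringIO(
    "track name=toy\n"
    "chr1\t2000\t2500\n"
    "chr1\t6000\t6400\n"
    "chr1\t16500\t17000\n"
    "chr2\t10500\t11000\n"
)
near = peaks_near_gene(toy_bed, GENE_CHROM, GENE_START, GENE_END, WINDOW)
print(near)
```

With a real file you would pass `open("peaks.bed")` (a hypothetical file name) in place of `toy_bed`.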

Ram · Answer 2 · 2014-05-05

10.6 years ago

Devon Ryan 104k

It'd be simplest to just use the ChIPpeakAnno package in Bioconductor. Alternatively, just import everything as GRanges objects and use the nearest() function with a bit of scripting.
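To make the "nearest" idea concrete without R, here is a minimal Python sketch of the same logic. It is not the GenomicRanges implementation, only an illustration of picking the peak with the smallest gap to a gene on the same chromosome. The function names and all coordinates are hypothetical.

```python
def gap(a_start, a_end, b_start, b_end):
    """Distance in bp between two half-open intervals; 0 if they overlap or touch."""
    return max(0, b_start - a_end, a_start - b_end)

def nearest_peak(gene, peaks):
    """Return (peak, distance) for the peak closest to `gene` on its chromosome.

    `gene` and each peak are (chrom, start, end) tuples. Returns None when
    no peak lies on the gene's chromosome.
    """
    chrom, g_start, g_end = gene
    same_chrom = [p for p in peaks if p[0] == chrom]
    if not same_chrom:
        return None
    best = min(same_chrom, key=lambda p: gap(g_start, g_end, p[1], p[2]))
    return best, gap(g_start, g_end, best[1], best[2])

# Hypothetical coordinates, for illustration only.
gene = ("chr1", 10_000, 12_000)
peaks = [("chr1", 2_000, 2_500), ("chr1", 12_800, 13_100), ("chr2", 10_500, 11_000)]
result = nearest_peak(gene, peaks)
print(result)
```

Whether a given distance counts as "close" is still a judgment call that depends on the biology.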

Thanks. If possible please post R code.

10.6 years ago by Sheila ▴ 280

I'll just refer to their vignette, which has an example that seems to closely match what you want to do. Go ahead and make a new post if you run into any problems with that.

10.6 years ago by Devon Ryan 104k

See this bug: http://ygc.name/2014/01/14/bug-of-r-package-chippeakanno/

9.6 years ago by Guangchuang Yu ★ 2.6k

Interesting, thanks for pointing that out. I agree that it should take the strand of a peak (when known) into account and calculate distances accordingly. Has there been any movement on fixing this?

9.6 years ago by Devon Ryan 104k

This was also my motivation for developing ChIPseeker: https://github.com/GuangchuangYu/ChIPseeker

9.6 years ago by Guangchuang Yu ★ 2.6k

It's unfortunate that you had to develop a package to get around this, but thanks for doing so!

9.6 years ago by Devon Ryan 104k

ChIPseeker contains more features than ChIPpeakAnno. It's definitely better.

9.6 years ago by Guangchuang Yu ★ 2.6k

The author doesn't believe this is an issue, as she replied on my blog post.

They haven't fixed this issue; you can refer to the supplemental file of http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btv145.

updated 2.4 years ago by Ram 44k • written 9.6 years ago by Guangchuang Yu ★ 2.6k

Ram · Answer 3 · 2014-05-05

BEDOPS offers a tool for this called closest-features, which finds the nearest query element(s) to each of a set of reference elements. (In your use case, TF binding sites would be query elements, and your genes (say, TSSs) are your reference elements.)

It's very simple to use, and very fast, with a low memory profile. R and its libraries often load everything into system memory, which can be a problem if you're working with large datasets.

To get your TFs ready, you can use the bedops set operation tool to filter your transcription factor set for TF binding sites that overlap ChIP-seq peaks or other regions. Take a look at the --element-of operation.

Then you might use closest-features to look for the nearest ChIP-seq-peak-overlapping-TF to each member of your set of, for example, gene transcription start sites (COX-1, etc.).
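To illustrate the lookup that this last step performs, here is a minimal Python sketch of a nearest-element search over sorted positions. It is not how BEDOPS is implemented and it does not replace the tool. It reduces each peak to its midpoint (a simplification made for this sketch), and the gene names and coordinates are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

def closest_peaks(tss_list, peaks):
    """For each (name, chrom, pos) TSS, return (nearest peak midpoint, distance).

    Peaks are (chrom, start, end) tuples; each is reduced to its midpoint,
    which is a simplification made for this sketch.
    """
    mids = defaultdict(list)
    for chrom, start, end in peaks:
        mids[chrom].append((start + end) // 2)
    for chrom in mids:
        mids[chrom].sort()

    out = {}
    for name, chrom, pos in tss_list:
        points = mids.get(chrom, [])
        i = bisect_left(points, pos)
        # Candidates: the midpoint just left of pos and the one at or right of it.
        candidates = points[max(i - 1, 0):i + 1]
        if candidates:
            best = min(candidates, key=lambda m: abs(m - pos))
            out[name] = (best, abs(best - pos))
        else:
            out[name] = None  # no peaks on this chromosome
    return out

# Hypothetical TSS positions and peaks, for illustration only.
tss_list = [("geneA", "chr1", 10_000), ("geneB", "chr2", 500), ("geneC", "chr3", 42)]
peaks = [("chr1", 2_000, 2_500), ("chr1", 12_800, 13_100), ("chr2", 400, 800)]
closest = closest_peaks(tss_list, peaks)
print(closest)
```

Sorting once and using binary search keeps each lookup cheap, which matters when there are many reference elements.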