binding close to gene
10.7 years ago
Sheila ▴ 280

Given the ChIP-seq dataset GSE46992, how can I find out whether the transcription factor considered in the study has binding sites close to a specific gene of interest (COX-1 in this case)?

Please suggest an easy-to-use tool or web service.

ChIP-Seq • 3.2k views
10.7 years ago
Ming Tommy Tang ★ 4.5k

I think the easiest way is to upload the peak file (BED file) to the UCSC Genome Browser and manually check whether there are any peaks near COX-1. Other tools such as PAVIS, annotateGenomicRanges, HOMER, bedtools, and BEDOPS, as well as the R packages ChIPpeakAnno and ChIPseeker, can annotate the whole peak file.
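As a rough scripted equivalent of the manual check, the Python sketch below reads peaks in BED format and keeps those within a fixed window of a gene's transcription start site. The gene record, the 10 kb window, and the example peaks are hypothetical placeholders, not values from GSE46992; substitute the real COX-1 annotation for your genome build.

```python
import io

def parse_bed(handle):
    """Yield (chrom, start, end) from BED lines, skipping header and blank lines."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)

def tss_of(gene):
    """TSS is the first base on '+' genes and the last base on '-' genes (0-based, half-open)."""
    return gene["start"] if gene["strand"] == "+" else gene["end"] - 1

def peaks_near_gene(peaks, gene, window=10_000):
    """Return (peak, distance) pairs for peaks within `window` bp of the gene's TSS."""
    tss = tss_of(gene)
    hits = []
    for chrom, start, end in peaks:
        if chrom != gene["chrom"]:
            continue
        # Distance from the TSS to the nearest base of the peak; 0 if the peak covers the TSS.
        if start <= tss < end:
            dist = 0
        elif end <= tss:
            dist = tss - (end - 1)
        else:
            dist = start - tss
        if dist <= window:
            hits.append(((chrom, start, end), dist))
    return hits

# Hypothetical gene record and peaks, for illustration only.
gene = {"name": "GENE_OF_INTEREST", "chrom": "chr1",
        "start": 100_000, "end": 120_000, "strand": "+"}
bed_text = (
    "chr1\t95000\t95400\n"
    "chr1\t99900\t100100\n"
    "chr1\t500000\t500300\n"
    "chr2\t100000\t100200\n"
)
hits = peaks_near_gene(parse_bed(io.StringIO(bed_text)), gene)
for peak, dist in hits:
    print(peak, dist)
```

With a real peak file, `parse_bed(open("peaks.bed"))` would replace the in-memory example; the file name here is also just an assumption.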

10.7 years ago

It'd be simplest to just use the ChIPpeakAnno package in Bioconductor. Alternatively, just import everything as GRanges objects and use the nearest() function with a bit of scripting.
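The "nearest feature" idea can also be sketched outside of Bioconductor. The Python below is not the GenomicRanges or ChIPpeakAnno API; it is a simplified illustration of that kind of lookup, assuming peak midpoints are compared against a sorted list of TSS positions. The gene names and coordinates are made up.

```python
from bisect import bisect_left
from collections import defaultdict

def build_tss_index(genes):
    """Map each chromosome to a position-sorted list of (tss, gene_name)."""
    index = defaultdict(list)
    for name, chrom, start, end, strand in genes:
        # TSS is the first base on '+' genes and the last base on '-' genes.
        tss = start if strand == "+" else end - 1
        index[chrom].append((tss, name))
    for entries in index.values():
        entries.sort()
    return index

def nearest_gene(index, chrom, start, end):
    """Return (gene_name, distance) for the TSS closest to the peak midpoint, or None."""
    entries = index.get(chrom)
    if not entries:
        return None
    mid = (start + end) // 2
    i = bisect_left(entries, (mid,))
    # Only the TSSs immediately left and right of the midpoint can be nearest.
    candidates = entries[max(i - 1, 0):i + 1]
    tss, name = min(candidates, key=lambda e: abs(e[0] - mid))
    return name, abs(tss - mid)

# Hypothetical annotation and peaks, for illustration only.
genes = [
    ("GENE_A", "chr1", 1000, 5000, "+"),
    ("GENE_B", "chr1", 20000, 30000, "-"),
    ("GENE_C", "chr2", 500, 900, "+"),
]
peaks = [("chr1", 1200, 1400), ("chr1", 28000, 28600), ("chr3", 10, 50)]

index = build_tss_index(genes)
annotations = [nearest_gene(index, *peak) for peak in peaks]
for peak, annotation in zip(peaks, annotations):
    print(peak, annotation)
```

To answer the original question, one would then keep only the peaks whose nearest gene is the gene of interest and whose distance is below some chosen cutoff.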


Thanks. If possible please post R code.


I'll just refer to their vignette, which has an example that seems to closely match what you want to do. Go ahead and make a new post if you run into any problems with that.


Interesting, thanks for pointing that out. I agree that it should take the strand of a peak (when known) into account and calculate distances accordingly. Has there been any movement on fixing this?


This was also my motivation for developing ChIPseeker: https://github.com/GuangchuangYu/ChIPseeker


It's unfortunate that you had to develop a package to get around this, but thanks for doing so!


ChIPseeker contains more features than ChIPpeakAnno. It's definitely better.


The author doesn't believe this is an issue, as she replied on my blog post.

The issue has not been fixed; you can refer to the supplemental file of http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btv145.

10.7 years ago

BEDOPS offers a tool for this called closest-features, which finds the nearest query element(s) to each of a set of reference elements. (In your use case, TF binding sites would be query elements, and your genes (say, TSSs) are your reference elements.)

It's very simple to use and very fast, with a low memory profile. R and its libraries tend to load everything into system memory, which can be a problem if you're working with large datasets.

To get your TFs ready, you can use the bedops set operation tool to filter your transcription factor set for TF binding sites that overlap ChIP-seq peaks or other regions. Take a look at the --element-of operation.

Then you might use closest-features to look for the nearest ChIP-seq-peak-overlapping-TF to each member of your set of, for example, gene transcription start sites (COX-1, etc.).
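The two steps above (keep TF sites that overlap peaks, then find the nearest surviving site for each TSS) can be illustrated with a small Python sketch. This is not BEDOPS and does not reproduce its options or output format; it only mirrors the logic on tiny in-memory inputs, with made-up coordinates and a simple quadratic scan that would be too slow for real datasets.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def element_of(elements, regions):
    """Keep the elements that overlap at least one region."""
    return [e for e in elements if any(overlaps(e, r) for r in regions)]

def gap(a, b):
    """Bases separating two same-chromosome intervals; 0 if they overlap or abut."""
    return max(0, a[1] - b[2], b[1] - a[2])

def closest(reference, queries):
    """For each reference interval, report the nearest same-chromosome query and its gap."""
    result = []
    for ref in reference:
        same_chrom = [q for q in queries if q[0] == ref[0]]
        if same_chrom:
            best = min(same_chrom, key=lambda q: gap(ref, q))
            result.append((ref, best, gap(ref, best)))
        else:
            # No query element on this chromosome.
            result.append((ref, None, None))
    return result

# Hypothetical inputs, for illustration only.
tf_sites = [("chr1", 100, 120), ("chr1", 5000, 5020), ("chr1", 9000, 9020)]
peaks = [("chr1", 90, 300), ("chr1", 8900, 9100)]
tss = [("chr1", 1000, 1001), ("chr2", 50, 51)]

bound_sites = element_of(tf_sites, peaks)   # TF sites supported by a ChIP-seq peak
nearest = closest(tss, bound_sites)         # nearest supported site for each TSS
for row in nearest:
    print(row)
```

Here the TSS intervals play the role of reference elements and the peak-supported TF sites are the query elements, matching the description above.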
