Summit extension in CUT&Tag analysis
14 months ago
black_blue • 0

I am running a CUT&Tag analysis to find the binding motif of a protein (a transcription factor). I have the narrowPeak and summit files from macs2, and I now want to search for motifs with MEME-ChIP. My questions are:

  1. Should I use sequences extended from the summits for motif finding?
  2. If my TF is very large and may bind a very long DNA region (the exact length is not known), how long should the extended sequences be (e.g. generated with bedtools slop)?
  3. If my TF is very large, could I use the whole peak regions from the narrowPeak file for motif finding?
macs2 CHIP-seq CutAndTag motif • 783 views

Does such a thing as a "large" TF exist? A protein can be large, but that does not mean its motif is large. For the TFs I have come across, the motif is usually shorter than 12 bp. I would simply extend the summits by something like 50 bp in each direction, 100 bp in total, and go with that. You can check repositories such as JASPAR or HOCOMOCO to see what the longest motif they list is. I have not heard of overly long motifs, but I have not worked on TFs for years now, so others might know better.
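As a rough illustration of the 50 bp extension, here is a minimal Python sketch. It assumes the macs2 summit file is tab-separated BED with the 0-based summit position in column 2 and a peak name in column 4. The function name `summit_windows`, the `chrom_sizes` argument, and the example records are hypothetical, made up for this sketch. Windows are clamped at position 0 and, if chromosome sizes are supplied, at the chromosome end.

```python
def summit_windows(summit_lines, flank=50, chrom_sizes=None):
    """Return (chrom, start, end, name) windows of width 2*flank centred on each summit."""
    windows = []
    for line in summit_lines:
        line = line.strip()
        # Skip blank lines and BED header/track lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, summit = fields[0], int(fields[1])
        name = fields[3] if len(fields) > 3 else "."
        # Clamp at the chromosome start so coordinates never go negative.
        start = max(0, summit - flank)
        end = summit + flank
        # Optionally clamp at the chromosome end (chrom_sizes is a hypothetical dict).
        if chrom_sizes is not None and chrom in chrom_sizes:
            end = min(end, chrom_sizes[chrom])
        windows.append((chrom, start, end, name))
    return windows


# Made-up example records, not real data.
example = [
    "chr1\t1000\t1001\tpeak_1\t35.2",
    "chr1\t20\t21\tpeak_2\t12.7",
]
for chrom, start, end, name in summit_windows(example, flank=50):
    print(chrom, start, end, name, sep="\t")
```

The printed windows can be saved as a BED file and turned into FASTA (for example with bedtools getfasta) before running MEME-ChIP. Note that this gives windows of exactly 2 × flank bp, whereas padding the 1 bp summit interval on both sides (as bedtools slop -b 50 would) gives 101 bp.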


Thank you. As you said, the protein is large but the motif is usually shorter than 12 bp. In fact, previous studies have shown that this TF may work with different proteins in different experiments. The hypothesis in my experiment is that the TF cooperates with multiple TFs, so different motifs might be enriched in the peaks, and I have indeed obtained many long peaks. In addition, multiple binding modes might result in binding sites that are not at the peak center, which also concerns me.


I think you should read the literature on composite motifs, for example IRF8 with AP1. These motifs are still "short", in the range of 10-20 bp. I would really try to keep the ranges small, to avoid scanning regions that are unlikely to contain actual binding motifs. macs2 has an option to call summits (--call-summits); maybe use that to deconvolute large binding sites. Or do ATAC-seq on top to identify regions that are actually accessible, and use those to narrow the binding sites down.
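To make the summit-based deconvolution concrete, here is a hedged Python sketch. It assumes the narrowPeak file follows the ENCODE layout, where column 10 is the 0-based summit offset from the peak start (-1 if no summit was called), and that a macs2 run with --call-summits lists a long region once per sub-summit. The function name `split_peak_by_summits`, the choice to clip each window to the peak boundaries, and the example records are assumptions made for illustration.

```python
def split_peak_by_summits(narrowpeak_lines, flank=50):
    """Turn each narrowPeak row into one short window around its summit."""
    windows = []
    for line in narrowpeak_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # not a complete narrowPeak row
        chrom, start, end, name = fields[0], int(fields[1]), int(fields[2]), fields[3]
        offset = int(fields[9])
        if offset < 0:
            continue  # -1 means no summit was called for this row
        summit = start + offset
        # Clip to the peak boundaries (a design choice of this sketch).
        windows.append((chrom, max(start, summit - flank), min(end, summit + flank), name))
    return windows


# Made-up example: one 900 bp region reported twice, once per sub-summit.
example = [
    "chr2\t5000\t5900\tpeak_7a\t210\t.\t8.1\t24.3\t21.0\t120",
    "chr2\t5000\t5900\tpeak_7b\t180\t.\t7.4\t21.9\t18.0\t700",
]
for window in split_peak_by_summits(example, flank=50):
    print(*window, sep="\t")
```

With these inputs, a single 900 bp peak becomes two 100 bp windows, so motif discovery scans 200 bp rather than the full region.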


Can I take it that, in this situation, it is preferable to lose some true positives rather than increase the number of false positives?
