motif in set of genes
7.1 years ago
rob.costa1234 ▴ 310

I have identified a motif from my ChIP-seq experiment and want to find the locations of binding sites for this motif in my 10 genes of interest. What would be the best way or tool to do this?

Thanks

ChIP-Seq • 2.8k views

Version 4.12.0, hg19 default settings


Hi,

What if I don't have a "positional frequency matrix"?

7.1 years ago

Convert the motif to a MEME representation, unless you have this already. Use this MEME matrix with FIMO to find hits or binding sites across the genome at a specified threshold. Use BEDOPS bedmap to map these binding sites to gene annotations, converted to BED with BEDOPS convert2bed, if necessary.
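To make the last step more concrete, here is a minimal sketch of the mapping idea in plain awk: for each motif hit, report the names of the gene intervals it overlaps. This is only a stand-in for what bedmap does on real data, not a replacement for it. The file names genes.bed and hits.bed, the gene names, and all coordinates are made up for the example.

```shell
# Toy inputs (hypothetical names and coordinates, for illustration only).
workdir=$(mktemp -d)
printf 'chr16\t1000\t5000\tGENE_A\nchr17\t2000\t9000\tGENE_B\n' > "$workdir/genes.bed"
printf 'chr16\t1200\t1229\tmotif2\nchr17\t500\t529\tmotif2\nchr17\t8990\t9019\tmotif2\n' > "$workdir/hits.bed"

# First file: remember every gene interval.
# Second file: print each hit followed by the genes it overlaps ("." if none).
# BED is 0-based and half-open, so two intervals overlap when
# hit_start < gene_end and gene_start < hit_end.
mapped=$(awk -F '\t' -v OFS='\t' '
    NR == FNR {
        gchrom[NR] = $1; gstart[NR] = $2 + 0; gend[NR] = $3 + 0; gname[NR] = $4
        n = NR
        next
    }
    {
        genes = ""
        for (i = 1; i <= n; i++)
            if ($1 == gchrom[i] && $2 + 0 < gend[i] && gstart[i] < $3 + 0)
                genes = genes (genes == "" ? "" : ";") gname[i]
        print $1, $2, $3, $4, (genes == "" ? "." : genes)
    }
' "$workdir/genes.bed" "$workdir/hits.bed")
printf '%s\n' "$mapped"
rm -rf "$workdir"
```

With BEDOPS installed and both files sorted with sort-bed, something along the lines of `bedmap --echo --echo-map-id --delim '\t' hits.bed genes.bed` should give a comparable result far more efficiently; check the bedmap documentation for the exact options you need.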

Motif ID    Alt ID  Sequence Name   Strand  Start   End     p-value     q-value     Matched Sequence
2       16  -   52723   52751   2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       17  -   78101   78129   2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       17  -   100740  100768  2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC

Above are a few lines of FIMO output. It does not give any gene names or exact genomic coordinates. The coordinates shown are relative to the FASTA sequences that were extracted for the 2500 bp TSS regions, so what should be intersected? Intersecting on the matched sequence does not seem like an efficient or correct way to do it: one sequence may be present in multiple locations, so which p-value should I assign to that match? Thanks
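One possible way to get from region-relative positions back to genome coordinates is sketched below. It assumes that each FASTA header (and therefore FIMO's sequence_name column) has the form chrom:start-end, with a 0-based start for the extracted region; that naming scheme, the region chr16:50000-55000, and the hit positions are assumptions for illustration, not something taken from the real run. Each FIMO row is a separate match with its own p-value, so after the conversion every site simply keeps the value from its own row.

```shell
# Hypothetical FIMO row. Columns: motif_id, motif_alt_id (empty), sequence_name,
# start, stop, strand, score, p-value, q-value, matched_sequence.
# start/stop are 1-based positions inside the extracted region.
fimo_rows=$(printf '2\t\tchr16:50000-55000\t2723\t2751\t-\t49.5258\t2.77e-19\t4.86e-14\tCTCTGTCGCCCAGGCTGGAGTGCAGTGGC\n')

genomic=$(printf '%s\n' "$fimo_rows" | awk -F '\t' -v OFS='\t' '
    {
        # region[1] = chromosome, region[2] = region start, region[3] = region end
        split($3, region, /[:-]/)
        bed_start = region[2] + $4 - 1   # 1-based hit start -> 0-based genomic start
        bed_end   = region[2] + $5       # 1-based inclusive stop -> half-open genomic end
        print region[1], bed_start, bed_end, $1, $8, $6
    }')
printf '%s\n' "$genomic"
```

The result is BED-like (chromosome, start, end, motif ID, p-value, strand) and can be intersected with gene annotations directly. Note that the simple split on ":" and "-" would break for sequence names that themselves contain those characters.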


Your FIMO output should look like this, I think:

http://meme-suite.org/doc/examples/fimo_example_output_files/fimo.txt

You can convert that style of output to sorted BED:

$ awk -v OFS="\t" '{ print $3, $4, $5, $1, $8, $6 }' fimo.txt | sort-bed - > fimo.bed

Can you take another look and confirm?
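A slightly more defensive variant of that conversion is sketched here, under a few assumptions: the file is tab-delimited, it has the ten-column layout with a motif_alt_id column, and that column may be empty. Splitting on tabs only keeps an empty motif_alt_id from shifting the other columns, the numeric test on the start column drops header and comment lines, and the start is shifted by one because FIMO reports 1-based positions while BED starts are 0-based. The file name fimo.tsv is made up, and plain sort is used only as a stand-in so the example runs without BEDOPS.

```shell
# Hypothetical fimo.tsv: one header line plus two hits with an empty motif_alt_id.
workdir=$(mktemp -d)
{
    printf 'motif_id\tmotif_alt_id\tsequence_name\tstart\tstop\tstrand\tscore\tp-value\tq-value\tmatched_sequence\n'
    printf '2\t\t17\t78101\t78129\t-\t49.5258\t2.77e-19\t4.86e-14\tCTCTGTCGCCCAGGCTGGAGTGCAGTGGC\n'
    printf '2\t\t16\t44269\t44297\t-\t49.5258\t2.77e-19\t4.86e-14\tCTCTGTCGCCCAGGCTGGAGTGCAGTGGC\n'
} > "$workdir/fimo.tsv"

# Keep only rows whose start column is a plain integer, then emit
# chrom, 0-based start, end, motif ID, p-value, strand.
bed=$(awk -F '\t' -v OFS='\t' '$4 ~ /^[0-9]+$/ { print $3, $4 - 1, $5, $1, $8, $6 }' "$workdir/fimo.tsv" |
      LC_ALL=C sort -k1,1 -k2,2n -k3,3n)
printf '%s\n' "$bed"
rm -rf "$workdir"
```

On real data, piping the awk output to `sort-bed -` as in the command above is the better choice, since the downstream BEDOPS tools expect that sort order.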

 # motif_id motif_alt_id    sequence_name   start   stop    strand  score   p-value q-value matched_sequence
2       KI270742.1  15757   15785   -   49.5258 2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       KI270755.1  21577   21605   -   49.5258 2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       KI270714.1  22078   22106   -   49.5258 2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       KI270719.1  22816   22844   -   49.5258 2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       KI270746.1  34414   34442   -   49.5258 2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC
2       16  44269   44297   -   49.5258 2.77e-19    4.86e-14    CTCTGTCGCCCAGGCTGGAGTGCAGTGGC

This is what I am getting.


This does not look familiar to me. What version of FIMO are you using, what settings are you passing in, what reference genome are you working with, etc.?


It looks like your output is missing fields. I'm not sure what's wrong.


fimo --oc . --verbosity 1 --bgfile db/ucsc_hg19.fna.bfile --thresh 1.0E-4 motifs.meme db/ucsc_hg19.fna

This is the command I used to run FIMO, if that is helpful.

7.1 years ago

You can use GimmeMotifs (disclaimer: I am the author). You will have to use a motif representation (positional frequency matrix) that looks like this:

>motif_name
0    0    0    1
0.2    0    0    0.8

Then you can use gimme scan to scan with this motif:

$ gimme scan sequences.fa -p motif.pwm -g hg19 -b

Here the -b argument specifies BED output. See the full documentation of gimme scan here.
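If you only have a set of aligned binding-site sequences rather than a matrix, a rough way to derive one is to count the bases at each position. The sketch below does this in awk and prints the result in the layout shown above. The four example sites are invented, the column order A, C, G, T is an assumption to verify against the GimmeMotifs documentation, and no pseudocounts are applied.

```shell
# Hypothetical aligned binding sites, one per line, all the same length.
sites=$(printf 'ACGT\nACGA\nTCGT\nACGT\n')

pfm=$(printf '%s\n' "$sites" | awk -v OFS='\t' '
    length($0) == 0 { next }              # ignore blank lines
    {
        seq = toupper($0)
        if (length(seq) > width) width = length(seq)
        for (i = 1; i <= length(seq); i++)
            count[i, substr(seq, i, 1)]++  # tally the base seen at position i
        total++
    }
    END {
        print ">motif_name"
        # One row per motif position; columns assumed to be A, C, G, T.
        for (i = 1; i <= width; i++)
            print count[i, "A"] / total, count[i, "C"] / total,
                  count[i, "G"] / total, count[i, "T"] / total
    }')
printf '%s\n' "$pfm"
```

Each row is one motif position, and the four values in a row sum to 1 as long as the sites contain only A, C, G and T.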
