I'm analyzing some CTCF ChIP-seq data and I'm interested in recording the orientation of CTCF sites, since orientation has been shown to play an important role in the underlying biology. I can't find any information on how to do this, even though it seems to be a fairly common analysis. Perhaps I'm just not using the right search terms. Any ideas?
This will report at most one match per peak, with an estimated FPR of 1% based on random genomic sequences. The strand column in the BED output will tell you the direction of the motif.
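Since the strand column carries the motif direction, colour-coding by orientation can be done as a post-processing step on the BED output. The following is only a minimal sketch: it assumes the scan output is a tab-separated, 6-column BED file with the strand in column 6, and the function name `colour_by_strand` and the colours are illustrative choices, not part of gimme.

```python
import csv
import io

# Arbitrary illustrative colours (R,G,B strings as used by the BED itemRgb field).
STRAND_COLOURS = {"+": "255,0,0", "-": "0,0,255"}
UNKNOWN_COLOUR = "128,128,128"


def colour_by_strand(bed_in, bed_out):
    """Convert BED6 rows to BED9, setting itemRgb from the strand column.

    Assumes tab-separated input with strand in column 6 (an assumption about
    the scan output, not something guaranteed by the tool).
    """
    reader = csv.reader(bed_in, delimiter="\t")
    writer = csv.writer(bed_out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Skip blank lines, comments, and browser/track header lines.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name, score, strand = row[:6]
        rgb = STRAND_COLOURS.get(strand, UNKNOWN_COLOUR)
        # thickStart/thickEnd are set to the feature bounds; the score is
        # passed through unchanged (some browsers expect an integer 0-1000).
        writer.writerow([chrom, start, end, name, score, strand, start, end, rgb])


if __name__ == "__main__":
    # Tiny in-memory demonstration with made-up coordinates.
    demo = io.StringIO("chr1\t100\t119\tCTCF\t12.3\t+\nchr1\t500\t519\tCTCF\t10.1\t-\n")
    out = io.StringIO()
    colour_by_strand(demo, out)
    print(out.getvalue(), end="")
```

For display in the UCSC browser, the file also needs a track line containing itemRgb="On", otherwise the colour column is ignored.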
Even though this is an old answer, I am using it for my purposes. I want to produce a final BED file with CTCF sites colour-coded according to motif orientation on the genome, but I keep having problems with the code.
I am using this command:
gimme scan MK_CTCF_From_Romina_hg38_c10.0_l245_g100_peaks -p CTCF.pwm -g hg38 -b > CTCF_motifs.bed
and this is the CTCF.pwm:
>CTCF_known1 CTCF_1 CTCF_jaspar_MA0139.1
Y 0.095290 0.318729 0.083242 0.502738
D 0.182913 0.158817 0.453450 0.204819
R 0.307777 0.053669 0.491785 0.146769
C 0.061336 0.876232 0.023001 0.039430
C 0.008762 0.989047 0.000000 0.002191
A 0.814896 0.014239 0.071194 0.099671
S 0.043812 0.578313 0.365827 0.012048
Y 0.117325 0.474781 0.052632 0.355263
A 0.933114 0.012061 0.035088 0.019737
G 0.005488 0.000000 0.991218 0.003293
R 0.365532 0.003293 0.621295 0.009879
K 0.059276 0.013172 0.553238 0.374314
G 0.013187 0.000000 0.978022 0.008791
G 0.061538 0.008791 0.851648 0.078022
C 0.114411 0.806381 0.005501 0.073707
R 0.409241 0.014301 0.557756 0.018702
S 0.090308 0.530837 0.338106 0.040749
Y 0.128855 0.354626 0.080396 0.436123
V 0.442731 0.199339 0.292952 0.064978
I get this error message:
Traceback (most recent call last):
File "/Users/luca/anaconda3/envs/gimme/bin/gimme", line 513, in <module>
args.func(args)
File "/Users/luca/anaconda3/envs/gimme/lib/python3.6/site-packages/gimmemotifs/commands/pwmscan.py", line 170, in pwmscan
normalize=args.zscore,
File "/Users/luca/anaconda3/envs/gimme/lib/python3.6/site-packages/gimmemotifs/commands/pwmscan.py", line 113, in command_scan
fa = as_fasta(inputfile, genome)
File "/Users/luca/anaconda3/envs/gimme/lib/python3.6/site-packages/gimmemotifs/utils.py", line 613, in as_fasta
genome.track2fasta(seqs, tmpfa.name)
File "/Users/luca/anaconda3/envs/gimme/lib/python3.6/site-packages/genomepy/functions.py", line 466, in track2fasta
track_type = get_track_type(track)
File "/Users/luca/anaconda3/envs/gimme/lib/python3.6/site-packages/genomepy/functions.py", line 231, in get_track_type
if isinstance(track, []):
TypeError: isinstance() arg 2 must be a type or tuple of types
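A note on the last frame of this traceback: the call isinstance(track, []) passes a list instance where Python expects a type, so it raises TypeError whatever the input file looks like. That suggests a bug in the installed genomepy version rather than (or in addition to) a problem with the input; trying a different genomepy version may help, though that is an assumption and not something confirmed in this thread. The sketch below only reproduces the Python behaviour; the function names are hypothetical and are not genomepy's.

```python
def get_kind_buggy(track):
    # Mirrors the failing line in the traceback: [] is a list *instance*,
    # not a type, so isinstance() raises TypeError for any argument.
    return "list" if isinstance(track, []) else "other"


def get_kind_fixed(track):
    # Hypothetical correction: pass the type itself (or a tuple of types).
    return "list" if isinstance(track, (list, tuple)) else "other"


if __name__ == "__main__":
    try:
        get_kind_buggy("peaks.bed")
    except TypeError as err:
        # The exact message wording differs between Python versions.
        print("buggy version raised TypeError:", err)

    print("fixed version:", get_kind_fixed("peaks.bed"))
    print("fixed version:", get_kind_fixed(["chr1:100-200"]))
```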
This is excellent, thank you very much!
Do you have any suggestions to sort this out?
Hello lu, your bed file doesn't look like standard BED format. You can check the standard format at https://genome.ucsc.edu/FAQ/FAQformat.html#format1, and there is an example of how to use gimme scan at https://gimmemotifs.readthedocs.io/en/master/tutorials.html#scan-for-known-motifs
Hello yztxwd, thanks for your suggestions.
Hi, where can I find a mouse CTCF PWM file?
I found no mouse file on the JASPAR website. Are CTCF binding motifs the same in mouse as in human?