Can't get known motif enrichment result using findMotifs.pl (Homer)
4.9 years ago
youllae • 0

Hi. I'm using HOMER to look for known motif enrichment in a list of DEGs. Although I followed all the instructions on this page (http://homer.ucsd.edu/homer/microarray/index.html), I never get a correct result. I've tried using Entrez Gene IDs instead of TAIR IDs, and I've installed all the packages related to arabidopsis and tair10. Can anyone tell me how to solve this problem?

Selected Options:
    Input file = (file_url).txt
    Promoter Set = arabidopsis
    Output Directory = (output_url)
    Found mset for "arabidopsis", will check against plants motifs

Progress: Step1 - Convert input file to refseq IDs
Percentage of IDs converted into refseq: 97.9% (894 out of 913)

Progress: Step2 - prepare sequence files

Progress: Step3 - creating foreground/background file

Progress: Step4 - removing redundant promoters
    Kept 52998 of 56329

Progress: Step5 - adjusting background sequences for GC/CpG content...
Bin # Targets   # Background    Background Weight

Normalizing lower order oligos using homer2

Reading input files...
0 total sequences read
Autonormalization: 1-mers (4 total)
    A   inf%    inf%    -nan
    C   inf%    inf%    -nan
    G   inf%    inf%    -nan
    T   inf%    inf%    -nan
Autonormalization: 2-mers (16 total)
    AA  inf%    inf%    -nan
    CA  inf%    inf%    -nan
    GA  inf%    inf%    -nan
    TA  inf%    inf%    -nan
    AC  inf%    inf%    -nan
    CC  inf%    inf%    -nan
    GC  inf%    inf%    -nan
    TC  inf%    inf%    -nan
    AG  inf%    inf%    -nan
    CG  inf%    inf%    -nan
    GG  inf%    inf%    -nan
    TG  inf%    inf%    -nan
    AT  inf%    inf%    -nan
    CT  inf%    inf%    -nan
    GT  inf%    inf%    -nan
    TT  inf%    inf%    -nan
Autonormalization: 3-mers (64 total)
Normalization weights can be found in file: (output_url)/seq.autonorm.tsv
Converging on autonormalization solution:
...............................................................................
Final normalization:    Autonormalization: 1-mers (4 total)
    A   inf%    inf%    -nan
    C   inf%    inf%    -nan
    G   inf%    inf%    -nan
    T   inf%    inf%    -nan
Autonormalization: 2-mers (16 total)
    AA  inf%    inf%    -nan
    CA  inf%    inf%    -nan
    GA  inf%    inf%    -nan
    TA  inf%    inf%    -nan
    AC  inf%    inf%    -nan
    CC  inf%    inf%    -nan
    GC  inf%    inf%    -nan
    TC  inf%    inf%    -nan
    AG  inf%    inf%    -nan
    CG  inf%    inf%    -nan
    GG  inf%    inf%    -nan
    TG  inf%    inf%    -nan
    AT  inf%    inf%    -nan
    CT  inf%    inf%    -nan
    GT  inf%    inf%    -nan
    TT  inf%    inf%    -nan
Autonormalization: 3-mers (64 total)

Progress: Step6 - Gene Ontology Enrichment Analysis

Progress: Step7 - Known motif enrichment

Reading input files...
0 total sequences read
506 motifs loaded
Cache length = 11180
Using hypergeometric scoring
Checking enrichment of 506 motif(s)
|0%                                    50%                                  100%|
=================================================================================

Illegal division by zero at (software_url)/bin/findKnownMotifs.pl line 152.

Progress: Step8 - De novo motif finding (HOMER)

Scanning input files...

!!! Something is wrong... are you sure you chose the right length for motif finding? !!! i.e. also check your sequence file!!!

Scanning input files...

!!! Something is wrong... are you sure you chose the right length for motif finding? !!! i.e. also check your sequence file!!!

-blen automatically set to 2
Scanning input files...

!!! Something is wrong... are you sure you chose the right length for motif finding? !!! i.e. also check your sequence file!!!
Use of uninitialized value in numeric gt (>) at (software_url)/bin/compareMotifs.pl line 1389.
!!! Filtered out all motifs!!!
Job finished


youllae, please provide a follow-up if possible.

4.9 years ago
zubenel ▴ 120

It seems that something goes wrong during gene ID conversion. You could try this: use an input file with gene names in TAIR format and add the option -noconvert, so that the IDs are not converted.
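Before rerunning, it may help to check that every line of the input file really is a TAIR locus ID. Below is a minimal Python sketch of that check. The regular expression is my assumption about the locus format (AT, then a chromosome of 1-5, C or M, then G and five digits), stripping a gene-model suffix such as ".1" is also an assumption, and the file and directory names in the last lines are hypothetical. The helper only builds the argument list for findMotifs.pl; it does not run HOMER.

```python
import re

# Assumed pattern for TAIR locus IDs (nuclear chromosomes 1-5, plus C and M
# for the chloroplast and mitochondrial genomes), e.g. AT3G44150 or ATCG00170.
TAIR_LOCUS = re.compile(r"^AT[1-5CM]G\d{5}$")


def clean_tair_ids(raw_ids):
    """Split raw IDs into (valid TAIR loci, rejected entries)."""
    valid, rejected = [], []
    for raw in raw_ids:
        # Upper-case and drop a gene-model suffix such as ".1" (assumption).
        candidate = raw.strip().upper().split(".")[0]
        if TAIR_LOCUS.match(candidate):
            if candidate not in valid:  # keep first occurrence only
                valid.append(candidate)
        elif raw.strip():
            rejected.append(raw.strip())
    return valid, rejected


def build_command(gene_list_path, output_dir):
    """Argument list for findMotifs.pl with ID conversion switched off."""
    return ["findMotifs.pl", gene_list_path, "arabidopsis", output_dir, "-noconvert"]


if __name__ == "__main__":
    # "834946" stands in for an Entrez-style numeric ID that should be rejected.
    ids, bad = clean_tair_ids(["AT3G44150", "at4g36350.1", "ATCG00170", "834946"])
    print("valid:", ids)
    print("rejected:", bad)
    # Hypothetical file and directory names.
    print(" ".join(build_command("degs_tair.txt", "homer_out")))
```

Anything that ends up in the rejected list (for example numeric Entrez IDs) would need to be mapped to TAIR IDs by some other means before using -noconvert.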

I ran the analysis with a small list of randomly chosen genes:

AT3G44150
AT4G36350
AT5G48910
ATCG00170
AT1G46336
AT2G05410
AT2G44255
AT3G42713
AT4G08460
AT1G12667

Judging from the output of the Perl program, the analysis seems to have worked:

    Autonormalization: 2-mers (16 total)
        AA  13.45%  12.71%  1.058
        CA  4.91%   5.76%   0.851
        GA  6.09%   5.95%   1.023
        TA  10.08%  9.08%   1.110
        AC  5.07%   5.19%   0.977
        CC  2.43%   2.95%   0.823
        GC  2.46%   2.41%   1.020
        TC  6.09%   5.95%   1.023
        AG  6.50%   5.52%   1.176
        CG  1.57%   2.25%   0.700
        GG  2.43%   2.95%   0.823
        TG  4.91%   5.76%   0.851
        AT  9.00%   10.08%  0.893
        CT  6.50%   5.52%   1.176
        GT  5.07%   5.19%   0.977
        TT  13.45%  12.71%  1.058
    Autonormalization: 3-mers (64 total)

    Progress: Step6 - Gene Ontology Enrichment Analysis

    Progress: Step7 - Known motif enrichment

    Reading input files...
    24167 total sequences read
    506 motifs loaded
    Cache length = 11180
    Using hypergeometric scoring
    Checking enrichment of 506 motif(s)

Also:

Progress: Step8 - De novo motif finding (HOMER)

    Scanning input files...
    Parsing sequences...
    |0%                                   50%                                  100%|
    ================================================================================
    Total number of Oligos: 32874
    Autoadjustment for sequence coverage in background: 1.09x

    Oligos: 32874 of 34497 max
    Tree  : 66660 of 172485 max
    Optimizing memory usage...
    Cache length = 11180
    Using hypergeometric scoring

    Global Optimization Phase: Looking for enriched oligos with up to 1 mismatches...

    Screening oligos 32874 (allowing 0 mismatches):
    |0%                                   50%                                  100%|
    ================================================================================
        94.03% skipped, 5.97% checked (1962 of 32874), of those checked:
        94.03% not in target, 0.00% increased p-value, 0.00% high p-value

    Screening oligos 32874 (allowing 1 mismatches):
    |0%                                   50%                                  100%|
    ================================================================================
        94.03% skipped, 5.97% checked (1962 of 32874), of those checked:
        0.00% not in target, 5.67% increased p-value, 1.08% high p-value
    Reading input files...
    24167 total sequences read
    Cache length = 11180
    Using hypergeometric scoring
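
The clearest difference between this run and the failing one is the "total sequences read" line: 24167 here, 0 in the question. If the program output has been saved as text, that line can be checked automatically. This is only a rough sketch; the function names are my own, and it assumes the log wording matches what is pasted in this thread.

```python
import re

# Matches lines such as "24167 total sequences read", with optional indentation.
SEQ_COUNT = re.compile(r"^\s*(\d+) total sequences read", re.MULTILINE)


def sequences_read(log_text):
    """Return every 'N total sequences read' count found in the log text."""
    return [int(n) for n in SEQ_COUNT.findall(log_text)]


def run_looks_healthy(log_text):
    """True only if at least one count was reported and none of them is zero."""
    counts = sequences_read(log_text)
    return bool(counts) and all(count > 0 for count in counts)


if __name__ == "__main__":
    failing = "Reading input files...\n0 total sequences read\n506 motifs loaded\n"
    working = "    Reading input files...\n    24167 total sequences read\n"
    print(run_looks_healthy(failing))
    print(run_looks_healthy(working))
```

A zero count means no promoter sequences were loaded at all, so the later steps (known motif enrichment and de novo motif finding) have nothing to work on.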

The results for motifs of length 10 bp were saved in the file homerMotifs.motifs10; the first lines of the file look like this:

>AAGCTTWGAA 1-AAGCTTWGAA    10.724573   -10.744085  0   T:3.0(37.50%),B:371.1(0.73%),P:1e-4 Tpos:271.0,Tstd:74.6,Bpos:164.2,Bstd:141.4,StrandBias:-10.0,Multiplicity:1.00
0.997   0.001   0.001   0.001
0.719   0.001   0.001   0.279
0.001   0.001   0.997   0.001
0.001   0.763   0.001   0.235
0.001   0.001   0.001   0.997
0.001   0.001   0.001   0.997
0.517   0.001   0.001   0.481
0.199   0.231   0.558   0.012
0.722   0.276   0.001   0.001
0.997   0.001   0.001   0.001
>ACTGTTTTAA 2-ACTGTTTTAA    7.180883    -9.252319   0   T:3.0(37.50%),B:615.9(1.21%),P:1e-4 Tpos:242.5,Tstd:55.4,Bpos:167.1,Bstd:148.2,StrandBias:0.0,Multiplicity:1.33
0.997   0.001   0.001   0.001
0.001   0.997   0.001   0.001
0.001   0.001   0.001   0.997
0.001   0.001   0.997   0.001
0.189   0.001   0.001   0.809
0.001   0.001   0.001   0.997
0.001   0.189   0.001   0.809
0.001   0.189   0.001   0.809
0.997   0.001   0.001   0.001
0.809   0.001   0.189   0.001
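
To inspect such a file programmatically, something like the following Python sketch could be used. It assumes, based on my reading of the HOMER motif format, that each header line starts with ">" followed by the consensus, the motif name, the log odds detection threshold and the log p-value, and that each following row gives the probabilities of A, C, G and T at one position. It splits on whitespace, so it would not cope with motif names that contain spaces.

```python
def parse_homer_motifs(text):
    """Parse HOMER-style motif text into a list of dictionaries."""
    motifs = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header: consensus, name, threshold, log p-value, then extra fields.
            fields = line[1:].split()
            current = {
                "consensus": fields[0],
                "name": fields[1],
                "log_odds_threshold": float(fields[2]),
                "log_p_value": float(fields[3]),
                "matrix": [],  # one [A, C, G, T] row per motif position
            }
            motifs.append(current)
        elif current is not None:
            current["matrix"].append([float(value) for value in line.split()])
    return motifs


if __name__ == "__main__":
    # Truncated example taken from the first motif above (first two rows only).
    sample = (
        ">AAGCTTWGAA 1-AAGCTTWGAA    10.724573   -10.744085  0   T:3.0(37.50%)\n"
        "0.997   0.001   0.001   0.001\n"
        "0.719   0.001   0.001   0.279\n"
    )
    for motif in parse_homer_motifs(sample):
        print(motif["name"], len(motif["matrix"]), "positions")
```

Note that the target counts in the header (T:3.0) are very small because my test list had only 10 genes, so these particular motifs should not be taken as meaningful.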

As I am not used to working with HOMER, please check whether this makes sense. You can also write to cbenner@ucsd.edu, as suggested on the HOMER website.
