Question

How to correctly apply the -len/fragLength parameter in Homer's annotatePeaks.pl?

2.4 years ago

Aspire ▴ 370

Homer's annotatePeaks.pl centers tags for counting, based on their estimated ChIP-fragment lengths. To avoid that, the -len/fragLength parameter can be set to 1 (as is useful in the case of 5' RNA).

This is documented in the help text that annotatePeaks.pl prints when it is run without any parameters.

However, I have found that changing the -len parameter from auto to 1 changes the results by an order of magnitude or more (in my case).

With -len auto:

annotatePeaks.pl Tiles.pos GRCm38 -norm 1e8 -size 2000 -len auto -hist 50 -ghist -d GSE124804_tagDir/ | head -2 | tail -1 

1604    1.43958868894602        1.43958868894602        1.72750642673522        1.87146529562982        1.72750642673522        1.87146529562982  2.59125964010283        2.87917737789203        2.87917737789203        2.59125964010283        2.73521850899743 2.73521850899743 2.73521850899743        3.02313624678663        2.73521850899743        2.87917737789203        3.74293059125964 4.17480719794345 4.46272493573265        4.31876606683805        4.03084832904884        5.18251928020566        5.03856041131105 5.47043701799486 4.75064267352185        3.88688946015424        3.31105398457584        3.02313624678663        3.31105398457584 3.59897172236504 2.44730077120823        1.72750642673522        2.01542416452442        1.87146529562982        2.44730077120823 2.30334190231363 2.15938303341902        2.15938303341902        2.15938303341902        2.15938303341902        2.15938303341902

With -len 1:

annotatePeaks.pl Tiles.pos GRCm38 -norm 1e8 -size 2000 -len 1 -hist 50 -ghist -d GSE124804_tagDir/ | head -2 | tail -1

1604    56      0       0       0       112     56      280     168     112     56      224     56      56      168     224     112       168     -5.6843418860808e-14    224     112     -7.105427357601e-14     560     224     448     504     280     55.9999999999997  -2.91322521661641e-13   -2.91322521661641e-13   55.9999999999997        55.9999999999997        55.9999999999997        112       -2.98427949019242e-13   224     168     112     280     55.9999999999996        55.9999999999996        112

What can explain it?

The tagDir is based on a BAM file of paired-end reads; the 9th field of the BAM records (fragment length) ranges from about -450 to +450.

homer • 640 views

updated 2.4 years ago by Matthias Zepper 5.1k • written 2.4 years ago by Aspire ▴ 370

I would presume that is due to your normalization?

The mapped read is usually extended to the presumed ChIP-fragment length of several hundred base pairs. By limiting it to 1 bp, you lose >99.5% of your total counts. If you still normalize to 1e8 total counts, the remaining counts are multiplied by a correspondingly larger factor to make up for this loss.
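A quick way to check the scaling idea against the numbers posted in the question is to express both rows in a common unit. The sketch below is only a sanity check on the posted output, not a description of HOMER's internals: the constants 56 and 389 are read off the values themselves (56 is the smallest non-zero step in the -len 1 row, and 1.43958868894602 equals 560/389), and treating 389 as the estimated fragment length is an assumption. Only a subset of the values from each row is included.

```python
# Values copied from the two output rows in the question
# (the leading tile ID 1604 is omitted; only a subset of each row is used).
len1_row = [56, 0, 0, 0, 112, 56, 280, 168, 224, 560, 448, 504,
            55.9999999999997, -5.6843418860808e-14]
auto_row = [1.43958868894602, 1.72750642673522, 1.87146529562982,
            2.59125964010283, 2.87917737789203, 5.47043701799486,
            2.15938303341902]

UNIT = 56.0      # smallest non-zero step in the -len 1 row (read off the data)
DIVISOR = 389.0  # assumption inferred from the data: 1.43958868894602 == 560 / 389


def in_units(values, unit):
    """Express each value as a multiple of `unit`, rounded to 6 decimals."""
    # Adding 0.0 turns a possible -0.0 into 0.0 for cleaner printing.
    return [round(v / unit, 6) + 0.0 for v in values]


len1_units = in_units(len1_row, UNIT)
auto_units = in_units(auto_row, UNIT / DIVISOR)

print("-len 1   :", len1_units)
print("-len auto:", auto_units)
```

If both lists come out as whole numbers, every value in the -len 1 row is a multiple of one normalized unit, while every value in the -len auto row is a multiple of that same unit divided by a few hundred. That would be consistent with the difference between the two runs coming from how each tag is scaled rather than from different underlying tags.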

2.4 years ago by Matthias Zepper 5.1k