I have single-end reads from a ChIP-seq experiment for several histone marks. Using cross-correlation plots (from correlateReads in the csaw package), I estimated the fragment length at approximately 180 bp; the read length is 50 bp. I want to count the reads falling within specific genomic regions using featureCounts from Rsubread.
featureCounts has the following options: readExtension5, readExtension3, minReadOverlap.
It seems more reasonable to count the fragments (extended reads) rather than the reads themselves. It could happen that one of my genomic regions (each a bit shorter than 180 bp) is occupied by, let's say, H3K4me1, but the 50 bp reads sequenced from the fragments covering it fall outside the defined region, so the region is reported as empty. In contrast, if I extend my reads and count the fragments, I should catch those fragments and the region will not be reported as empty. Is this correct?
So, I need to extend my reads. I think I need to extend them downstream from the 3' end by approximately 130 bp (the 180 bp fragment length minus the 50 bp read length); I am not quite sure why anybody would want to extend reads upstream from the 5' end. Is this correct? Does featureCounts take the strand of the read into account when extending downstream from the 3' end?
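To make clear what I mean by strand-aware extension, here is a small Python sketch of the arithmetic. It is only my mental model, not a description of what featureCounts does internally: the function names `extend_read` and `overlaps`, the 0-based half-open coordinates, and the example region and read positions are all made up for illustration.

```python
READ_LEN = 50                 # read length from my data
FRAG_LEN = 180                # fragment length estimated from cross-correlation
EXT_3 = FRAG_LEN - READ_LEN   # extension needed at the 3' end: 130 bp


def extend_read(start, end, strand, ext3=EXT_3):
    """Extend a read downstream of its 3' end (hypothetical helper).

    Coordinates are 0-based, half-open [start, end) on the forward strand.
    For a '+' read the 3' end is `end`, so the interval grows to the right;
    for a '-' read the 3' end is `start`, so it grows to the left.
    """
    if strand == "+":
        return start, end + ext3
    if strand == "-":
        return max(0, start - ext3), end
    raise ValueError("strand must be '+' or '-'")


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


# Made-up region of 150 bp (a bit shorter than the fragment length)
region = (1000, 1150)

# Made-up reads flanking the region, one on each strand
plus_read = (900, 950, "+")
minus_read = (1200, 1250, "-")

for start, end, strand in (plus_read, minus_read):
    frag = extend_read(start, end, strand)
    print(strand,
          "read overlaps:", overlaps(start, end, *region),
          "| fragment", frag, "overlaps:", overlaps(*frag, *region))
```

In this toy case neither 50 bp read touches the region, but both extended fragments do, which is the situation I am worried about when counting unextended reads.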
Please let me know if there are any flaws in my thinking.