I want to overlap structural variants with ENCODE narrowPeak data, and I have a few questions.
- Anisogenic replicate
Under one experiment I see two anisogenic replicates. Does this mean that the two replicates come from different individuals (and therefore have different genomes)?
- Combining replicates
Is this something I should consider? If there are X replicates for a specific tissue I'm interested in, should I normalize the signal?
I'm also unsure how to combine the replicates. This is what I'm thinking of doing:
- Calculate a Z-score for every score in each individual narrowPeak file (column 7: signalValue).
- Overlap the replicates and keep the positions they have in common.
- For each common position, call the region DNase hypersensitive if and only if at least 75% of the replicates have a Z-score > 3 (or 2?).
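To make the three steps above concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: each replicate is already parsed into a list of `(chrom, start, end, signalValue)` tuples, "common position" means a stretch covered by a peak in every replicate, and the function names `zscore_peaks` and `consensus_regions` are made up for this example. The scan is quadratic per chromosome, so for genome-scale files a dedicated interval tool would be a better fit.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_peaks(peaks):
    """Replace the signal of each (chrom, start, end, signal) peak by its
    Z-score, computed across all peaks of this one replicate."""
    signals = [p[3] for p in peaks]
    mu, sd = mean(signals), pstdev(signals)
    return [(c, s, e, (v - mu) / sd if sd > 0 else 0.0) for c, s, e, v in peaks]


def consensus_regions(replicates, z_cutoff=3.0, min_fraction=0.75):
    """Return merged (chrom, start, end) regions that are covered by a peak
    in every replicate and where at least `min_fraction` of the replicates
    have a covering peak with Z-score > `z_cutoff`."""
    n = len(replicates)
    by_chrom = defaultdict(list)  # chrom -> [(start, end, replicate_index, z)]
    for i, peaks in enumerate(replicates):
        for c, s, e, z in zscore_peaks(peaks):
            by_chrom[c].append((s, e, i, z))

    regions = []
    for chrom in sorted(by_chrom):
        intervals = by_chrom[chrom]
        # Every peak boundary is a cut point; coverage is constant between cuts.
        cuts = sorted({x for s, e, _, _ in intervals for x in (s, e)})
        for left, right in zip(cuts, cuts[1:]):
            covering = [(i, z) for s, e, i, z in intervals if s <= left and right <= e]
            present = {i for i, _ in covering}
            strong = {i for i, z in covering if z > z_cutoff}
            if len(present) == n and len(strong) / n >= min_fraction:
                # Merge with the previous region when the segments are adjacent.
                if regions and regions[-1][0] == chrom and regions[-1][2] == left:
                    regions[-1] = (chrom, regions[-1][1], right)
                else:
                    regions.append((chrom, left, right))
    return regions
```

Note that with only two or three replicates, a 75% threshold is the same as requiring all of them, so the fraction only starts to matter from four replicates upward.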
Then, when I overlap these common peaks with my structural variants, I can count the number of peaks and the total bases spanned for each variant.
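The per-variant counting could look roughly like the sketch below. It assumes 0-based, half-open BED-style coordinates (as in narrowPeak), peaks that do not overlap one another, and that "total bases spanned" means the number of variant bases covered by peaks. The function name `overlap_stats` and the tuple layouts are hypothetical, chosen only for this example.

```python
def overlap_stats(variants, peaks):
    """For each variant, count overlapping peaks and the bases they cover.

    variants: list of (name, chrom, start, end)
    peaks:    list of (chrom, start, end), assumed not to overlap each other
    Returns {name: (n_peaks, covered_bases)}.
    """
    stats = {}
    for name, chrom, v_start, v_end in variants:
        n_peaks = 0
        covered = 0
        for p_chrom, p_start, p_end in peaks:
            if p_chrom != chrom:
                continue
            # Length of the intersection of two half-open intervals.
            overlap = min(v_end, p_end) - max(v_start, p_start)
            if overlap > 0:
                n_peaks += 1
                covered += overlap
        stats[name] = (n_peaks, covered)
    return stats
```

Peaks that merely touch a variant boundary (zero bases shared) are not counted here, which follows from the half-open convention.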
I am new to the world of ENCODE and noncoding regulation. Please tear me to shreds (or direct me to papers) if you find my proposal weak.