From a ChIPseq experiment we're having I'm reading the bigwig files as GRanges
objects into R. One file for the IP sample, one for the Input. The peak values are saved as the score of the Granges object.
As we have two different control samples (input and IgG), I would like to calculate the scores based on the two controls. I ran macs2
for both comparisons (IP vs. input and Input vs. IgG) and would like to identify the peaks which are in the Input-IP comparison, but not in the Input-IgG comparison.
These are my files - For IP sample
> InputIP.treat
GRanges object with 4083444 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
[1] I [ 1, 14] * | 0
[2] I [15, 29] * | 0.303319990634918
... ... ... ... . ...
[4083443] XVI [948064, 948065] * | 0.606639981269836
[4083444] XVI [948066, 948066] * | 0.303319990634918
and for the input file
> InputIP.control
GRanges object with 5658227 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
[1] I [ 1, 6303] * | 17.5913696289062
[2] I [6304, 6304] * | 17.6069602966309
... ... ... ... . ...
[5658226] XVI [947997, 947997] * | 17.7014198303223
[5658227] XVI [947998, 948066] * | 17.5913696289062
I also have similar files for the comparison IgG vs. Input. What I would like to do is the following. First calculate the ratio position-wise between IP and Input as well as between IgG and input and then calculate the ratio between the two ratios. To bring them to a position-wise ratio I convert the Granges Objects into GPos objects. the workflow is described below. For the first pair of IP-Input
InputIP.treat <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIP.0h.1_treat_pileup.bdg")
gpos.InputIP.treat <- GPos(InputIP.treat)
gpos.InputIP.treat$score <- rep(InputIP.treat$score, width(InputIP.treat))
InputIP.control <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIP.0h.1_control_lambda.bdg")
gpos.InputIP.control <- GPos(InputIP.control)
gpos.InputIP.control$score <- rep(InputIP.control$score, width(InputIP.control))
#calculate the ratio of the Input-IP files
InputIP.ratio <- gpos.InputIP.control
InputIP.ratio$score <- gpos.InputIP.treat$score / gpos.InputIP.control$score
For the second pair of IgG-Input
InputIgG.treat <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIgG.0h.2_treat_pileup.bdg")
gpos.InputIgG.treat <- GPos(InputIgG.treat)
gpos.InputIgG.treat$score <- rep(InputIgG.treat$score, width(InputIgG.treat))
InputIgG.control <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIgG.0h.2_control_lambda.bdg")
gpos.InputIgG.control <- GPos(InputIgG.control)
gpos.InputIgG.control$score <- rep(InputIgG.control$score, width(InputIgG.control))
#calculate the ratio of the InputIgG files
InputIgG.ratio <- gpos.InputIgG.treat
InputIgG.ratio$score <- gpos.InputIgG.treat$score / gpos.InputIgG.control$score
But the problem I encountered is that the two pairs of files have different number of positions, so that I can't really compare them.
> InputIgG.ratio
GPos object with 12156972 positions and 1 metadata column:
seqnames pos strand | score
<Rle> <integer> <Rle> | <numeric>
[1] IX 1 * | 0
[2] IX 2 * | 0
[3] IX 3 * | 0
[4] IX 4 * | 0.0302902583767201
[5] IX 5 * | 0.0302902583767201
...
> InputIgG.ratio
GPos object with 12156941 positions and 1 metadata column:
seqnames pos strand | score
<Rle> <integer> <Rle> | <numeric>
[1] IX 1 * | 0
[2] IX 2 * | 0
[3] IX 3 * | 0
[4] IX 4 * | 0
[5] IX 5 * | 0
...
So this is not the way to do it. I would appreciate any help in solving this problem. The goal is to take the two pairs of input files (either bigwig or bedgraph) and calculate the ratio between the two conditions, then calculate the ratio between the two ratios. But the problem is how to make the two files comparable.
thanks in advance
Assa
the four files for the examples can be found here:
I would look up, where these additional 31 positions in InputIgG.ratio come from. Maybe add them to InputIgG.ratio with score=0?