Question

how to calculate with metadata of GRanges Objects

1

Entering edit mode

7.7 years ago

Assa Yeroslaviz ★ 1.9k

From a ChIPseq experiment we're having I'm reading the bigwig files as GRanges objects into R. One file for the IP sample, one for the Input. The peak values are saved as the score of the Granges object.

As we have two different control samples (input and IgG), I would like to calculate the scores based on the two controls. I ran macs2 for both comparisons (IP vs. input and Input vs. IgG) and would like to identify the peaks which are in the Input-IP comparison, but not in the Input-IgG comparison.

These are my files - For IP sample

   > InputIP.treat
GRanges object with 4083444 ranges and 1 metadata column:
            seqnames           ranges strand |             score
               <Rle>        <IRanges>  <Rle> |         <numeric>
        [1]        I         [ 1, 14]      * |                 0
        [2]        I         [15, 29]      * | 0.303319990634918
        ...      ...              ...    ... .               ...
  [4083443]      XVI [948064, 948065]      * | 0.606639981269836
  [4083444]      XVI [948066, 948066]      * | 0.303319990634918

and for the input file

> InputIP.control
GRanges object with 5658227 ranges and 1 metadata column:
            seqnames           ranges strand |            score
               <Rle>        <IRanges>  <Rle> |        <numeric>
        [1]        I     [   1, 6303]      * | 17.5913696289062
        [2]        I     [6304, 6304]      * | 17.6069602966309
         ...      ...              ...    ... .              ...
  [5658226]      XVI [947997, 947997]      * | 17.7014198303223
  [5658227]      XVI [947998, 948066]      * | 17.5913696289062

I also have similar files for the comparison IgG vs. Input. What I would like to do is the following. First calculate the ratio position-wise between IP and Input as well as between IgG and input and then calculate the ratio between the two ratios. To bring them to a position-wise ratio I convert the Granges Objects into GPos objects. the workflow is described below. For the first pair of IP-Input

InputIP.treat <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIP.0h.1_treat_pileup.bdg")
gpos.InputIP.treat <- GPos(InputIP.treat)
gpos.InputIP.treat$score <- rep(InputIP.treat$score, width(InputIP.treat))
InputIP.control <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIP.0h.1_control_lambda.bdg")  
gpos.InputIP.control <- GPos(InputIP.control)
gpos.InputIP.control$score <- rep(InputIP.control$score, width(InputIP.control))

#calculate the ratio of the Input-IP files
InputIP.ratio <- gpos.InputIP.control
InputIP.ratio$score <- gpos.InputIP.treat$score / gpos.InputIP.control$score

For the second pair of IgG-Input

InputIgG.treat <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIgG.0h.2_treat_pileup.bdg")
gpos.InputIgG.treat <- GPos(InputIgG.treat)
gpos.InputIgG.treat$score <- rep(InputIgG.treat$score, width(InputIgG.treat))

InputIgG.control <- import.bedGraph(con = "../macs2Results.normalized/sizKO.InputIgG.0h.2_control_lambda.bdg")
gpos.InputIgG.control <- GPos(InputIgG.control)
gpos.InputIgG.control$score <- rep(InputIgG.control$score, width(InputIgG.control))

#calculate the ratio of the InputIgG files
InputIgG.ratio <- gpos.InputIgG.treat
InputIgG.ratio$score <- gpos.InputIgG.treat$score / gpos.InputIgG.control$score

But the problem I encountered is that the two pairs of files have different number of positions, so that I can't really compare them.

> InputIgG.ratio
GPos object with 12156972 positions and 1 metadata column:
             seqnames       pos strand |              score
                <Rle> <integer>  <Rle> |          <numeric>
         [1]       IX         1      * |                  0
         [2]       IX         2      * |                  0
         [3]       IX         3      * |                  0
         [4]       IX         4      * | 0.0302902583767201
         [5]       IX         5      * | 0.0302902583767201
         ... 
> InputIgG.ratio
GPos object with 12156941 positions and 1 metadata column:
             seqnames       pos strand |              score
                <Rle> <integer>  <Rle> |          <numeric>
         [1]       IX         1      * |                  0
         [2]       IX         2      * |                  0
         [3]       IX         3      * |                  0
         [4]       IX         4      * |                  0
         [5]       IX         5      * |                  0
         ...

So this is not the way to do it. I would appreciate any help in solving this problem. The goal is to take the two pairs of input files (either bigwig or bedgraph) and calculate the ratio between the two conditions, then calculate the ratio between the two ratios. But the problem is how to make the two files comparable.

thanks in advance

Assa

the four files for the examples can be found here:

genomicranges rtracklayer bigwig granges macs2 • 2.7k views

ADD COMMENT • link 7.7 years ago by Assa Yeroslaviz ★ 1.9k

0

Entering edit mode

I would look up, where these additional 31 positions in InputIgG.ratio come from. Maybe add them to InputIgG.ratio with score=0?