Convert bedGraph to GRanges object
2.3 years ago
storm1907 ▴ 30

Hello,

I have bedGraph files in the following format:

track type=bedGraph description=center_label visibility=full graphType=points color=200,100,0 altColor=0,100,200            
chr1  923391  923526  -0.4838883 
chr1  924813  925002  -0.45778788 
chr1  930090  930401  -0.09189046 
chr1  935707  935961  1.25254348 
chr1  939207  939525  3.51303879

Is there a tool I can use to convert a bedGraph file to a GRanges object? The desired format is something like this:

> grl.data
GRangesList object of length 10:
$s1
GRanges object with 1318 ranges and 1 metadata column:
         seqnames      ranges strand |     value
            <Rle>   <IRanges>  <Rle> | <numeric>
     [1]        1       1-576      * |   2.03030
     [2]        1    577-1112      * |   1.94532
     [3]        1   1113-1511      * |   1.92982
     [4]        1   1512-2113      * |   1.86865
     [5]        1   2114-2573      * |   1.93882
     ...      ...         ...    ... .       ...
  [1314]       22 19081-19153      * |   2.00131
  [1315]       22 19154-20059      * |   1.95003
  [1316]       22 20060-21926      * |   2.00952
  [1317]       22 21927-23554      * |   1.95225
  [1318]       22 23555-24465      * |   2.01210

So far I have not been able to find a solution, so I would be grateful for any help.

Thank you!


Use the import() function from rtracklayer.
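
As a minimal sketch of that suggestion, the snippet below writes a few of the rows from the question to a temporary file and reads them back with rtracklayer. The temporary file and the shortened track line are assumptions made only so the example is self-contained. Note that import() stores the fourth bedGraph column as a metadata column named score and converts the 0-based bedGraph starts to 1-based coordinates.

```r
library(rtracklayer)

# Write a few rows from the question to a temporary bedGraph file
# (illustrative input only; point bg_path at a real file in practice).
bg_path <- tempfile(fileext = ".bedgraph")
writeLines(c(
  "track type=bedGraph description=center_label",
  "chr1\t923391\t923526\t-0.4838883",
  "chr1\t924813\t925002\t-0.45778788",
  "chr1\t930090\t930401\t-0.09189046"
), bg_path)

# import() returns a GRanges-like object; the value column is called 'score'.
gr <- import(bg_path, format = "bedGraph")
gr
```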


I work on a remote Linux server and have difficulties installing and running rtracklayer. I will try it on my local machine, thanks.


If you don't want to install rtracklayer you could also build the GRanges semi-manually.

library("GenomicRanges")

bg <- read.table("file.bedgraph", sep="\t", skip=1)
colnames(bg) <- c("seqnames", "start", "end", "value")

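# bedGraph starts are 0-based, so shift them to the 1-based coordinates GRanges uses.
gr <- makeGRangesFromDataFrame(bg, ignore.strand=TRUE, keep.extra.columns=TRUE,
                               starts.in.df.are.0based=TRUE)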

Thank you, this works for one sample! What would be an appropriate approach if I need to make one GRangesList object from multiple bedGraph files?

library("GenomicRanges")

# A vector of bedGraph files.
# Can be constructed with list.files for convenience.
bg_files <- c("file1.bedgraph", "file2.bedgraph")

# Load the bedGraph files into a list of GRanges.
bgs <- lapply(bg_files, function(x) {
  bg <- read.table(x, sep="\t", skip=1)
  colnames(bg) <- c("seqnames", "start", "end", "value")
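  # bedGraph starts are 0-based, so shift them to 1-based GRanges coordinates.
  gr <- makeGRangesFromDataFrame(bg, ignore.strand=TRUE, keep.extra.columns=TRUE,
                                 starts.in.df.are.0based=TRUE)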
  return(gr)
})

# Convert the list of GRanges into a GRangesList.
gr_list <- GRangesList(bgs)
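
The comment about list.files in the code above can be sketched as follows. The temporary folder, the two small files, and the sample names s1 and s2 are hypothetical and exist only to make the example runnable. Naming the list before calling GRangesList() gives named elements ($s1, $s2), similar to the printout of grl.data in the question.

```r
library(GenomicRanges)

# Hypothetical input: two small bedGraph files in a temporary folder.
bg_dir <- tempfile("bedgraphs")
dir.create(bg_dir)
rows <- c("track type=bedGraph",
          "chr1\t923391\t923526\t-0.4838883",
          "chr1\t924813\t925002\t-0.45778788")
writeLines(rows, file.path(bg_dir, "s1.bedgraph"))
writeLines(rows, file.path(bg_dir, "s2.bedgraph"))

# Collect every bedGraph file in the folder (sorted alphabetically).
bg_files <- list.files(bg_dir, pattern = "\\.bedgraph$", full.names = TRUE)

# Read each file into a GRanges, skipping the track line.
bgs <- lapply(bg_files, function(x) {
  bg <- read.table(x, sep = "\t", skip = 1,
                   col.names = c("seqnames", "start", "end", "value"))
  makeGRangesFromDataFrame(bg, ignore.strand = TRUE, keep.extra.columns = TRUE,
                           starts.in.df.are.0based = TRUE)
})

# Use the file names (without extension) as sample names.
names(bgs) <- tools::file_path_sans_ext(basename(bg_files))
gr_list <- GRangesList(bgs)
names(gr_list)
```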

Thank you!

At first this works. However, when I try to run the R package CINdex, I still get an error:

> gr_list
GRangesList object of length 4:
[[1]]
GRanges object with 200255 ranges and 1 metadata column:
           seqnames            ranges strand |      value
              <Rle>         <IRanges>  <Rle> |  <numeric>
       [1]     chr1     923391-923526      * |   0.544787
       [2]     chr1     923551-924817      * |   0.565173
       [3]     chr1     924813-925002      * |   1.072250
       [4]     chr1     925877-926078      * |  -0.511750
       [5]     chr1     930090-930401      * |   0.270485
       ...      ...               ...    ... .        ...
  [200251]    chr22 50768711-50768939      * |  1.2515725
  [200252]    chr22 50769840-50770081      * |  0.2780226
  [200253]    chr22 50775707-50775916      * |  0.0372048
  [200254]    chr22 50776605-50776814      * | -0.2125530
  [200255]    chr22 50777887-50778046      * |  0.3990656
  -------
  seqinfo: 22 sequences from an unspecified genome; no seqlengths

...
<3 more elements>
> run.cin.chr(gr_list)
Error in segments.to.profile(segments) :
  Wrong representation of segments.
In addition: Warning message:
In dir.create(out.folder.name) : 'output_chr_cin' already exists

Hello again,

This is the result of my commands (my_gr). Is it possible to get a result like the sample below (grl.data)?

[screenshot not preserved]

At the start of the grl.data output there are two additional lines:

GRangesList object of length 10:
$s1

When I run the downstream analysis, everything works with grl.data, but for my_gr I get this error:

> run.cin.chr(grl.seg = ivf_gr)
Error in run.cin.chr(grl.seg = ivf_gr) :
  Input 'grl.seg' must be a GRangesList object

grl.data is a sample dataset provided by the CINdex package.
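
The error message suggests that the object passed as grl.seg is a plain GRanges rather than a GRangesList. A minimal sketch of how to check this and wrap a single GRanges in a one-element list is shown here. The ivf_gr object is a hypothetical stand-in built by hand, and the element name s1 is arbitrary. This only addresses the class check and does not guarantee that the downstream function accepts the data.

```r
library(GenomicRanges)

# Hypothetical single-sample GRanges standing in for ivf_gr.
ivf_gr <- GRanges(seqnames = "chr1",
                  ranges = IRanges(start = c(923392, 924814),
                                   end = c(923526, 925002)),
                  value = c(-0.4838883, -0.45778788))

is(ivf_gr, "GRangesList")   # FALSE: a plain GRanges is not a GRangesList

# Wrap it in a one-element, named GRangesList.
ivf_grl <- GRangesList(s1 = ivf_gr)
is(ivf_grl, "GRangesList")  # TRUE
```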


OK, I found this function for making GRangesList objects: https://rdrr.io/bioc/GenomicRanges/man/makeGRangesListFromDataFrame.html

However, the question remains: how do I make one GRangesList object from multiple bedGraph files?
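
One possible way to use makeGRangesListFromDataFrame for this, sketched under the assumption that the rows of all bedGraph files have first been combined into a single data frame with an extra column identifying the sample. The data frame below and its sample column are hypothetical stand-ins for that combined table. Extra arguments such as keep.extra.columns are passed on to makeGRangesFromDataFrame.

```r
library(GenomicRanges)

# Hypothetical combined table: rows from two bedGraph files plus a
# 'sample' column recording which file each row came from.
df <- data.frame(
  seqnames = c("chr1", "chr1", "chr1"),
  start    = c(923391, 924813, 923391),
  end      = c(923526, 925002, 923526),
  value    = c(-0.4838883, -0.45778788, 0.544787),
  sample   = c("s1", "s1", "s2")
)

# Split the rows by sample into one GRanges per list element.
grl <- makeGRangesListFromDataFrame(df,
                                    split.field = "sample",
                                    keep.extra.columns = TRUE,
                                    starts.in.df.are.0based = TRUE)
grl
```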

I also tried to make a GRangesList object from a single bedGraph file:

[screenshot not preserved]

Trying to run the analysis with this new object, I got the following error:

> run.cin.chr(grl.seg = ivf_gr1)
Error in segments.to.profile(segments) :
  Wrong representation of segments.
In addition: Warning message:
In dir.create(out.folder.name) : 'output_chr_cin' already exists
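
Comparing the printouts in this thread, grl.data has named elements ($s1) and seqnames without a "chr" prefix (1, 22), whereas the list built from the bedGraph files is unnamed and uses chr1 to chr22. Whether either difference causes the "Wrong representation of segments" error is not confirmed here. The sketch below only shows how to make a list resemble grl.data in those two respects, using small hand-built GRanges as hypothetical input.

```r
library(GenomicRanges)

# Hypothetical per-sample GRanges, as produced by the bedGraph loading code.
grs <- list(
  GRanges("chr1", IRanges(c(923392, 924814), c(923526, 925002)),
          value = c(0.544787, 0.565173)),
  GRanges("chr22", IRanges(50768712, 50768939), value = 1.2515725)
)

# Rename the sequence levels so that "chr1" becomes "1", and so on.
grs <- lapply(grs, function(gr) {
  seqlevels(gr) <- sub("^chr", "", seqlevels(gr))
  gr
})

# Name the elements s1, s2, ... (arbitrary sample names).
names(grs) <- paste0("s", seq_along(grs))
gr_list <- GRangesList(grs)
gr_list
```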

Stop adding answers unless you're answering the top-level post. Use Add Comment/Add Reply instead. I've cleaned up the post now, but please be more careful in the future.


Please do not paste screenshots of plain text content; it is counterproductive. You can copy and paste the content directly here (using the code formatting option), or use a GitHub Gist if the content exceeds the length allowed here.
