How to extract bigWig signal for a given bed file?
3
1
Entering edit mode
8.1 years ago
Bioradical ▴ 60

I'm looking for a program to extract signal values from a bigWig file for a given set of coordinates.

Python, R or command line is fine (my programming is rudimentary at best). Thank you!

bigWig • 11k views
ADD COMMENT
6
Entering edit mode
8.1 years ago

In python:

import pyBigWig
bw = pyBigWig.open("some_file.bw")
for line in open("foo.bed"):
    cols = line.strip().split()
    vals = bw.values(cols[0], int(cols[1]), int(cols[2]))
    # Do something with the values...
bw.close()
ADD COMMENT
1
Entering edit mode

cool library... does this also takes care about negative and positive strands from bed file?

ADD REPLY
0
Entering edit mode

There's nothing special about the minus strand, it has the same values in a bigWig file.

ADD REPLY
1
Entering edit mode

I think it will be helpful if one is plotting the nucleosome signal around TSS or TTS. As some genes has reverse orientation (negative strands in gff file), there data needs to be reversed accordingly before plotting. Correct me if I am wrong.

ADD REPLY
0
Entering edit mode

Tools like computeMatrix in deepTools handle such things automatically (deepTools uses the pyBigWig library under the hood).

ADD REPLY
1
Entering edit mode

Ohh... thanks I have used it, but completely forgotten.

ADD REPLY
0
Entering edit mode

Smooth as hell :) Great work Devon!

ADD REPLY
1
Entering edit mode

Are you done with the thesis?

ADD REPLY
1
Entering edit mode

I was going to yell that just now when I saw him walk into a talk :)

ADD REPLY
1
Entering edit mode

Looks like @John wants to stay a student .. forever.

Not smooth, @John :)

ADD REPLY
0
Entering edit mode

Hehehe, hasn't there been enough bioinformatician-abuse for one day? :P

For the record, my answer would have been "uhhh..."

ADD REPLY
1
Entering edit mode

You're a grad student, there's never enough grad student abuse :)

ADD REPLY
0
Entering edit mode

Hi Devon, Thank you very much for this script. I added three lines where you have the comment, 'Do something with the values':

ATG =  cols[3]           # get the identifier from the bed file 
span = sum(vals)         # sum all the values in vals
print(ATG,"\t",span,"\n")

This prints out a list of ATG numbers and a number. For example,

AT5G27350 6375.189992427826

However, some ATG numbers have 'nan' instead of a number, such as :

AT5G64667 nan

I was wondering if the reason why 'nan' is printed is because that in the list there may be 1 or 2 nan's, so the sum for the whole list is 'nan' ?
When I put 'nan' in a list I created and try to sum all value in this list with sum(mylist), I get an error, so I don't know how 'nan' is working in the bw.values() function.
If there is just a single 'nan' in the vals, then does that mean a sum of the vals would be 'nan' ? If this is true, is there a way to replace the 'nan' with a 0 ?

ADD REPLY
0
Entering edit mode

You'll want np.nansum().

ADD REPLY
0
Entering edit mode

Great ! Thank you very much. Works very well. Numbers for all the ATGs (genes).

ADD REPLY
0
Entering edit mode

Hi, I have been struggling to find a way to extract signal from a bw file. And this code is perfect for what I want to do. However, I do not know python. I would like to do this on Linux terminal.

Could you share the code that replicates the above in linux?

Thanks

ADD REPLY
0
Entering edit mode

Type that into a file, pip install --user pyBigWig and then execute the file you saved.

ADD REPLY
2
Entering edit mode
8.1 years ago

Altrenatively, use bigWigToBedGraph from UCSC genome browser at http://hgdownload.soe.ucsc.edu/admin/exe/, just pick the compiled version that suites you. Usage:

bigWigToBedGraph - Convert from bigWig to bedGraph format.
usage:
   bigWigToBedGraph in.bigWig out.bedGraph
options:
   -chrom=chr1 - if set restrict output to given chromosome
   -start=N - if set, restrict output to only that over start
   -end=N - if set, restict output to only that under end
   -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
ADD COMMENT
0
Entering edit mode

In bedgraph format, the fourth column is == to the signal?

ADD REPLY
0
Entering edit mode

Yes, it's whatever signal is stored in the bigWig file.

ADD REPLY
0
Entering edit mode
5.9 years ago
chechu086 • 0

You also can use the BigWigAverageOverBed program from UCSC. The usage: bigWigAverageOverBed in.bw in.bed out.tab

ADD COMMENT

Login before adding your answer.

Traffic: 2303 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6