Question

How to generate BedGraph from BedTools Coverage?

2

Entering edit mode

9.1 years ago

scchess ▴ 640

The tool is at: http://bedtools.readthedocs.org/en/latest/content/tools/coverage.html

I want to get base coverage for specific regions (I'll provide a bed file). However, the output is not in a standard BedGraph format that I can export into R for visualisation.

Q: How do I draw density plot of coverage over the regions from the outputs generated by the coverage tool?

Q: Alternatively, how to convert the non-standard outputs to a proper BedGraph? For example, the Sushi R package takes a bedgraph input file.

genomic bed bedtools • 16k views

ADD COMMENT • link updated 2.2 years ago by Ram 44k • written 9.1 years ago by scchess ▴ 640

1

Entering edit mode

9.1 years ago

Alex Reynolds 35k

Get your output into a standard five- or more columned BED file and then use awk:

$ awk '{ print $1"\t"$2"\t"$3"\t"$5 }' foo.bed > foo.bedgraph

ADD COMMENT • link updated 2.2 years ago by Ram 44k • written 9.1 years ago by Alex Reynolds 35k

0

Entering edit mode

@ Alex: How can I convert my .sam file to standard .bed file? I have tried sam2bed < input.sam > output.bed it gives me output file but the column 5 always contains the same value (which is 99). According to UCSC bed format this is score column and to me this look strange that for every region the score is same. Then by using your mentioned command I obtain the .bedgraph file which I later use for ChromImpute. Problem is that when ChromeImpute convert these raw signal into certain sized bin signal (25 bp) then for every region I get the 0 signal.. I have tried many ways (using Pyisco, make wig files, genomeCoverageBed etc..) but I am unable to solve this problem. Can you help me with this one? Thanks.

ADD REPLY • link 7.6 years ago by Bioinformatist Newbie ▴ 270

0

Entering edit mode

9.1 years ago

James Ashmore ★ 3.5k

To generate bedGraph output from Bedtools coverage command you need to specify the -counts flag, for example:

bedtools coverage -a sample.bed -b sample.bam -counts > sample.bedgraph

Strictly speaking each row of a bedGraph file contains the chromosome name, the start position, the end position and then some value, which in your case would be the coverage, therefore you will want to keep only these fields, so if your had a bed file with six columns, you would want to keep only the first 3 columns (chrom, start, end) and the last column (coverage):

bedtools coverage -a sample.bed -b sample.bam -counts | cut -f1,2,3,7 > sample.bedgraph

ADD COMMENT • link 9.1 years ago by James Ashmore ★ 3.5k

2

Entering edit mode

It's not really clear to me from the original question that this is what they want. This would give one value for each genome region. I suspect they want "wiggly" values for each genome region. But could be wrong.

ADD REPLY • link 9.1 years ago by Madelaine Gogol 5.3k

Ram · Accepted Answer · 2015-10-09

6

Entering edit mode

9.1 years ago

Joseph Pearson ▴ 480

You should also consider bedtools genomecov, with the -bg option (bg being bedgraph)

http://bedtools.readthedocs.org/en/latest/content/tools/genomecov.html

Edit: After reading comments, and after some experimenting, what I think you want to do is:

bedtools intersect -a myBam.bam -b myRegions.bed > intersected.bam
bedtools genomecov -trackline -bg -ibam intersected.bam > intersected.bedgraph

Bedtools intersect will give only bam reads that overlap your regions (in myRegions.bed)

Then genomecov will draw a bedgraph file (coverage) only for the read depth for your filtered regions.

ADD COMMENT • link updated 2.2 years ago by Ram 44k • written 9.1 years ago by Joseph Pearson ▴ 480

0

Entering edit mode

The problem is that this tool doesn't take a BED file filtering the regions that I'm interested. Therefore, it'd be too time consuming to calculate for all bases in the genome.

ADD REPLY • link 9.1 years ago by scchess ▴ 640

0

Entering edit mode

Fair enough, I didn't catch the subtlety in your question, but James Ashmore did.

ADD REPLY • link 9.1 years ago by Joseph Pearson ▴ 480

1

Entering edit mode

I think this might be what you want to do - genomecov and then pull out the regions you want using bedtools intersect. But kind of hard to tell.

ADD REPLY • link 9.1 years ago by Madelaine Gogol 5.3k

0

Entering edit mode

Thanks. I think your approach is the best.

ADD REPLY • link 9.1 years ago by scchess ▴ 640

0

Entering edit mode

Note: -abam is correct.

ADD REPLY • link 9.1 years ago by scchess ▴ 640