6.4 years ago
marina.v.yurieva
I'm trying to compress my bedGraph file because of the UCSC file size limit. I tried using bedtools merge to collapse it on the score column, but it merges too much (if the features are less than 1 bp apart, it merges them together, so in a peak region it merges the whole peak). Is there a tool that collapses every n bp in a BED file and calculates the mean of the score (column 4)? From my bedGraph:
chr1 11585 11587 0.00465
chr1 11587 11592 0.0062
chr1 11592 11615 0.00775
chr1 11615 11631 0.0093
chr1 11631 11642 0.01085
chr1 11642 11656 0.0124
chr1 11656 11667 0.01395
If I want to collapse every 30 bps, my output would be:
chr1 11585 11615 0.00775
chr1 11615 11645 0.010075
chr1 11645 11667 0.013175
I know that I can convert bedGraph to bigWig but I'd like to keep bedGraph format and just decrease its resolution if it's possible.
I think you have two options: bamCoverage with the -bs parameter, which lets you set the bin size over which the reads are aggregated for your bedGraph, or gzip to reduce its size.
I tried bamCoverage. It worked really well, I only had to play with the bin sizes a few times. Thank you!
Why would you like to keep it in bedGraph format? That's a really annoying format to deal with if all you want is to visualize data.
I don't really have a free host website to share the files on, so it's much easier to upload it to GB...
Do you rely on UCSC? There are alternatives like IGV, which can read bigWigs from disk.
Right, but I need to upload the data and share it with my boss, and UCSC is the easiest way to do that. If it were just for myself, it wouldn't have been a problem...
Do you need bins of equal size for some subsequent analysis, or only for compressing your file?
No, I just need the bins for compression.
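If you want to stay with an existing bedGraph and don't have the BAM at hand, the fixed-width collapsing asked about in the question can also be sketched in a few lines of Python. This is only a rough sketch under some assumptions that are not from the thread: the input is sorted by chromosome and start, bins are anchored at the first interval of each chromosome (as in the example in the question), and the mean is weighted by how many bases each interval contributes to the bin. The function names `parse_bedgraph` and `rebin` are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, int, int, float]  # chrom, start, end, score


def parse_bedgraph(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (chrom, start, end, score) from bedGraph text lines."""
    for line in lines:
        # Skip blank lines and track/browser/comment headers.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, score = line.split()[:4]
        yield chrom, int(start), int(end), float(score)


def rebin(records: Iterable[Record], bin_size: int) -> Iterator[Record]:
    """Collapse sorted bedGraph records into bins of bin_size bp.

    Assumption: each bin's score is the mean over the covered bases,
    weighted by overlap length. Bases not covered by any interval are
    ignored, and bins without any covered base are not emitted.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    chrom = None
    bin_start = 0
    weighted = 0.0  # sum of score * overlapping bases in the current bin
    covered = 0     # number of covered bases in the current bin
    last_end = 0    # end of the most recent record, used to trim the last bin
    for c, start, end, score in records:
        if c != chrom:
            # New chromosome: flush the open bin and restart the bin grid.
            if covered:
                yield chrom, bin_start, min(bin_start + bin_size, last_end), weighted / covered
            chrom, bin_start, weighted, covered = c, start, 0.0, 0
        pos = start
        while pos < end:
            if pos >= bin_start + bin_size:
                # Current bin is complete: emit it and jump to the bin holding pos.
                if covered:
                    yield chrom, bin_start, bin_start + bin_size, weighted / covered
                bin_start += ((pos - bin_start) // bin_size) * bin_size
                weighted, covered = 0.0, 0
            piece_end = min(end, bin_start + bin_size)
            weighted += score * (piece_end - pos)
            covered += piece_end - pos
            pos = piece_end
        last_end = end
    if covered:
        yield chrom, bin_start, min(bin_start + bin_size, last_end), weighted / covered


if __name__ == "__main__":
    sample = """chr1 11585 11587 0.00465
chr1 11587 11592 0.0062
chr1 11592 11615 0.00775
chr1 11615 11631 0.0093
chr1 11631 11642 0.01085
chr1 11642 11656 0.0124
chr1 11656 11667 0.01395"""
    for chrom, start, end, score in rebin(parse_bedgraph(sample.splitlines()), 30):
        print(f"{chrom}\t{start}\t{end}\t{score:.6g}")
```

On the seven lines from the question with a bin size of 30, this gives the same three intervals (11585-11615, 11615-11645, 11645-11667). Because the mean here is weighted by length, the first two scores come out slightly different from the hand-computed values in the question. If a plain per-interval mean is what you want, replace the overlap length with 1 when accumulating.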