Hi,
I am trying to find the consensus peaks across 8 samples of ChIP-seq peak data. The files are in BED format. Finding the overlapping peak coordinates is not a problem, thanks to the bedtools multiinter function. However, when it finds the overlaps across the 8 samples, the peak values are dropped. I would like to normalise or average the peak values across the samples for each overlapping peak.
I am studying the effect of transcription factors on chromosomal rearrangements, which is why I need those TF binding coordinates.
PS: I have also tried other overlap tools such as HOMER and bedops. None of them seems to have that value normalisation feature. Surely this cannot be a coincidence?
Thank you very much for your help.
Best regards
Tunc
Alex, thank you for your reply. I tried
bedmap --mean
and similar calculations, but I couldn't figure out how to do that for multiple files. Could you help me out with that, please?
You could use
bedops --everything
or
bedops -u
to do a multiset union of regions with signal of interest. Then pass this to
bedmap --mean
as the map file, e.g.:
Dear Alex, just to confirm the code:
Let's say I have 3 samples:
sample1
sample2
sample3
The output of
$ bedops -u * > signals.bed
is signals.bed.
Let's say regions.bed holds the overlapping regions from the previous runs, obviously with no peak values,
and
$ bedmap --echo --mean regions.bed signals.bed > result.bed
result.bed
So basically, it takes each peak location from regions.bed, takes the mean of the values of the overlapping signals.bed elements, and writes that mean as the value?
Sorry for asking at such length, but I wanted to be very sure about the output, because it will affect my project results deeply. And I apologise for my bad English.
Your three sample files are not true BED but bedGraph, because you have the signal in the fourth column and not the fifth. The BED specification puts the name in the fourth column and the score (signal) in the fifth. You can use awk to preprocess your samples into true BED, e.g.:
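The awk command from the original reply did not survive here. A minimal sketch of that kind of preprocessing, assuming a tab-separated four-column file with chrom, start, end and signal; the file names and the `id-N` placeholder names are made up for illustration:

```shell
# Hypothetical 4-column input: chrom, start, end, signal (tab-separated).
printf 'chr1\t100\t200\t5.5\nchr1\t300\t400\t2.0\n' > sample1.bedgraph

# Put a placeholder ID in column 4 and move the signal to column 5,
# which is where BED keeps the score.
awk -v OFS='\t' '{ print $1, $2, $3, "id-" NR, $4 }' sample1.bedgraph > sample1.bed

cat sample1.bed
```

If a file already has a name in column 4 and the signal in column 5, this step can be skipped. BEDOPS tools expect sorted input, so run the result through sort-bed if it is not sorted already.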
Then you can use the fixed file in a bedmap statement. It will work as you expect, taking the mean of the signal of "sample" elements that overlap each region.
The BEDOPS bedmap docs explain this in more detail and include an example that demonstrates its use with generic signal data (DNaseI data, but any numeric signal data works).
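To make the arithmetic concrete, here is a toy example that does not need BEDOPS installed. It is only a naive awk stand-in for the bedmap call discussed above, not bedmap itself, and the coordinates, peak names and values are invented. It assumes the signal is in column 5 and counts any overlap of at least one base:

```shell
# Hypothetical merged region (no signal) and pooled sample peaks (signal in column 5).
printf 'chr1\t100\t200\n' > regions.bed
printf 'chr1\t90\t150\tp1\t4\nchr1\t120\t210\tp2\t6\nchr1\t500\t600\tp3\t9\n' > signals.bed

# Naive stand-in for: bedmap --echo --mean regions.bed signals.bed
# Read all signal elements first, then for each region average column 5
# over every signal element on the same chromosome that overlaps it.
awk -v OFS='|' '
  NR == FNR { chr[NR] = $1; s[NR] = $2 + 0; e[NR] = $3 + 0; v[NR] = $5 + 0; n = NR; next }
  {
    sum = 0; cnt = 0
    for (i = 1; i <= n; i++)
      if (chr[i] == $1 && s[i] < $3 + 0 && $2 + 0 < e[i]) { sum += v[i]; cnt++ }
    print $1 "\t" $2 "\t" $3, (cnt ? sum / cnt : "NAN")
  }
' signals.bed regions.bed > result.bed

cat result.bed
```

Here p1 and p2 overlap the region and p3 does not, so the mean is (4 + 6) / 2 = 5. Note that the divisor is the number of overlapping elements, not the number of samples: a sample with no peak in the region contributes nothing rather than a zero, and a sample with two overlapping peaks contributes twice.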
Dear Alex, I did not explain my 4th column. It holds the unique peak names (like
peak_2495
). Thank you very much for your help. I owe you big time!
You're welcome!