Hi everyone,
This is my first time posting here. I'm an undergrad who started learning these things only a month ago, so I'm really sorry if my question is naive or confusing.
I have multiple Bismark coverage files for transposons. The format looks like this (tab-delimited, 1-based genomic coordinates):
<chromosome> <start position> <end position> <methylation percentage> <count methylated> <count unmethylated>
One of my files (A1) looks like this:
18 3000087 3000087 50 1 1
18 3000093 3000093 50 1 1
18 3000104 3000104 0 0 1
18 3000163 3000163 0 0 1
However, because I sequenced the same sample on different lanes, one of my other files (A2) contains a line like this:
18 3000087 3000087 100 1 0
This line covers the same locus as the first line of A1. Because A1 and A2 come from the same sample sequenced on different lanes, I want to merge them into a final file (A) with the counts summed, something like this:
18 3000087 3000087 66.67 2 1
I'm not sure whether there's a way to do this from the terminal so that I can read the result into methylKit and continue my downstream analysis. If anyone has any suggestions, please let me know.
Thank you and have a great week!
bedtools merge: https://bedtools.readthedocs.io/en/latest/content/tools/merge.html
I have tried BEDtools before, but it merged both book-ended and overlapping loci, while I only need to merge overlapping loci. I wonder if there's a parameter I can add to turn off the book-ended merging; I have tried setting -d to 0, and it gave the result I just described.
Thank you for your quick response!
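One way to sidestep the book-ended merging altogether is to skip interval merging and instead group lines by their exact coordinates, summing the two count columns. Below is a minimal sketch of that idea in Python, not a tested pipeline. It assumes six whitespace-separated columns with no header line and uncompressed input; the function names `merge_coverage` and `format_rows` are hypothetical, and the percentage is recomputed as 100 × methylated / (methylated + unmethylated), which gives about 66.67 for counts of 2 and 1.

```python
import io


def merge_coverage(streams):
    """Sum methylated/unmethylated counts over lines sharing the exact
    same (chromosome, start, end). Adjacent positions are never joined."""
    totals = {}
    for stream in streams:
        for line in stream:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # The old percentage is ignored; it is recomputed from the sums.
            chrom, start, end, _pct, meth, unmeth = line.split()
            key = (chrom, int(start), int(end))
            m, u = totals.get(key, (0, 0))
            totals[key] = (m + int(meth), u + int(unmeth))

    rows = []
    # Chromosome names sort as strings here, positions as integers.
    for (chrom, start, end), (m, u) in sorted(totals.items()):
        pct = 100.0 * m / (m + u) if (m + u) else 0.0
        rows.append((chrom, start, end, pct, m, u))
    return rows


def format_rows(rows):
    """Render merged rows as tab-delimited coverage lines."""
    return [
        f"{chrom}\t{start}\t{end}\t{pct:.2f}\t{m}\t{u}"
        for chrom, start, end, pct, m, u in rows
    ]


# Example data copied from the post; real use would pass open file handles,
# e.g. open("A1.cov") and open("A2.cov") (hypothetical file names).
a1 = io.StringIO(
    "18\t3000087\t3000087\t50\t1\t1\n"
    "18\t3000093\t3000093\t50\t1\t1\n"
    "18\t3000104\t3000104\t0\t0\t1\n"
    "18\t3000163\t3000163\t0\t0\t1\n"
)
a2 = io.StringIO("18\t3000087\t3000087\t100\t1\t0\n")

for out_line in format_rows(merge_coverage([a1, a2])):
    print(out_line)
```

With the example lines from the post, position 3000087 comes out with counts 2 and 1, and the other three positions pass through unchanged. Because the key is the exact coordinate triple, neighbouring positions such as 3000087 and 3000088 would stay on separate lines.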