3.8 years ago
JustinZhang
I have about 50 BED files from different sources. They are all well-formatted and contain hg19 regions.
My question is: how can I merge them and annotate every resulting region with the number of files it appears in and the names of those files?
My expected output:
#Chr Start End Frequency SourceFiles
Chr1 123456 567891 8/50 1.bed,2.bed,3.bed...
or
#Chr Start End Frequency 1.bed 2.bed 3.bed ... 50.bed
Chr1 123456 567891 8/50 y n n .... y
I've tried bedtools, but the output is not what I want. Is there a practical way to do this?
Many thanks!
To beginners like me: please first understand the difference among bedtools multiinter (also known as multiIntersectBed), bedtools merge, and bedtools genomecov.
And never write a de novo shell/Python/Perl script before searching and asking enough, because most of our questions have surely been posed and figured out by others.
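To make clear what a multi-way intersection reports, here is a minimal Python sketch of the idea. It is not the bedtools implementation and not a replacement for it. The function names `multi_intersect` and `format_row`, and the three tiny example files, are hypothetical and made up for illustration. It assumes BED-style half-open intervals and splits each chromosome into segments, listing the files that cover each segment.

```python
from collections import defaultdict


def multi_intersect(beds):
    """Split the genome into segments and report which inputs cover each one.

    beds: dict mapping a file label to a list of (chrom, start, end) tuples,
          using BED-style half-open coordinates.
    Returns a list of (chrom, start, end, count, labels) tuples.
    Adjacent segments covered by the same files are not merged.
    """
    # Turn every interval into an "open" (+1) and a "close" (-1) event.
    events = defaultdict(list)
    for label, regions in beds.items():
        for chrom, start, end in regions:
            events[chrom].append((start, 1, label))
            events[chrom].append((end, -1, label))

    segments = []
    for chrom in sorted(events):
        depth = defaultdict(int)  # label -> number of currently open intervals
        prev = None
        for pos, delta, label in sorted(events[chrom]):
            # Files covering the stretch between the previous position and this one.
            active = sorted(name for name, d in depth.items() if d > 0)
            if prev is not None and pos > prev and active:
                segments.append((chrom, prev, pos, len(active), active))
            depth[label] += delta
            prev = pos
    return segments


def format_row(segment, total):
    """Format one segment like the first expected layout in the question."""
    chrom, start, end, count, labels = segment
    return f"{chrom}\t{start}\t{end}\t{count}/{total}\t{','.join(labels)}"


if __name__ == "__main__":
    # Hypothetical example data standing in for three small BED files.
    example = {
        "1.bed": [("chr1", 100, 200)],
        "2.bed": [("chr1", 150, 250)],
        "3.bed": [("chr1", 180, 300)],
    }
    for seg in multi_intersect(example):
        print(format_row(seg, total=len(example)))
```

In practice the same kind of table should come from bedtools multiinter run on sorted inputs, so the sketch is mainly useful for checking that its output matches what you expect.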