I have two BED files that I need to merge. I did so with the following command, and it worked fine:
bedtools merge -c 1 -o count -i ~/Temp_output/peak_count_analysis/Klf3/MACS2/Klf3_ChIP_summits200.bed -i ~/Temp_output/peak_count_analysis/Klf1/MACS2/Klf1_K1ER_pool_summits200.bed > ~/Temp_output/peak_count_analysis/merged/mergedsummits200.bed
However, this loses a lot of information that is useful for downstream analysis.
Is there a way to merge the files and keep the information for those regions? If so, can the merged regions combine that information into a single column?
Example:
File 1:
chr1 5251857 5252058 Klf1_K1ER_pool_peak_1 13.22945
chr1 9770501 9770702 Klf1_K1ER_pool_peak_2 6.61350
chr1 9773611 9773812 Klf1_K1ER_pool_peak_3 2.72345
chr1 9774350 9774551 Klf1_K1ER_pool_peak_4 40.70829
chr1 9815269 9815470 Klf1_K1ER_pool_peak_5 22.47497
...
File 2:
chr1 6204622 6204823 Klf3_ChIP_peak_1 0.88333
chr1 7078830 7079031 Klf3_ChIP_peak_2 19.91139
chr1 7388243 7388444 Klf3_ChIP_peak_3 15.39874
chr1 9690724 9690925 Klf3_ChIP_peak_4 7.17301
chr1 9738376 9738577 Klf3_ChIP_peak_5 8.30267
...
Desired output:
chr1 6204622 6204823 2 Klf3_ChIP_peak_1; Klf1_K1ER_pool_peak_5
chr1 9770501 9770702 1 Klf1_K1ER_pool_peak_2
...
I eventually need to convert the merged file to GTF and would like to retain the peak information.
Thanks in advance.
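For illustration, here is a rough sketch of the kind of output described above: concatenate both files, sort them, then merge overlapping regions while counting and collapsing the peak names into one column. The file names, coordinates, and peak names below are made up for the example. The awk step is only a bedtools-free stand-in so the sketch runs anywhere; with bedtools installed, the commented `bedtools merge` call with `-c 4,4 -o count,collapse` is assumed to do the same job.

```shell
#!/bin/sh
# Toy, hypothetical inputs (tab-separated BED5); not the real peak files.
tmp=$(mktemp -d)
printf 'chr1\t100\t300\tKlf1_peak_1\t13.2\nchr1\t900\t1100\tKlf1_peak_2\t6.6\n' > "$tmp/klf1.bed"
printf 'chr1\t200\t400\tKlf3_peak_1\t0.9\nchr1\t5000\t5200\tKlf3_peak_2\t19.9\n' > "$tmp/klf3.bed"

# Concatenate both files, then sort by chromosome and numeric start.
cat "$tmp/klf1.bed" "$tmp/klf3.bed" | sort -k1,1 -k2,2n > "$tmp/combined.sorted.bed"

# With bedtools installed, the merge step is assumed to be:
#   bedtools merge -i combined.sorted.bed -c 4,4 -o count,collapse -delim ";"
# The awk below is a stand-in that merges overlapping or book-ended
# intervals and reports: chrom, start, end, peak count, peak names.
awk 'BEGIN { FS = OFS = "\t" }
function emit() { if (n) print chrom, start, end, n, names }
{
  if (n && $1 == chrom && $2 + 0 <= end) {
    # Same region: extend the end if needed and append the peak name.
    if ($3 + 0 > end) end = $3 + 0
    n++
    names = names ";" $4
  } else {
    # New region: print the previous one and start over.
    emit()
    chrom = $1; start = $2; end = $3 + 0; n = 1; names = $4
  }
}
END { emit() }' "$tmp/combined.sorted.bed" > "$tmp/merged.bed"

cat "$tmp/merged.bed"
```

On the toy input, the first two peaks overlap and come out as a single line with a count of 2 and both names joined by `;`, while the other two peaks stay as separate lines with a count of 1. The scores in column 5 could presumably be carried along the same way by adding that column to the `-c` list.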
Ah, that's what I thought I might need to do, but I wasn't sure how to do it. Thank you.
OK, I see what you are doing there.
Hi again,
For some reason, when I sort it with the options you gave (specifically -k2,2n), bedtools merge cannot open the file.
I don't understand this.
To actually get it sorted in the correct order I need the options -k1,1V -k2,2n (your options sort Chr 1, Chr 10, Chr 11, ...). If I sort with just -k1,1V, bedtools can open the file, but it is not sorted properly by start.
Annoyingly, once I get it sorted properly, bedtools won't open it, and I can't tell what is wrong with it in order to fix it!
Any help is greatly appreciated.
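A small diagnostic sketch for this situation, in case it helps narrow things down. It does not claim to know the cause; it only sorts a toy file with -k1,1V -k2,2n into a separate output file and runs a few basic checks on the result before it would be handed to bedtools. The file names and contents are hypothetical, and -V is assumed to be available (it is in GNU sort).

```shell
#!/bin/sh
# Hypothetical toy input with chromosomes and starts out of order.
tmp=$(mktemp -d)
printf 'chr10\t500\t700\tpeak_c\nchr2\t900\t1100\tpeak_b\nchr1\t300\t500\tpeak_a2\nchr1\t100\t300\tpeak_a1\n' > "$tmp/combined.bed"

# Version-sort the chromosome column, then sort starts numerically.
# Write to a NEW file: redirecting onto the input file would truncate
# it before sort gets to read it.
sort -k1,1V -k2,2n "$tmp/combined.bed" > "$tmp/combined.sorted.bed"

# Check 1: the sorted file exists and is not empty.
if [ -s "$tmp/combined.sorted.bed" ]; then
  echo "sorted file is non-empty"
else
  echo "sorted file is missing or empty"
fi

# Check 2: count lines that have fewer than 3 tab-separated columns
# or a non-integer start/end.
bad=$(awk -F '\t' 'NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { c++ }
                   END { print c + 0 }' "$tmp/combined.sorted.bed")
echo "malformed lines: $bad"

# Check 3: show the chromosome order that sort produced.
cut -f1 "$tmp/combined.sorted.bed" | uniq
```

If the sorted file turns out to be empty or the malformed-line count is not zero, that would point at the sort or redirection step rather than at bedtools itself.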