How to correctly use bedtools merge?
22 months ago
Amisha • 0

I have 10 BED files downloaded from the GEO database. I want to use bedtools merge to combine overlapping or "book-ended" features in an interval file into a single feature. However, the output file is only 4 KB, whereas the 10 input files together were approximately 1.6 GB. Am I using bedtools merge correctly, or is this an error?

bedtools • 1.2k views

Sorry, but this is impossible to answer without looking at your data. If you have covered virtually all parts of the genome, you could end up with features as big as a chromosome and thus very few resulting features to write to the output.
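To illustrate why the number of features can drop so sharply, here is a minimal awk sketch of the merging rule on toy data. It is not the bedtools implementation, only an imitation of the documented behaviour (overlapping or book-ended intervals on the same chromosome become one), and the file names `toy_sorted.bed` and `toy_merged.bed` are made up for the example. Like bedtools merge, it assumes the input is already sorted by chromosome and then by start position.

```shell
# Toy input (hypothetical), already sorted by chromosome, then start.
printf 'chr1\t0\t100\nchr1\t100\t200\nchr1\t150\t300\nchr1\t500\t600\n' > toy_sorted.bed

# Keep one "open" interval (c, s, e). Extend it while the next interval
# starts at or before its end; otherwise print it and open a new one.
awk 'BEGIN { FS = OFS = "\t" }
NR == 1 { c = $1; s = $2 + 0; e = $3 + 0; next }
$1 == c && $2 + 0 <= e { if ($3 + 0 > e) e = $3 + 0; next }
{ print c, s, e; c = $1; s = $2 + 0; e = $3 + 0 }
END { if (NR > 0) print c, s, e }' toy_sorted.bed > toy_merged.bed

cat toy_merged.bed
```

Four input intervals collapse into two here. With 10 files of densely overlapping peaks, long chains of such overlaps can plausibly reduce millions of lines to a handful.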

To test, you could run it with the parameters -c 4 -o collapse, which concatenates all feature names that have been merged into one output. This allows you to see which features have been merged into one (supposing that you have the feature names in the 4th column of your bed files).
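Before relying on column 4, it may help to check how many columns the combined file actually has. The sketch below tallies lines by column count using awk on a toy file; `toy_cols.bed` and `column_counts.txt` are hypothetical names, and the toy data deliberately mixes 3-column and 4-column lines.

```shell
# Toy input (hypothetical): one 3-column line and one 4-column line.
printf 'chr1\t0\t100\nchr1\t50\t200\tpeakA\n' > toy_cols.bed

# Count how many lines have each number of tab-separated columns.
awk -F'\t' '{ n[NF]++ } END { for (k in n) print k " columns: " n[k] " lines" }' \
    toy_cols.bed | sort -n > column_counts.txt

cat column_counts.txt
```

If some lines report only 3 columns, those lines carry no name for `-c 4 -o collapse` to operate on, so a name column would have to be added first.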


This is what my data looks like after using the cat command to combine all 10 files


To learn which features have been merged, create a sorted version of your file in which every feature has its own name (e.g. p53_243240):

sort -k1,1 -k2,2n < merge_p53.bed | awk 'BEGIN{OFS="\t"}{print $0,"p53_"NR}' > named.bed

Now merge this file version as described before and the names of all combined features will be retained, so you can check which mergers have been performed:

bedtools merge -c 4 -o collapse -i named.bed > final.bed
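One possible way to summarise the collapsed output is to report, for each merged interval, its width and the number of names it absorbed, largest groups first. This is only a sketch on toy data: `final_toy.bed` and `merge_counts.txt` are hypothetical names, and the toy lines assume the comma-separated name list that `-o collapse` writes into column 4.

```shell
# Toy stand-in (hypothetical) for the collapsed merge output.
printf 'chr1\t100\t900\tp53_1,p53_2,p53_3\nchr1\t1200\t1300\tp53_4\n' > final_toy.bed

# Output columns: chrom, start, end, width, number of merged features.
awk 'BEGIN { FS = OFS = "\t" }
{ n = split($4, names, ","); print $1, $2, $3, $3 - $2, n }' final_toy.bed \
    | sort -k5,5nr > merge_counts.txt

cat merge_counts.txt
```

Very wide intervals with very large counts at the top of this list would indicate that the small output is a genuine result of dense overlap rather than an error. If only the counts are needed, `bedtools merge` also accepts `-o count` in place of `-o collapse`.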
