I am using the algorithm CookHLA for my research. As part of its preparation, I need to feed it a bed file representing at least 100 of my samples.
I have made the bed files for 500 samples using samtools and bedtools in a pipeline:
samtools view -@ CPUs -b --reference "$REF_FILE" "${FILE_NAME}.cram" chr6:29000000-34000000 | bedtools bamtobed -i stdin > "${FILE_NAME}.bed"
I then sorted these bed files with the following:
for FILE in *.bed; do
    echo "$FILE"
    # Strip the trailing ".bed" extension to get the sample name
    FILE_NAME="${FILE%.bed}"
    bedtools sort -i "$FILE" > "$FILE_NAME.sorted.bed"
done
Now I want to merge all these sorted bed files into one. I have looked into the mergeBed and intersect tools offered by bedtools. When I run them, I unfortunately get the following error:
Error: Sorted input specified, but the file cat_beds.bed has the following out of order record
chr6 28999852 29000003 A00266:357:HFKFMDSXY:4:1449:1127:5791/1 60
This record starts at 28999852, which is before the lower bound of 29000000 that I specified through samtools. Does anyone have advice?
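One way to test that record against the requested region is sketched below. It rests on an assumption worth checking against the samtools documentation: a region query returns every alignment that overlaps the interval, not only alignments fully contained in it. The variable names are made up for illustration, and the check treats BED coordinates as 0-based, half-open and the samtools region as 1-based, inclusive.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Region passed to samtools (1-based, inclusive); variable names are hypothetical.
REGION_START=29000000
REGION_END=34000000

# The record from the error message, tab-separated (BED: 0-based start, exclusive end).
RECORD=$(printf 'chr6\t28999852\t29000003\tA00266:357:HFKFMDSXY:4:1449:1127:5791/1\t60')

# A BED interval [start, end) covers 1-based positions start+1 .. end,
# so it overlaps the region when start+1 <= REGION_END and end >= REGION_START.
RESULT=$(printf '%s\n' "$RECORD" |
    awk -v s="$REGION_START" -v e="$REGION_END" 'BEGIN { FS = OFS = "\t" }
        { status = (($2 + 1) <= e && $3 >= s) ? "overlaps" : "outside"
          print $1, $2, $3, status }')

echo "$RESULT"
```

If the record is reported as overlapping, it ends inside the region even though it starts a little upstream of it, which may be why it appears in the output at all.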
As an addendum: I got that error from a pipeline in which I used cat to combine the individually sorted bed files into cat_beds.bed and then ran mergeBed on the result.
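For reference, the concatenation step can be reproduced with a small self-contained sketch. The sample file names and the first record are made up; only the second record comes from the error message. The sketch assumes that concatenating individually sorted files does not give a globally sorted file, so it re-sorts the combined file by chromosome and start before merging. The bedtools call only runs if bedtools is on the PATH.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory with two tiny, individually sorted bed files.
WORKDIR=$(mktemp -d)
printf 'chr6\t29000010\t29000160\tread_from_sampleA\t60\n' > "$WORKDIR/sampleA.sorted.bed"
printf 'chr6\t28999852\t29000003\tA00266:357:HFKFMDSXY:4:1449:1127:5791/1\t60\n' > "$WORKDIR/sampleB.sorted.bed"

# Plain concatenation keeps each file's internal order, so the combined
# file can have a later file's records starting before an earlier file's.
cat "$WORKDIR"/*.sorted.bed > "$WORKDIR/cat_beds.bed"

# Re-sort the combined file: chromosome first, then numeric start position.
LC_ALL=C sort -k1,1 -k2,2n "$WORKDIR/cat_beds.bed" > "$WORKDIR/cat_beds.sorted.bed"

# Merge overlapping intervals, but only when bedtools is actually installed.
if command -v bedtools >/dev/null 2>&1; then
    bedtools merge -i "$WORKDIR/cat_beds.sorted.bed" > "$WORKDIR/merged.bed"
fi
```

Running bedtools sort on cat_beds.bed instead of the coreutils sort should work as well; the point is only that the sort happens after the files are combined, not before.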