I am trying to use bedtools merge (which I have used many times before) to merge overlapping elements in a BED file, and I am getting the error:
"ERROR: file has non positional records, which are only valid for the groupBy tool."
I have:
- sorted the bed file by chromosome and start position
- checked the chromosome column for any non-standard notation
- checked the start and end columns for any non-numerical characters
- checked that the start values are all smaller than the end values
None of these seems to be the problem. The file has almost 2 million lines, so checking it manually is not an option.
Any suggestions for how to remedy this issue would be greatly appreciated.
Here is the head of the file:
chr1 11874 12227 DDX11L1 NR_046018.2 +
chr1 12613 12721 DDX11L1 NR_046018.2 +
chr1 13221 14409 DDX11L1 NR_046018.2 +
chr1 14362 14829 WASH7P NR_024540.1 -
chr1 14970 15038 WASH7P NR_024540.1 -
chr1 15796 15947 WASH7P NR_024540.1 -
chr1 16607 16765 WASH7P NR_024540.1 -
chr1 16858 17055 WASH7P NR_024540.1 -
chr1 17233 17368 WASH7P NR_024540.1 -
According to the code on GitHub, that error gets triggered by one of two checks: "Enforce integer coordinates" or "Enforce tab-separated files". Maybe the file(s) are not tab-delimited, or there is a weird coordinate in there.
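With almost 2 million lines, both conditions are easier to check with a script than by eye. The following is a minimal sketch, not bedtools' own validation logic: the function names `check_bed_line` and `find_bad_lines` are made up for illustration, and the rule applied (at least three tab-separated fields, with start and end consisting of digits only) is an assumption about what a strict parser would want.

```python
import re

# Digits only: no spaces, signs, or decimal points allowed.
_INT = re.compile(r"[0-9]+")


def check_bed_line(line):
    """Return a description of the problem with one BED line, or None if it looks fine."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 3:
        return "fewer than 3 tab-separated fields"
    for name, value in (("start", fields[1]), ("end", fields[2])):
        if not _INT.fullmatch(value):
            # repr() makes hidden whitespace visible, e.g. ' 11874'
            return f"{name} is not a plain integer: {value!r}"
    return None


def find_bad_lines(path, limit=10):
    """Collect up to `limit` (line_number, problem, raw_line) tuples from a BED file."""
    bad = []
    with open(path, encoding="utf-8", newline="") as handle:
        for number, line in enumerate(handle, start=1):
            # Skip comment and header lines that some BED files carry.
            if line.startswith(("#", "track", "browser")):
                continue
            problem = check_bed_line(line)
            if problem:
                bad.append((number, problem, line))
                if len(bad) >= limit:
                    break
    return bad
```

Calling `find_bad_lines("input.bed")` (the filename is a placeholder) returns the first few suspicious lines together with their line numbers. Printing the offending value with `repr()` is what makes a leading space or a stray carriage return show up.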
Thank you for the help. It turns out that the Python code generating the input file was adding a space in front of each integer (in addition to the tab separators).
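The generating script is not shown here, so the following is only a sketch of one way to avoid this kind of stray whitespace when writing BED lines from Python. The helper name `format_bed_line` is hypothetical. A common way to end up with extra spaces is something like `print(chrom, "\t", start)`, because `print` puts its default separator (a single space) between every argument.

```python
def format_bed_line(chrom, start, end, *extra):
    """Build one tab-delimited BED line with no stray whitespace around any field."""
    # int() accepts strings with surrounding whitespace, so " 11874" becomes 11874.
    fields = [chrom, int(start), int(end), *extra]
    # Join with a bare tab; strip each field in case a name carries padding.
    return "\t".join(str(field).strip() for field in fields)
```

For example, `format_bed_line("chr1", " 11874", 12227, "DDX11L1", "NR_046018.2", "+")` gives a line with exactly one tab between fields and nothing else. Write it out with `handle.write(line + "\n")` rather than passing the fields to `print` separately.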
Sometimes it is easiest just to subset the file until you find the offending line.
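Subsetting can be done systematically by halving the file each time, which takes about 21 rounds for 2 million lines. Below is a rough sketch of that idea. The function `find_offending_line` and the `fails` callback are hypothetical names, and the approach assumes that the full input fails and that any contiguous chunk containing the bad line also fails.

```python
def find_offending_line(lines, fails):
    """Return the 0-based index of one line that makes `fails` return True.

    `fails` takes a list of lines and returns True if the tool rejects them.
    """
    lo, hi = 0, len(lines)  # a bad line is assumed to be in lines[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(lines[lo:mid]):
            hi = mid  # the first half still triggers the error
        else:
            lo = mid  # so the problem must be in the second half
    return lo
```

In practice `fails` could write the chunk to a temporary file, run `bedtools merge -i` on it, and report whether the command errored; how to detect the error (exit status or message on stderr) is left as an assumption to verify. Because each chunk is a contiguous slice of a sorted file, it stays sorted. If every line is malformed, as turned out to be the case here, the search simply lands on the first line.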