3.6 years ago
a_bis
Hi everyone,
I've been trying to convert a .wig file to a .bed file using bedops' wig2bed
function and I get the following error:
Row begins with a tab or space at line 1 in -.
I have tried running it with and without the --zero-indexed option, and also tried running convert2bed instead, but the error message remains unchanged. I'm not sure what the error message refers to, as line 1, as far as I can see, doesn't begin with a tab or space:
Is there anything apparently wrong with the format of the .wig file that's leading to this error message, and can you suggest a way to fix it? Thank you!
Can you please cut and paste the output of:
Replace in.wig with the filename of your wig file.
The output looks as follows:
Also, thank you for your second suggestion about --multisplit. Is there a way to tell if a wig file contains multiple sections? (This one was generated by averaging three bigwig files using WiggleTools -- I'm not sure what sections it would be split into.) Thanks!
If a wig file contains multiple sections, it will have multiple track and/or other header lines. You could count how many sections there are minimally via the following command or similar:
If you get a value greater than one, you may have multiple sections, and you could investigate the file with a text editor like emacs or vi, etc. to confirm.
To get back to your original question, the above does not look like a wig ("wiggle") file, but a bedGraph file.
It appears to be missing track and other header lines that specify whether it is variable or fixed width (for example):
• https://genome.ucsc.edu/goldenPath/help/wiggle.html
• https://genome.ucsc.edu/goldenPath/help/bedgraph.html
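As a rough way to check both points (whether any header lines are present at all, and how many sections a file might contain), the header lines can be counted with grep. This is only a sketch: count_wig_headers is a hypothetical helper name, and it assumes header lines start at the first column with track, variableStep or fixedStep, as in the UCSC wiggle description. Because a declaration line can appear once per chromosome or block, the declaration count may be higher than the number of sections.

```shell
# Hypothetical helper: count "track" lines and step declaration lines in a file.
# Usage: count_wig_headers in.wig
count_wig_headers() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function going in that case.
    tracks=$(grep -c '^track' "$1" || true)
    decls=$(grep -c -E '^(variableStep|fixedStep)' "$1" || true)
    printf 'track\t%s\ndeclaration\t%s\n' "$tracks" "$decls"
}
```

If both counts come back as zero, the file has no wig headers and is more likely bedGraph-style data.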
To convert bedGraph to sorted, five-column BED format, you can just add a placeholder in the fourth column:
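A minimal sketch of that step, assuming a four-column bedGraph (chromosome, start, end, score) and standard awk and sort. The function name bedgraph_to_bed5 and the "." placeholder are illustrative choices, not necessarily the exact command originally posted; BEDOPS sort-bed could be used in place of sort if it is installed.

```shell
# Hypothetical helper: convert 4-column bedGraph to sorted 5-column BED.
# Usage: bedgraph_to_bed5 in.bedgraph > out.bed
bedgraph_to_bed5() {
    # Skip track/browser/comment lines, then insert "." as a placeholder ID
    # in the fourth column so the score moves to the fifth column.
    awk 'BEGIN { OFS = "\t" }
         !/^(track|browser|#)/ && NF >= 4 { print $1, $2, $3, ".", $4 }' "$1" \
        | LC_ALL=C sort -k1,1 -k2,2n -k3,3n
}
```

LC_ALL=C makes sort order the chromosome names bytewise (chr1, chr10, chr2, ...).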
Tangentially, this might be related to your problem, but there is a bug in the UCSC toolkit, where a bedGraph file will be written into a bigWig file as-is, and not first converted to wig format:
• http://genome.ucsc.edu/goldenPath/help/bigWig.html#optional
I don't know if you are perhaps starting with a bigWig file. If so and if it contains a bedGraph file, and if you use bigWigToWig, it will not create a wig file as output, but a bedGraph file.
UCSC choosing to ignore its own specifications is a bit frustrating. This might not be the issue you're running into; I'm only mentioning it here in case this might be the real cause.
Thank you very much for the advice! I'm not quite sure what did it in the end, but I tried the following on my WiggleTools-generated .wig file:
and the code ran successfully. That is, until the chromosome names started not being recognised and I started getting
messages. I eventually changed all instances of "MT" to "chrM", "X" to "chrX", "Y" to "chrY" and all the "weird" chromosomes such as "GL456213.1" to "chrGL456213" using sed. This did the trick and wig2bed now seems to have worked on the entire file.
Thanks for reporting back on the fix.
This tool is fairly old and was written to UCSC's specification for chromosome names, which historically are prefixed with chr.
I'll add an issue ticket to the GitHub site referencing this. I can't imagine this being too difficult to generalize for any chromosome name scheme.
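For anyone on an older release, the renaming workaround described above can be sketched with sed. This assumes the chromosome name is the first whitespace-delimited column of each line (bedGraph-style data, as in this thread); add_chr_prefix is a hypothetical helper name, and the GL pattern is a generalization of the single "GL456213.1" example rather than the exact expressions used by the original poster.

```shell
# Hypothetical helper: rewrite Ensembl-style names to UCSC-style names.
# Usage: add_chr_prefix in.bedgraph > renamed.bedgraph
add_chr_prefix() {
    # Each expression is anchored at the start of the line and requires
    # whitespace after the name, so only the first column is rewritten.
    sed -E \
        -e 's/^MT([[:space:]])/chrM\1/' \
        -e 's/^(X|Y)([[:space:]])/chr\1\2/' \
        -e 's/^(GL[0-9]+)\.[0-9]+([[:space:]])/chr\1\2/' \
        "$1"
}
```

Lines whose first column already starts with chr are left unchanged.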
Thank you for all your help!
Fixed in the v2p4p40 branch: https://github.com/bedops/bedops/commit/d9776fdd215e8264b689081c4c7c98482f02b3e2
Should be pushed to production in a week or so.
Also, if your wig file contains multiple sections, you may want to add the --multisplit foo option.
See: https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/wig2bed.html