The BEDOPS gff2bed script converts GFF3 files, not GTF files. I haven't verified that gff2bed can work on generic GTF files. While the two formats are related, they do differ, and you should use (for example) Bill Noble's gtf2bed conversion script to convert GTF files.
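One visible difference between the formats is the ninth column: GFF3 writes attributes as key=value pairs, while GTF writes them as key "value"; pairs that include a gene_id. If you are not sure which format you have, a rough check along these lines may help. This is only a heuristic sketch, not part of BEDOPS, and the function name guess_annotation_format is made up for illustration:

```shell
# Hypothetical helper (not a BEDOPS tool): guess whether an annotation file
# is GTF or GFF3 by inspecting column 9 of the first usable feature line.
guess_annotation_format() {
    awk -F '\t' '
        /^#/ || NF < 9 { next }                       # skip comments and short lines
        $9 ~ /gene_id "/ { print "GTF"; found = 1; exit }          # key "value"; style
        $9 ~ /(^|;)[A-Za-z_]+=/ { print "GFF3"; found = 1; exit }  # key=value style
        END { if (!found) print "unknown" }
    ' "$1"
}

# Usage: guess_annotation_format annotations.gtf
```

If it prints GTF, use a GTF converter rather than gff2bed.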
Are you running vcf2bed.py < input.vcf -> vcf.bed? That hyphen is probably a typo; I'm not sure what it would do. To ensure the BED output is sorted, you would want to run:

    vcf2bed.py < input.vcf | sort-bed - > vcf.bed
If you really want to skip sorting (let's say that your VCF elements are guaranteed to be lexicographically sorted), then use:

    vcf2bed.py < input.vcf > vcf.bed
Note that BEDOPS tools are not guaranteed to give a correct answer on unsorted BED inputs, so it is safer to run the BEDOPS sort-bed application first and prevent GIGO errors. Alternatively, add the --ec option when using bedmap or other BEDOPS applications, which enables error checking. It is usually much faster, though, to pre-sort the input with sort-bed: you only have to sort once, as BEDOPS apps read in and write out sorted data.
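If you want a quick check of whether a BED file is already in order before deciding to re-sort, something like the following may do. It is a sketch using the system sort rather than any BEDOPS tool, it assumes tab-delimited input and a sort that supports -s (GNU and BSD sort do), and it only approximates the ordering sort-bed produces as I understand it (chromosome lexicographic, then start and end numeric). The function name is_sorted_bed is hypothetical:

```shell
# Hypothetical helper (not a BEDOPS tool): succeed if the BED file is ordered
# by chromosome (byte-wise), then by start and end coordinates (numeric).
is_sorted_bed() {
    # -c checks order without writing output; -s turns off the whole-line
    # tie-break so rows with identical coordinates are not flagged.
    LC_ALL=C sort -c -s -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n "$1" 2>/dev/null
}

# Usage: is_sorted_bed vcf.bed && echo "sorted" || echo "run sort-bed first"
```

When in doubt, running sort-bed on the file is the more reliable choice.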
If you still get errors, feel free to post a transcript of what you're doing on the BEDOPS forum, along with links to your GTF and/or VCF inputs.
Also consider running your data through validation scripts — we've uncovered problems in the past with labs releasing data that do not follow specification, which requires extra pre-processing steps to clean up input before it is consumed by our scripts and applications. We've also uncovered problems with our assumptions about specifications and how that translates to a script or application. Biology is messy, but bioinformatics is messier.
OK, got it! The trick was to download a RefFlat-genes.bed file from UCSC, then sort it with sort-bed and use this in bedmap. Thanks so much for your help! G.
Since answering this question, we have added a GTF-to-BED conversion script to the BEDOPS suite: http://code.google.com/p/bedops/wiki/gtf2bed