BedTools IntersectBed resulting in Segmentation Fault with BedFile
9.9 years ago
ali • 0

I have a VCF file (sample1.vcf) and I would like to remove a large list of sites contained in a BED file,

e.g.:

chr22    37536301    37536301
chr22    82722119    82722273
chr18    218879484    218879484
chr18    60949121    60949149
chr13    230905465    230905465

The command I run:

intersectBed -a sample1.vcf -b hg19.sorted.bed -v

results in

intersectBed: line 2: 21108 Segmentation fault      (core dumped) ${0%/*}/bedtools intersect "$@"

How can I fix this?

intersectbed vcf bed bedtools • 11k views

I hope you read through the rest of the segmentation fault posts on the site. There are issues with unsorted and unindexed files that are explained in many different posts.


Yes, I have; none of the suggested solutions work.


In the meantime, try vcftools --exclude-positions <bedfile> from here: http://vcftools.sourceforge.net/man_latest.html


thanks, working solution

9.9 years ago

If the segmentation fault is due to excessive memory requirements, sort your BED files by chromosome and position (with sort -k1,1 -k2,2n a.bed > sorted.bed) and then use the -sorted option of intersectBed. From the bedtools documentation:

If you are trying to intersect very large files and are having trouble with excessive memory usage, please presort your data by chromosome and then by start position (e.g., sort -k1,1 -k2,2n in.bed > in.sorted.bed for BED files) and then use the -sorted option. This invokes a memory-efficient algorithm designed for large files. This algorithm has been substantially improved in recent (>=2.18.0) releases.

Also, a quick check to see if your files are already sorted is to use the -c option in sort:

sort -c -k1,1 -k2,2n a.bed
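
The check and the sort above can be wrapped into two small helpers. This is only a sketch: the function names is_sorted_bed and sort_bed are made up for illustration, and forcing LC_ALL=C is my assumption so that chromosome names compare byte-wise regardless of locale.

```shell
# Hypothetical helper: succeeds (exit 0) when the file is already ordered
# by chromosome (column 1) and then numerically by start (column 2).
is_sorted_bed() {
    LC_ALL=C sort -c -k1,1 -k2,2n "$1" 2>/dev/null
}

# Hypothetical helper: write a sorted copy of the file to stdout.
sort_bed() {
    LC_ALL=C sort -k1,1 -k2,2n "$1"
}

# Example use (file names are placeholders):
#   is_sorted_bed a.bed || sort_bed a.bed > a.sorted.bed
```

Keep in mind that -sorted expects both inputs to follow the same chromosome order, so the VCF passed with -a needs to be ordered the same way as the BED file.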

For what it's worth, I just got around a bedtools intersect segmentation fault by upgrading bedtools.
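
To see whether an upgrade is worth trying, something like the following can compare the installed version against a minimum. It is a rough sketch: version_ge is a hypothetical helper, it relies on GNU sort -V, and the 2.18.0 threshold is simply the release mentioned in the documentation excerpt quoted in this thread, not a known fix version for this particular crash.

```shell
# Hypothetical helper: true when version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

if command -v bedtools >/dev/null 2>&1; then
    # bedtools --version prints a line such as "bedtools v2.30.0"
    installed=$(bedtools --version | sed 's/^bedtools v//')
    if version_ge "$installed" 2.18.0; then
        echo "bedtools $installed: at or above 2.18.0"
    else
        echo "bedtools $installed: older than 2.18.0, consider upgrading"
    fi
else
    echo "bedtools not found on PATH"
fi
```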

9.9 years ago

One alternative, using BEDOPS with bash process substitution:

bedops --not-element-of -1 <(convert2bed -i vcf < sample1.vcf) hg19.sorted.bed > answer.bed

The convert2bed tool applies sort-bed ordering on the converted VCF.

If you don't know the sort status of hg19.sorted.bed (in spite of its name), then you can do similar redirection with that second input to make sure it is sorted:

bedops --not-element-of -1 <(convert2bed -i vcf < sample1.vcf) <(sort-bed hg19.sorted.bed) > answer.bed

The sort-bed application is usually faster than GNU sort at sorting BED files for use with BEDOPS or other tools, but either will work.
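
For anyone curious what the conversion step involves, here is a simplified sketch of turning VCF records into sorted three-column BED with awk and sort. It is not how convert2bed is implemented: vcf_to_bed is a hypothetical name, and the sketch assumes every record covers a single reference base (e.g. a SNV), so the 1-based VCF POS becomes the 0-based, half-open BED interval [POS-1, POS).

```shell
# Hypothetical helper: convert VCF records to minimal 3-column BED,
# sorted by chromosome and then numerically by start.
# Assumes one reference base per record; indels would need an end
# coordinate derived from the REF allele length instead.
vcf_to_bed() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { next }                  # skip VCF header lines
         { print $1, $2 - 1, $2 }' "$1" | LC_ALL=C sort -k1,1 -k2,2n
}

# Example use (file names are placeholders):
#   vcf_to_bed sample1.vcf > sample1.bed
```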
