intersection, Bed file and VCF.
6.7 years ago
jaqx008 ▴ 110

Hello all, I am trying to intersect my BED file with a VCF and I keep getting errors. My BED file contains calls of all the genes in the ORFs. The following is the command I am using, followed by the errors.

bedtools intersect -a a.vcf -b b.bed -wa

error

  • Please ensure that your file is TAB delimited (e.g., cat -t FILE).
  • Also ensure that your file has integer chromosome coordinates in the expected columns (e.g., cols 2 and 3 for BED).

I tried cat -t a.vcf to convert it to TABs but I still get the errors. I would appreciate your suggestions. Thanks in advance.

Bedtools intersect VCF ORFs Variant calling • 5.8k views
6.7 years ago

You might try BEDOPS bedops --ec:

$ bedops --ec --element-of 1 <(vcf2bed < a.vcf) <(sort-bed b.bed) > answer.bed

Using --ec with bedops can return useful debugging information about malformed BED data, at the cost of a slower runtime. The --ec option can be removed once the input is debugged.


Ok let me try this. Thanks


bedops --ec < a.vcf < b.bed > answer.bed

Error -bash: bedops: command not found

6.7 years ago

It seems the problem is not related to bedtools but rather with your input files.

cat -t a.vcf doesn't fix the problem; it only shows whether your columns are TAB-delimited. With cat -t a.vcf (or cat -t b.bed) you should see the columns separated by ^I on screen. Perhaps your columns are separated by spaces instead? If so, you could fix it with tr ' ' '\t' < a.vcf > A.vcf
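As a small illustration of the check described above, the sketch below builds a throwaway space-delimited file (the file names are made up for the demo), shows how cat -t renders it before and after the tr conversion, and nothing more. Keep in mind that tr replaces every space, including any inside VCF header lines or INFO fields, so it is safest on simple files such as BED.

```shell
# Build a throwaway file in a temp directory (file names are made up for this demo).
tmpdir=$(mktemp -d)
printf 'chr1 100 200\n' > "$tmpdir/spaces.bed"

# cat -t prints TABs as ^I; a space-delimited line shows no ^I at all.
before=$(cat -t "$tmpdir/spaces.bed")
echo "$before"    # chr1 100 200

# Convert every space to a TAB, then look again.
tr ' ' '\t' < "$tmpdir/spaces.bed" > "$tmpdir/tabs.bed"
after=$(cat -t "$tmpdir/tabs.bed")
echo "$after"     # chr1^I100^I200
```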

Also, as the error message suggests, check whether your coordinates are actually integers. Try sort -k2,2n a.vcf to see if the top rows (excluding the header) have integers in column 2.
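A more direct alternative to eyeballing sorted output is to print only the lines whose coordinate columns are not plain integers. This is a minimal sketch, assuming TAB-delimited input, BED coordinates in columns 2 and 3, and the VCF position in column 2 with header lines starting with #. The file contents are invented for the demo.

```shell
tmpdir=$(mktemp -d)
# Invented example: the second BED record has a non-integer start.
printf 'chr1\t100\t200\nchr1\t1e3\t2000\n' > "$tmpdir/demo.bed"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\nchr1\t150\t.\n' > "$tmpdir/demo.vcf"

# BED: report lines where column 2 or 3 is not made of digits only.
bad_bed=$(awk -F'\t' '$2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {print NR": "$0}' "$tmpdir/demo.bed")

# VCF: skip header lines, then check POS (column 2).
bad_vcf=$(awk -F'\t' '!/^#/ && $2 !~ /^[0-9]+$/ {print NR": "$0}' "$tmpdir/demo.vcf")

echo "BED problems: ${bad_bed:-none}"
echo "VCF problems: ${bad_vcf:-none}"
```

Each reported line is prefixed with its line number, which makes it easy to jump to the offending record in an editor.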


I realized the error was in my BED file. I made sure all the TABs are correct and ran it again, but now it complains about one scaffold.

bedtools intersect -a a.vcf -b b.bed > file.txt
Error: Invalid record in file genoOrfs.bed. Record is scaffold1090size18908_225 18674 18522

It looks fine to me, like the rest of the records. I don't know why I get the error.


Just in case some spaces were missed in the space-to-TAB conversion, try:

grep ' ' genoOrfs.bed # This shouldn't print any line.
grep 'scaffold1090size18908_225' genoOrfs.bed | cat -vet # Print the faulty record showing TABs

I tried it. The first command gave no output (zero bytes), and the second gave:

scaffold1090size18908_225^I18674^I18522$

I'm not sure what this means.
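For reference, in cat -vet output ^I marks a TAB and $ marks the end of the line, so the record above is TAB-delimited with three columns and no stray carriage return (which would show up as ^M). The empty output from the first grep likewise means the file contains no spaces. What does stand out is that the start (18674) is larger than the end (18522). BED intervals are expected to have start <= end, which is a likely reason for bedtools rejecting the record. The sketch below lists such records and writes a corrected copy. Treating start > end as a minus-strand ORF and recording that in column 6 is an assumption about how the ORF caller encodes orientation, the second record is invented for the demo, and any 0-based versus 1-based offset is not addressed here.

```shell
tmpdir=$(mktemp -d)
# First record is the one from the error message; the second is invented.
printf 'scaffold1090size18908_225\t18674\t18522\nscaffoldX\t10\t50\n' > "$tmpdir/genoOrfs.bed"

# List records whose start is greater than their end (+0 forces a numeric comparison).
reversed=$(awk -F'\t' '$2+0 > $3+0' "$tmpdir/genoOrfs.bed")
echo "$reversed"

# Swap start/end where needed and keep the orientation as strand in column 6.
# Assumption: start > end means the ORF lies on the minus strand.
awk -F'\t' -v OFS='\t' '{
    if ($2+0 > $3+0) print $1, $3, $2, ".", 0, "-";
    else             print $1, $2, $3, ".", 0, "+";
}' "$tmpdir/genoOrfs.bed" > "$tmpdir/fixed.bed"
fixed=$(cat "$tmpdir/fixed.bed")
echo "$fixed"
```

The corrected file can then be passed to bedtools intersect with -b in place of the original BED file.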
