bcftools consensus error
7.3 years ago

Hello, I have the following trouble with bcftools.

My goal is to obtain a real-data text file over the IUPAC alphabet for my text search algorithm. I cannot find any such files on the Internet; only a few artificial text files over the IUPAC alphabet seem to exist. My idea was to create a real-data IUPAC file from a VCF file and the reference FASTA sequence using the bcftools consensus program. I downloaded the necessary data from the 1000 Genomes Project. However, bcftools consensus reports the following error:

Symbolic alleles other than <del> are currently not supported: <cn0> at 9:85501

My questions are:

  1. Is there a way to filter the VCF file and get rid of the unsupported alleles?

  2. Are there any publicly available VCF files that do not contain the alleles that bcftools consensus does not support?

  3. Is there any publicly available real data stored as a text file over the IUPAC alphabet?

  4. Is there any tool that can transform VCF files into some primitive text form (e.g. "AGTT{AT, CC, C}ACCT" representing 3 variants: "AGTTATACCT", "AGTTCCACCT" and "AGTTCACCT")?

Thank you very much for any idea that can move me one step further.

Petr.

7.3 years ago

I would run the following on your data (before running bcftools consensus) in order to ensure that your VCF/BCF is in good shape:

bcftools norm -m-any VCF.GZ | bcftools norm -Ov --check-ref w -f REF.FASTA > OUT.VCF
  • 1st command (before the pipe): splits multi-allelic calls into separate variant calls
  • 2nd command (after the pipe): left-aligns indels and issues warnings when the REF base in your VCF does not match the base in the supplied FASTA reference genome
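The normalization above does not by itself remove symbolic alleles such as <cn0>, which is what the error message complains about. One possible way to drop them is a plain-text filter on the ALT column. This is only a minimal sketch: `drop_symbolic` is a hypothetical helper name (not a bcftools feature), the demo records are made up, and the filter also removes <del> records, which bcftools consensus would otherwise accept.

```shell
# drop_symbolic: hypothetical helper, not part of bcftools.
# Reads an uncompressed VCF on stdin, keeps all header lines, and keeps
# only those records whose ALT column (field 5) contains no "<",
# i.e. no symbolic allele such as <CN0>. Note: this also drops <DEL>.
drop_symbolic() {
    awk -F'\t' '/^#/ || $5 !~ /</'
}

# Tiny made-up VCF (tab-separated) just to show the effect.
demo_vcf=$(printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n9\t85400\t.\tA\tG\t.\tPASS\t.\n9\t85501\t.\tT\t<CN0>\t.\tPASS\t.\n')

# Only the header lines and the A>G record should remain.
filtered=$(printf '%s\n' "$demo_vcf" | drop_symbolic)
printf '%s\n' "$filtered"
```

Assuming the first bcftools norm writes uncompressed VCF to standard output (its default), the helper could sit between the two commands, e.g. `bcftools norm -m-any VCF.GZ | drop_symbolic | bcftools norm -Ov --check-ref w -f REF.FASTA > OUT.VCF`. Placing it after the split means that only the symbolic allele of a multi-allelic site is dropped, not the whole site.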

You can also set the ID field in the VCF with a 3rd pipe, if you wish:

bcftools annotate -Ov -I +'%ID' #leaves it as the existing ID

or

bcftools annotate -Ob -x ID -I +'%CHROM:%POS:%REF:%ALT' #sets it to chr:pos:ref:alt

Hope that this helps.

Kevin


Hi Kevin, can you help me check my case:

How to clean this strange VCF file?

Thanks.

Shicheng
