Bcftools consensus unexpected output
0
0
Entering edit mode
5 months ago
selplat21 ▴ 20

Hello,

I am currently trying to call a consensus sequence for a genome that is 1.2 Gbp in length.

I have a VCF file that has only ~7 million records, including variant and invariant sites.

When I run bcftools consensus Test.vcf.gz -f Reference.fasta -a N;

I get an output consensus with ONLY 5% N content.

This is confusing to me because I wouldn't expect 7 million records to contain enough information to get a consensus for a genome of this size.

I have confirmed on a test file that consensus is working as expected and not including reference bases that are not covered in the VCF file.

Any idea what could be going on?

Bcftools • 375 views
ADD COMMENT
0
Entering edit mode

You should see @Michael's answer in this thread for caveats as to why this method should not be used : Generating consensus sequence from bam file

You can use samtools consensus as recommended by @Jkbonfield (developer of samtools/bcftools) in thread above.

ADD REPLY
0
Entering edit mode

This is a consensus generated from a VCF file, not a bam file. Does that change much?

ADD REPLY
0
Entering edit mode

I am going to delete my post since it is not applicable in your case. I should have checked the command more carefully. Apologies.

ADD REPLY

Login before adding your answer.

Traffic: 1590 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6