Bcftools consensus unexpected output

0

Entering edit mode

5 months ago

selplat21 ▴ 20

Hello,

I am currently trying to call a consensus sequence for a genome that is 1.2 Gbp in length.

I have a VCF file that has only ~7 million records, including variant and invariant sites.

When I run bcftools consensus Test.vcf.gz -f Reference.fasta -a N;

I get an output consensus with ONLY 5% N content.

This is confusing to me because I wouldn't expect 7 million records to contain enough information to get a consensus for a genome of this size.

I have confirmed on a test file that consensus is working as expected and not including reference bases that are not covered in the VCF file.

Any idea what could be going on?

Bcftools • 375 views

ADD COMMENT • link updated 5 months ago by Ram 44k • written 5 months ago by selplat21 ▴ 20

0

Entering edit mode

~~You should see @Michael's answer in this thread for caveats as to why this method should not be used : Generating consensus sequence from bam file~~

~~You can use samtools consensus as recommended by @Jkbonfield (developer of samtools/bcftools) in thread above.~~

ADD REPLY • link updated 5 months ago by Ram 44k • written 5 months ago by GenoMax 147k

0

Entering edit mode

This is a consensus generated from a VCF file, not a bam file. Does that change much?

ADD REPLY • link 5 months ago by selplat21 ▴ 20

0

Entering edit mode

I am going to delete my post since it is not applicable in your case. I should have checked the command more carefully. Apologies.

ADD REPLY • link 5 months ago by GenoMax 147k

Login before adding your answer.