Question

Calling a partial consensus sequence when you have low sequence coverage

0

Entering edit mode

2.8 years ago

Wilford203 ▴ 10

I work in a lab where we have sequenced a set of Covid-19 genomes and I am currently in the process of constructing consensus sequences from the read data. I have trimmed the NGS output and performed the alignments, and from there used the command:

samtools mpileup -uf Wuhan1.fasta full_len_sam_header.bam | bcftools call -c | 
vcfutils.pl vcf2fq > full_len_consensus.fastq

to call the consensus sequence and output it into a final fastq file to be viewed. This works perfectly for my alignments that have higher coverage (~90%+), but my reads with poor coverage seem to output empty fastq files. Is there anyway to call a consensus sequence for my low coverage reads? I understand with low coverage much of the consensus cannot be filled, but I would still like to be able to view the areas that are covered in context of the whole genome.

Thanks!

ngs samtools bcftools • 899 views

ADD COMMENT • link 2.7 years ago by Wilford203 ▴ 10

score 2 · Accepted Answer · 2022-02-28

2

Entering edit mode

2.8 years ago

Istvan Albert 102k

You could give the ivar package a try,

https://github.com/andersen-lab/ivar

Manual:

https://andersen-lab.github.io/ivar/html/manualpage.html

If I recall it correctly the consensus method in ivar will even insert Ns for uncovered regions of your consensus.