Forgive me for this question, but could anyone explain what the main uses of a consensus sequence are? In other words, what kind of research requires consensus sequences?
If you have VCF files, do you still need the consensus sequence?
In the context of NGS analysis, consensus sequences tell you which regions have confident calls, information that is missing from variant-only VCFs. They are mostly useful for population genetics. For example, you can compute heterozygosity from a consensus sequence, but you cannot do the same with a variant-only VCF, because the VCF does not tell you how many non-variant sites were callable.
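To illustrate the heterozygosity point, here is a minimal Python sketch. It assumes a diploid consensus in which heterozygous sites are written as two-base IUPAC ambiguity codes (R, Y, S, W, K, M) and uncallable sites as N; the function name and the example sequence are made up for illustration, not taken from any particular tool.

```python
# Two-base IUPAC ambiguity codes, assumed here to mark heterozygous sites.
HET_CODES = set("RYSWKM")
# Unambiguous bases, assumed here to mark confidently called homozygous sites.
HOM_CODES = set("ACGT")


def heterozygosity(consensus):
    """Return the fraction of callable sites that are heterozygous.

    Any other character (for example N) is treated as uncallable and is
    left out of the denominator.
    """
    seq = consensus.upper()
    het = sum(base in HET_CODES for base in seq)
    hom = sum(base in HOM_CODES for base in seq)
    callable_sites = het + hom
    if callable_sites == 0:
        raise ValueError("no callable sites in consensus")
    return het / callable_sites


# Hypothetical consensus: 2 heterozygous sites, 15 homozygous sites, 3 Ns.
example = "ACGTRNNACGYACGTNACGT"
print(heterozygosity(example))  # 2 / 17
```

The denominator is the point: a variant-only VCF would give you the two heterozygous sites but not the 17 callable ones.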
I think a consensus sequence shows which amino acid or nucleotide appears most frequently at each position. I have seen it used mostly when estimating motifs and after multiple alignment. Most splice sites are represented by consensus sequences too.
A typical consensus sequence looks like this:
AG[TC]T[CG]NSA
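As a rough sketch of the "most frequent residue per position" idea, the Python snippet below builds a simple majority consensus from sequences that are already aligned to the same length. The function name, the `min_fraction` threshold, and the example sequences are assumptions for illustration; real tools typically also handle gaps, weights, and ambiguity codes such as the bracketed alternatives shown above.

```python
from collections import Counter


def majority_consensus(aligned, min_fraction=0.5):
    """Return a majority-rule consensus of equal-length aligned sequences.

    A position gets the most common residue only if that residue makes up
    more than min_fraction of the column; otherwise it gets 'N'.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):  # iterate over alignment columns
        residue, count = Counter(column).most_common(1)[0]
        if count / len(column) > min_fraction:
            consensus.append(residue)
        else:
            consensus.append("N")  # no clear majority at this position
    return "".join(consensus)


# Hypothetical aligned sequences.
seqs = ["AGTTCA", "AGCTGA", "AGTTCA", "AGTTGT"]
print(majority_consensus(seqs))  # AGTTNA
```

The fifth column is split evenly between C and G, so it is reported as N rather than picking one arbitrarily.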
I'm not sure about the VCF question. I guess it depends on the kind of analysis you are doing and why you need the consensus.
A motif usually has a consensus sequence with high information content at most bases. One could use a consensus sequence to demonstrate that a set of genomic regions (obtained via lab or computational methods) constitutes a particular motif.
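For what "information content" means here, a minimal Python sketch: for each column of a set of aligned DNA sites, it computes 2 minus the Shannon entropy in bits, so a fully conserved position scores 2 and a position with all four bases equally frequent scores 0. It assumes only A, C, G, T with no gaps and applies no small-sample correction; the function name and example sites are made up for illustration.

```python
import math
from collections import Counter


def information_content(aligned_sites):
    """Return per-position information content (bits) for aligned DNA sites.

    Uses 2 - H, where H is the Shannon entropy of the observed base
    frequencies in the column and 2 bits is the maximum for four bases.
    """
    scores = []
    for column in zip(*aligned_sites):  # iterate over alignment columns
        total = len(column)
        entropy = -sum(
            (n / total) * math.log2(n / total)
            for n in Counter(column).values()
        )
        scores.append(2.0 - entropy)
    return scores


# Hypothetical aligned binding sites.
sites = ["AGTA", "AGCA", "AGTC", "AGCG"]
print(information_content(sites))  # [2.0, 2.0, 1.0, 0.5]
```

Positions with scores near 2 are the ones a consensus sequence writes as a single base; low-scoring positions are where you see brackets or N.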
So if you have the BAM (alignment) files, do you need (or want) the consensus sequence files?
I have never needed the consensus files. Which analysis are you working on? Why do you think you need them?