21 months ago
Sebastien_Vigneau
What would be a good tool or collection of tools to calculate a confidence score for each nucleotide in a genome assembly using short-read data, ideally taking into account both the read pile-up and each read's sequencing quality scores, and able to handle SNPs and INDELs?
Hi, I do not really know what you mean by "confidence" score. Would it be a measure of the probability of misassembly at a specific position? Assemblies are often assessed as a whole and not in a per-base manner.
It may not be exactly what you want, but Pilon has a --vcf parameter to produce a .vcf file listing detailed information about base and indel evidence at every base position in the genome. See: https://github.com/broadinstitute/pilon/wiki/Output-File-Descriptions#vcf
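If it helps, here is a minimal sketch of how you could pull per-position values out of such a file with plain Python. It only assumes the output is a standard tab-delimited, uncompressed VCF; the function name parse_vcf_records is my own, and which INFO keys are actually present should be checked against the wiki page above rather than taken from this example.

```python
import io


def parse_vcf_records(handle):
    """Yield one dict per data line of a tab-delimited VCF.

    Only the eight fixed VCF columns are read. INFO is split into a dict
    without assuming any particular keys; check the Pilon wiki for the
    keys it actually writes.
    """
    for line in handle:
        # Skip meta-information lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, qual, filt, info = fields[:8]

        info_dict = {}
        for item in info.split(";"):
            if item in ("", "."):
                continue
            key, sep, value = item.partition("=")
            # Flag-type INFO entries have no "=value" part.
            info_dict[key] = value if sep else True

        yield {
            "chrom": chrom,
            "pos": int(pos),
            "ref": ref,
            "alt": alt,
            "qual": None if qual == "." else float(qual),
            "filter": filt,
            "info": info_dict,
        }


# Illustrative, made-up records (not real Pilon output).
example = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "contig1\t1\t.\tA\t.\t.\tPASS\tDP=30\n"
    "contig1\t2\t.\tC\tT\t50\tPASS\tDP=28\n"
)
for record in parse_vcf_records(example):
    print(record["chrom"], record["pos"], record["qual"], record["info"])
```

For a real run you would replace the in-memory example with something like open("pilon.vcf") (hypothetical file name) and then decide which fields to combine into your per-base score.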