We have two groups of samples from a yeast species, control (1 sample) and treatment (1 sample). No complete reference genome is available yet for alignment or variant calling. The objective of this project is straightforward: we want to see whether there are variant differences between the two groups. Since there is no complete reference genome, we are doing a de novo assembly of the control sample and then using it as the reference for alignment and, later, variant calling of the treatment sample. FYI, we are sequencing on the ONT platform, so we expect long reads.
We did the de novo assembly with Flye and got 75 contigs. We aligned with minimap2 and found a moderate percentage of mapped reads, with decent mapping quality as well (filtered > 50). But when we ran variant calling with Clair, we got no calls at all, and this message was shown:
no contig intersection found, output header only in /path/to/vcf.gz
Any thoughts on what causes the lack of calls and the message shown above?
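One thing that could be checked here, purely as an assumption about what the message refers to, is whether the contig names in the assembly FASTA overlap with the contig names in the alignment header. The sketch below is not the caller's own logic; the function names and the example contig names are hypothetical, and it only compares FASTA header names with `@SQ` `SN:` names from a SAM header given as text.

```python
def fasta_contig_names(fasta_text):
    """Collect contig names from FASTA headers (text up to the first whitespace)."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def sam_header_contig_names(sam_header_text):
    """Collect contig names from the SN: field of @SQ header lines."""
    names = set()
    for line in sam_header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def contig_intersection(fasta_text, sam_header_text):
    """Return the contig names present in both the FASTA and the SAM header."""
    return fasta_contig_names(fasta_text) & sam_header_contig_names(sam_header_text)


# Hypothetical example data, not taken from the actual run.
example_fasta = ">contig_1 length=12\nACGTACGTACGT\n>contig_2\nTTTTGGGGCCCC\n>contig_3\nACGT\n"
example_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:contig_1\tLN:12\n@SQ\tSN:contig_2\tLN:12\n"

shared = contig_intersection(example_fasta, example_header)
print(sorted(shared))
```

If the shared set is empty for the real files, the reference and the alignments do not agree on contig names. If it is not empty, the cause is more likely in how the caller was told which contigs to process.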
Alternatively, we used bcftools for the variant calling and found many variants. Assuming the reads are filtered to a Q score above 10 and the qual scores are filtered to a certain standard, can we be confident in the calls from bcftools? (I've been reading that callers that are not long-read aware, like GATK or bcftools, cannot mitigate the errors prevalent in long-read sequencing. So I was thinking that if we can be confident in the Q score and the mapping quality, then we should inherently be confident in the variant calls as well, regardless of which caller we used.)
I'd appreciate your feedback. Thanks.
What does that actually translate to in numbers, in terms of % alignment? At least the sample from which the assembly was made should align well.
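To put a number on "moderate", here is a minimal sketch that counts primary reads in SAM text and reports the percentage mapped, plus how many pass a mapping quality cutoff. The function name, the default cutoff of 50 (taken from the filter mentioned in the question), and the example records are assumptions for illustration only. Secondary and supplementary alignments are skipped so that each read is counted once.

```python
def mapping_summary(sam_text, min_mapq=50):
    """Summarise primary alignments: total reads, mapped reads, % mapped, MAPQ > cutoff."""
    total = mapped = high_mapq = 0
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x100 or flag & 0x800:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
            if mapq > min_mapq:
                high_mapq += 1
    pct_mapped = 100.0 * mapped / total if total else 0.0
    return {"total": total, "mapped": mapped,
            "pct_mapped": pct_mapped, "high_mapq": high_mapq}


def sam_record(name, flag, contig, pos, mapq, cigar):
    """Build one tab-separated SAM record with a placeholder sequence."""
    return "\t".join([name, str(flag), contig, str(pos), str(mapq),
                      cigar, "*", "0", "0", "ACGTACGTAC", "*"])


# Hypothetical records: three mapped primaries, one unmapped, one supplementary.
example_sam = "\n".join([
    "@SQ\tSN:contig_1\tLN:1000",
    sam_record("read1", 0, "contig_1", 100, 60, "10M"),
    sam_record("read2", 16, "contig_1", 200, 60, "10M"),
    sam_record("read3", 0, "contig_1", 300, 20, "10M"),
    sam_record("read4", 4, "*", 0, 0, "*"),
    sam_record("read1", 2048, "contig_1", 500, 60, "5M5S"),
])

summary = mapping_summary(example_sam)
print(summary)
```

On real data the same figures can be read from the alignment file by streaming it as SAM text into this function. Running it on the control sample's own reads gives a baseline for how well the assembly represents the sample it was built from.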
Is there a close relative available? Perhaps you could use it as a reference to see if you get better results. Your own assembly may not be of good quality.