Hi all,
I am working with SNVs called from samples from one family.
Based on a known genetic model, we found some candidate SNVs that might be associated with the disease.
However, some of the SNVs are still ambiguous.
They were detected in a homologous region that has nearly 3 hits with 100% identity across the genome, so the reads there get multiple hits in the bwa mapping.
My question is: should I use only uniquely mapped reads, or remove these SNVs at the final annotation stage?
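To make the "unique mapping" option concrete, here is a minimal sketch of what I have in mind, not a tested pipeline step. It assumes the alignments are available as SAM text and relies on bwa giving a mapping quality (MAPQ, the 5th SAM column) of 0 to reads that map equally well to several places. The function name, the example reads, and the `min_mapq=1` cutoff are my own assumptions for illustration.

```python
def is_uniquely_mapped(sam_line, min_mapq=1):
    """Return True if the SAM record's MAPQ (5th column) is at least min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[4]) >= min_mapq


def keep_unique_reads(sam_lines, min_mapq=1):
    """Keep header lines (starting with '@') and records passing the MAPQ cutoff."""
    kept = []
    for line in sam_lines:
        if line.startswith("@") or is_uniquely_mapped(line, min_mapq):
            kept.append(line)
    return kept


# Made-up records: read2 has MAPQ 0, as expected for a multi-mapped read.
example_sam = [
    "@SQ\tSN:chr1\tLN:100000",
    "read1\t0\tchr1\t1000\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t5000\t0\t4M\t*\t0\t0\tACGT\tIIII",
]
unique_only = keep_unique_reads(example_sam)
```

With this kind of filter the SNVs in the homologous region would simply lose their supporting reads before calling, instead of being removed at annotation.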
We also always find some SNVs at low coverage, for example only 5-10X depth at the SNV position.
My question is: in a bwa+GATK pipeline, should we define a coverage threshold to filter out these low-coverage SNVs?
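For illustration, the kind of coverage filter I mean could look roughly like the sketch below. It assumes the calls are in a VCF whose INFO column carries a `DP` (depth) entry; the function names, the example records, and the `min_depth=10` threshold are hypothetical, and choosing that threshold is exactly what I am asking about.

```python
def info_depth(info_field):
    """Return the DP value from a VCF INFO string, or None if it is absent."""
    for entry in info_field.split(";"):
        if entry.startswith("DP="):
            return int(entry[3:])
    return None


def filter_by_depth(vcf_lines, min_depth=10):
    """Keep header lines and records whose INFO DP is at least min_depth."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        depth = info_depth(fields[7])  # INFO is the 8th VCF column
        if depth is not None and depth >= min_depth:
            kept.append(line)
    return kept


# Made-up records: the first SNV has depth 7, the second depth 35.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\t.\tDP=7;MQ=60",
    "chr1\t2000\t.\tC\tT\t99\t.\tDP=35;MQ=60",
]
filtered = filter_by_depth(example_vcf, min_depth=10)
```

Note that this sketch also drops records with no `DP` entry at all, which may or may not be what one wants.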
In the GATK pipeline, it is recommended to carry out base recalibration based on the known indel databases. Comparing the results with and without this step, we find that some indels are detected only after the recalibration step, but with lower coverage and multiple hits.
My question is whether this step is necessary for a Mendelian disease with family samples (only 3 samples, F/M/C).
Thanks.