I am currently working on a project to detect variants in related yeast strains, which is simple enough, at least for variant calling :).
However, the PI is interested in genes that have been duplicated, e.g. ribosome genes. This means that keeping only uniquely mapped reads results in zero coverage over the genes/regions of interest.
Has anyone done something similar to this?
What would the best mapping strategy be to include duplicated sequence, but also be suitable for variant detection?
Part of me wonders whether this is even possible. The worst I can think of is that reads will be split between the duplicates, but a mismatch could also lead to a misplaced read...
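To make the concern concrete, here is a toy Python sketch (not my actual pipeline) of what I think happens to reads that match every copy of a duplicated gene equally well. The function name assign_reads and the two policies are made up for illustration, and the "random" policy is only my assumption about how an aligner breaks ties when it reports a single best hit.

```python
import random


def assign_reads(n_reads, n_copies, policy, seed=0):
    """Count reads placed on each copy of a duplicated region.

    Every read is assumed to match all copies equally well (toy model).
    policy "unique": drop any read that has more than one best hit.
    policy "random": report one best hit, chosen at random (assumed tie-break).
    """
    rng = random.Random(seed)
    counts = [0] * n_copies
    for _ in range(n_reads):
        if policy == "unique":
            # Only a single-copy region can receive uniquely mapped reads.
            if n_copies == 1:
                counts[0] += 1
        elif policy == "random":
            counts[rng.randrange(n_copies)] += 1
        else:
            raise ValueError("unknown policy: " + policy)
    return counts


# 1000 reads from a gene that is present in two identical copies.
unique_counts = assign_reads(1000, 2, "unique")
random_counts = assign_reads(1000, 2, "random")

print("unique only:", unique_counts)  # both copies get no coverage
print("random tie-break:", random_counts)  # reads are shared between copies
```

In this toy model the unique-only policy leaves both copies empty, while the random tie-break keeps all the reads but divides them between the copies, so neither copy sees the full depth.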
I am currently running bowtie with --best -k1 (other settings at their defaults), followed by samtools-based variant detection.
BTW, the reads are colour-space reads from a SOLiD4.
Thanks!