When a mapper (like BWA) is used to align human whole-genome reads, how are the X and Y chromosomes typically treated? In particular, are the regions that are homologous between the two masked out in the Y chromosome to prevent ambiguous mapping in males and nonsensical mapping in females? Or are other techniques used to resolve these issues?
If masking is used, is there a published definition of the regions available?
Can you provide a reference to any published articles on this subject?
Check out the README for the 1000 Genomes (1000g) reference genome, in particular the bottom section. It answers most of your questions.
Perfect! Thanks for the reference - that's exactly what I was looking for.
@lh3 also made this helpful post somewhat recently (thanks!): http://lh3.github.io/2017/11/13/which-human-reference-genome-to-use
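For anyone landing here later, a rough illustration of what "masking" in the question means in practice: the homologous intervals on Y are overwritten with `N`, so the aligner can only place those reads on the X copy. This is just a minimal sketch on a toy sequence. The function name `mask_regions` and the toy coordinates are made up for illustration; the real interval definitions should be taken from the reference genome's README, not from here.

```python
def mask_regions(seq, regions):
    """Return seq with each 1-based, inclusive (start, end) interval replaced by 'N'."""
    chars = list(seq)
    for start, end in regions:
        lo = max(start, 1) - 1          # convert to 0-based index, clamp at sequence start
        hi = min(end, len(chars))       # clamp at sequence end
        for i in range(lo, hi):
            chars[i] = "N"
    return "".join(chars)


# Toy stand-in for a Y chromosome sequence (hypothetical, 20 bases).
toy_chr_y = "ACGTACGTACGTACGTACGT"

# Hypothetical intervals standing in for the homologous regions to mask.
toy_regions = [(3, 6), (17, 20)]

masked = mask_regions(toy_chr_y, toy_regions)
print(masked)
```

The masked sequence keeps its original length, so coordinates outside the masked intervals are unchanged.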