GPT says the following. Until we get an answer, this seems logical.
There are actually variants in the pseudoautosomal regions (PARs) of the human genome, but they are relatively few and behave differently from variants in other parts of the genome. Here's why it might seem like there are "no variants":
1. High Sequence Identity Between X and Y
PAR regions are identical (or nearly identical) between the X and Y chromosomes. This makes it technically difficult to map sequencing reads accurately to the right chromosome in these regions, especially with short-read sequencing. As a result, many variants may be missed or misassigned.
2. Recombination Keeps Them Homogenized
PAR regions are subject to regular recombination during male meiosis, just like autosomes. This keeps the sequences in these regions homogenized between the X and Y chromosomes, which reduces the accumulation of chromosome-specific variants over time.
3. Strong Purifying Selection
Because PAR genes are often dosage-sensitive (they're expressed from both sex chromosomes, unlike other X-linked genes that undergo X-inactivation), deleterious variants are more strongly selected against, reducing variation.
4. Reference Bias and Database Gaps
Some variant databases and genome builds underrepresent or underreport variants in PARs because:
- They exclude Y-PAR variants.
- They collapse PAR sequences into a single representation.
- Or they filter them out due to uncertainty in read mapping.
Ah, somehow your answer is right; in the reference, the PAR on chrY is masked:
GenoMax, yeah, I asked GPT too, but I don't think it's right (?). The reads are mapped using bwa, so even if the regions are similar between X and Y, about 50% of the reads should map to X and 50% to Y (?), so some variants should be found on both X and Y.
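One way to test the 50/50 expectation would be to tally the PAR reads per chromosome and look at their mapping quality. Below is a minimal sketch, not a definitive method. It assumes GRCh38 coordinates for the PARs and `chrX`/`chrY` naming (both are my assumptions, adjust for your build), and it only uses the standard SAM columns (RNAME, POS, MAPQ). If the reads do split between X and Y, I would expect most of them to have MAPQ 0, which variant callers commonly filter out, but that is a guess to verify rather than a claim.

```python
from collections import Counter

# Assumed GRCh38 PAR intervals (1-based, inclusive); adjust for other builds.
PAR = {
    "chrX": [(10_001, 2_781_479), (155_701_383, 156_030_895)],
    "chrY": [(10_001, 2_781_479), (56_887_903, 57_217_415)],
}


def in_par(chrom, pos):
    """Return True if a 1-based position falls inside an assumed PAR interval."""
    return any(start <= pos <= end for start, end in PAR.get(chrom, []))


def tally_par_reads(sam_lines):
    """Count PAR reads per chromosome, split into MAPQ = 0 and MAPQ > 0.

    `sam_lines` is any iterable of SAM text lines (header lines are skipped).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete alignment record
        chrom, pos, mapq = fields[2], int(fields[3]), int(fields[4])
        if in_par(chrom, pos):
            label = "mapq=0" if mapq == 0 else "mapq>0"
            counts[(chrom, label)] += 1
    return counts
```

To use it, pass the text output of `samtools view` for chrX and chrY (for example by iterating over `sys.stdin`) to `tally_par_reads` and print the resulting counter.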
On my side, I mapped some reads using BWA + GATK HaplotypeCaller, and without any special processing there are no variants in the PAR of Y.
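A quick way to check whether this comes from masking in the reference itself is to measure the fraction of N bases in the chrY PAR intervals of the FASTA used for mapping. This is only a sketch: the interval coordinates are assumed GRCh38 values and the sequence name `chrY` is also an assumption, so both may need changing for your reference.

```python
# Assumed GRCh38 chrY PAR intervals (1-based, inclusive); adjust for your build.
CHRY_PAR = [(10_001, 2_781_479), (56_887_903, 57_217_415)]


def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n_fraction(seq, start, end):
    """Fraction of N/n bases in seq[start..end] (1-based, inclusive)."""
    region = seq[start - 1:end]
    if not region:
        return 0.0
    return region.upper().count("N") / len(region)


def par_masking(lines, chrom="chrY", intervals=CHRY_PAR):
    """Return {(start, end): N fraction} for each interval on `chrom`."""
    for name, seq in parse_fasta(lines):
        if name == chrom:
            return {(s, e): n_fraction(seq, s, e) for s, e in intervals}
    return {}
```

Calling `par_masking(open("reference.fa"))` (the path is hypothetical) and getting fractions close to 1.0 would indicate that the chrY PAR is hard-masked, in which case no reads can align there and no variants can be called.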
Will delete then. See if you get any good answers directly. Most of the papers that GPT used as sources appear to be older (~mid-2000s).