No variants in the pseudoautosomal regions of gnomAD chrY?
13 days ago

Hi all,

There are no variants in the PAR regions of chrY in gnomAD (v4.1 genomes):

$ wget -qO - https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/vcf/genomes/gnomad.genomes.v4.1.sites.chrY.vcf.bgz |\
bcftools view --no-header --targets "chrY:10001-2781479,chrY:56887903-57217415" |\
wc -l


0

On my side, unless I'm mistaken, I also see no variants in those regions in my own WGS data (which includes both males and females). Why? Are these regions highly conserved? Do you have any references, please?
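
For anyone without bcftools at hand, the same region count can be sketched in plain Python. This is only an illustration: the interval coordinates are copied from the command above, the helper names `in_par` and `count_par_records` are made up for this example, and it matches on the POS column only, so a record that starts before an interval and extends into it is not counted.

```python
# PAR intervals on chrY, copied from the bcftools command above
# (1-based, inclusive). Assumed to be GRCh38 coordinates.
PAR_Y = [("chrY", 10001, 2781479), ("chrY", 56887903, 57217415)]


def in_par(chrom, pos, intervals=PAR_Y):
    """Return True if (chrom, pos) falls inside one of the intervals."""
    return any(c == chrom and start <= pos <= end for c, start, end in intervals)


def count_par_records(vcf_lines, intervals=PAR_Y):
    """Count VCF data lines whose POS lies inside the given intervals."""
    n = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_par(chrom, int(pos), intervals):
            n += 1
    return n
```

It expects an iterable of uncompressed VCF text lines, for example an open text file handle or `gzip.open(path, "rt")` for a `.bgz` file.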

chrY PAR vcf gnomad Pseudoautosomal • 509 views
13 days ago
GenoMax 151k

GPT says the following. Until we get an answer this seems logical.

There are actually variants in the pseudoautosomal regions (PARs) of the human genome, but they are relatively few and behave differently from variants in other parts of the genome. Here is why it might look as though there are "no variants":

1. High Sequence Identity Between X and Y

PAR regions are identical (or nearly identical) between the X and Y chromosomes. This makes it technically difficult to map sequencing reads accurately to the right chromosome in these regions, especially with short-read sequencing. As a result, many variants may be missed or misassigned.

2. Recombination Keeps Them Homogenized

PAR regions are subject to regular recombination during male meiosis—just like autosomes. This keeps the sequences in these regions more homogenized between the X and Y chromosomes, which reduces the accumulation of unique variants over time.

3. Strong Purifying Selection

Because PAR genes are often dosage-sensitive (they're expressed from both sex chromosomes, unlike other X-linked genes that undergo X-inactivation), deleterious variants are more strongly selected against, reducing variation.

4. Reference Bias and Database Gaps

Some variant databases and genome builds underrepresent or underreport variants in PARs because:

  • They exclude Y-PAR variants.
  • They collapse PAR sequences into a single representation.
  • Or they filter them out due to uncertainty in read mapping.

Ah, your answer is partly right: in the reference, the PAR on chrY is hard-masked (all N), so no reads can align there:

$ samtools faidx red.fasta "chrY:10001-2781479" | tail -n +2 | fold -w 1 | uniq -c
2771479 N
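
The count above equals the full length of the interval (2781479 - 10001 + 1 = 2771479), so every base in the chrY PAR1 of this reference is N. A small Python sketch of the same check, assuming the FASTA text from `samtools faidx` is supplied as lines; the function names are hypothetical and only for illustration:

```python
from collections import Counter


def base_counts(fasta_lines):
    """Tally bases over all sequence lines, ignoring '>' header lines."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        counts.update(line.upper())
    return counts


def interval_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def is_fully_masked(fasta_lines, start, end):
    """True if the region contains only N and covers the whole interval."""
    counts = base_counts(fasta_lines)
    return set(counts) <= {"N"} and counts.get("N", 0) == interval_length(start, end)
```

Unlike `uniq -c`, which counts runs of adjacent identical lines, `Counter` gives totals per base, so it also works on regions that are only partly masked.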

GenoMax: yes, I asked GPT too, but I don't think it's right (?). The reads are mapped with bwa, so even if the regions are similar between X and Y, roughly 50% of the reads should land on X and 50% on Y (?), and some variants should be found on both.

On my side, I mapped reads with BWA and called variants with GATK HaplotypeCaller, without any special processing, and there are no variants in the PAR of chrY.


I will delete it then. See if you get any good answers directly. Most of the papers that GPT used as sources appear to be older (around the mid-2000s).
