No variants in the pseudoautosomal regions of gnomAD chrY?
17 days ago

Hi all,

There are no variants in the pseudoautosomal regions (PARs) of chrY in gnomAD.

$ wget -qO - https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/vcf/genomes/gnomad.genomes.v4.1.sites.chrY.vcf.bgz |\
bcftools view --no-header --targets "chrY:10001-2781479,chrY:56887903-57217415" |\
wc -l


0

On my side, unless I'm wrong, I don't find any variants in those regions in my own WGS data either (it contains males and females). Why? Are these regions highly conserved? Do you have a reference, please?

chrY PAR vcf gnomad Pseudoautosomal • 643 views
16 days ago
GenoMax 151k

GPT says the following. Until we get a better answer, this seems logical.

There are actually variants in the pseudoautosomal regions (PARs) of the human genome, but they’re relatively fewer and behave differently than variants in other parts of the genome. Here's why that might seem like there are "no variants":

1. High Sequence Identity Between X and Y

PAR regions are identical (or nearly identical) between the X and Y chromosomes. This makes it technically difficult to map sequencing reads accurately to the right chromosome in these regions, especially with short-read sequencing. As a result, many variants may be missed or misassigned.

2. Recombination Keeps Them Homogenized

PAR regions are subject to regular recombination during male meiosis—just like autosomes. This keeps the sequences in these regions more homogenized between the X and Y chromosomes, which reduces the accumulation of unique variants over time.

3. Strong Purifying Selection

Because PAR genes are often dosage-sensitive (they're expressed from both sex chromosomes, unlike other X-linked genes that undergo X-inactivation), deleterious variants are more strongly selected against, reducing variation.

4. Reference Bias and Database Gaps

Some variant databases and genome builds underrepresent or underreport variants in PARs because:

  • They exclude Y-PAR variants.
  • They collapse PAR sequences into a single representation.
  • Or they filter them out due to uncertainty in read mapping.

Ah, in a way your answer is right. In the reference, the PAR on chrY is masked:

$ samtools faidx red.fasta "chrY:10001-2781479" | tail -n +2 | fold -w 1 | uniq -c
2771479 N
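
The check above can be made a little more general. Here is a minimal sketch, assuming only coreutils and awk: the `count_bases` helper and the toy FASTA are illustrative, not part of the original post. Sorting before `uniq -c` matters once a region contains more than one kind of base, because `uniq -c` only counts consecutive runs. The last line confirms that the N count reported above equals the full length of the 1-based inclusive interval, so the whole region is masked.

```shell
# Tally each base in a FASTA stream read from stdin (header lines are skipped).
# `sort` groups identical bases so that `uniq -c` reports one total per base.
count_bases() {
    grep -v '^>' | fold -w 1 | sort | uniq -c | awk '{print $2 "\t" $1}'
}

# Toy input standing in for the output of `samtools faidx <fasta> <region>`
# (hypothetical data, used here so the sketch runs without a reference file).
printf '>toy:1-12\nNNNNAC\nGTNNNN\n' | count_bases

# Length of the 1-based inclusive interval chrY:10001-2781479.
echo $(( 2781479 - 10001 + 1 ))   # 2771479, same as the N count above
```

With a real reference, the same helper can be placed after `samtools faidx` in place of `tail -n +2 | fold -w 1 | uniq -c`.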

GenoMax, yeah, I asked GPT too, but I don't think it's right (?). The reads are mapped with bwa, so even if the regions are similar between X and Y, about 50% of the reads should land on X and 50% on Y (?), and some variants should be found on both.

On my side, I mapped some reads with BWA and called variants with GATK HaplotypeCaller, without any special processing, and there are no variants in the PAR of chrY.

Will delete then. See if you get any good answers directly. Most of the papers that GPT used as sources appear to be older (~ mid-2000s).

2 days ago
cmdcolin ★ 4.2k

In the UCSC hg19 assembly, the PARs on chrY are exact copies of the corresponding regions on chrX.

This is described here:

 The Y chromosome in this assembly contains two pseudoautosomal regions (PARs) that were taken from the corresponding regions in the X chromosome and are exact duplicates:

    chrY:10001-2649520 and chrY:59034050-59363566
    chrX:60001-2699520 and chrX:154931044-155260560

https://genome.ucsc.edu/cgi-bin/hgTracks?chromInfoPage=&hgsid=2552728624_MynJElJGClcRCEzJfk04NfcYQ5Ev
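
The hg19 coordinates quoted above can be sanity-checked with a few lines of shell. This is a rough sketch: the `pars` variable and the `par_summary` helper are illustrative names, and the numbers are copied from the UCSC description. For each PAR it prints the length on chrY, the length on chrX, and the constant offset that maps a chrY position to its chrX counterpart.

```shell
# hg19 PAR coordinates as quoted from the UCSC description (1-based, inclusive).
# Fields: name  chrY_start  chrY_end  chrX_start  chrX_end
pars='PAR1 10001 2649520 60001 2699520
PAR2 59034050 59363566 154931044 155260560'

# For each PAR, print: name, length on chrY, length on chrX, chrX-minus-chrY offset.
par_summary() {
    awk '{ printf "%s\t%d\t%d\t%d\n", $1, $3 - $2 + 1, $5 - $4 + 1, $4 - $2 }'
}

echo "$pars" | par_summary
```

The two length columns agree for each PAR, as expected for exact duplicates. Note that these are hg19 coordinates, while the intervals in the question (chrY:10001-2781479 and chrY:56887903-57217415) are on a different build, so the offsets here should not be applied to them.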

And, as you found in your other comment, the chrY copy is masked.

As a biological note, this high similarity is what gives them the name "pseudoautosomal regions".

The PARs are so similar that they can recombine (X can cross over with Y), which makes these regions behave like the autosomes (chr1-22).
