Question

ChrY in 1000G vs hg19

1

12.5 years ago

Gabriel R. ★ 2.9k

I looked at the Y chromosome in hg19 and 1000G, and they seem to differ despite having the same number of characters. Has anybody noticed this? Why do they differ?

genome chromosome • 5.2k views

ADD COMMENT • link updated 12.5 years ago by Neilfws 49k • written 12.5 years ago by Gabriel R. ★ 2.9k

1

You should link to the source of the data in each case so we can look at it. However: hg19 is a consensus sequence, whereas 1000G comprises sequences from many individuals. So it's not surprising that they differ, since the goal of 1000G is precisely to understand variation. There is in fact no single "Y chromosome in 1000G."

ADD REPLY • link 12.5 years ago by Neilfws 49k

0

1000g: ftp://ftp.sanger.ac.uk/pub/1000genomes/tk2/main_project_reference/
UCSC: http://hgdownload.cse.ucsc.edu/goldenpath/hg19/chromosomes/

ADD REPLY • link 12.5 years ago by Gabriel R. ★ 2.9k

0

OK, now I see that you are referring to the reference sequences used by the 1000G project.

ADD REPLY • link 12.5 years ago by Neilfws 49k

1

You should at least point out one base-pair difference to support your argument. So far as I know, they are the same. EDIT: I was wrong. They are different. We should use the 1000g genome if possible.

ADD REPLY • link 12.5 years ago by lh3 33k

score 7 · Answer 1 · 2012-11-28

7

12.5 years ago

Neilfws 49k

1000G uses sequences from Ensembl (see the README at the location in your FTP link).

It seems that Ensembl has a slightly different procedure for inserting Ns into the sequence scaffolds. The issue is discussed in this mailing list thread.
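
To see where two equal-length chrY files actually disagree, something like the following rough Python sketch could help. It groups differing positions into runs and labels each run by whether one side is N there. The function names and labels are made up for illustration, and it assumes each file holds a single FASTA record. It upper-cases everything first, since UCSC files soft-mask repeats in lowercase and that alone would show up as differences.

```python
from itertools import groupby


def read_fasta_sequence(path):
    """Return the upper-cased sequence of a single-record FASTA file (assumed layout)."""
    parts = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith(">"):
                parts.append(line.strip().upper())
    return "".join(parts)


def classify(x, y):
    """Label one aligned pair of bases."""
    if x == y:
        return "same"
    if x == "N":
        return "N_only_in_first"
    if y == "N":
        return "N_only_in_second"
    return "mismatch"


def diff_runs(first, second):
    """Return (start, end, kind) runs, 0-based half-open, where the sequences differ."""
    if len(first) != len(second):
        raise ValueError("sequences must have equal length")
    runs = []
    pos = 0
    for kind, group in groupby(classify(x, y) for x, y in zip(first, second)):
        length = sum(1 for _ in group)
        if kind != "same":
            runs.append((pos, pos + length, kind))
        pos += length
    return runs


# Toy example, not real chrY data.
print(diff_runs("ACGTNNNNACGT", "ACGTACGTACGA"))
```

On the real files you would call `read_fasta_sequence` on each chrY FASTA and pass the two strings to `diff_runs`. If the difference is only in N placement, every run should be one of the two `N_only_...` kinds and none should be `mismatch`.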

ADD COMMENT • link 12.5 years ago by Neilfws 49k

2

Editing my own comment: I see. I made the build 36 version of the genome for 1000G. At that time, this difference existed. A colleague later told me that UCSC has switched to the Ensembl approach since hg19, but UCSC still keeps the pseudoautosomal regions on chrY. This is the wrong decision. I would discourage using the UCSC genome for mapping.
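
The pseudoautosomal point can be checked on any chrY file: if the PARs are hard-masked, those intervals should be all N. Below is a minimal sketch with hypothetical helper names and toy coordinates. The real PAR coordinates are deliberately not filled in here and should be taken from the assembly documentation for your build.

```python
def n_fraction(sequence, start, end):
    """Fraction of N bases in the 1-based inclusive interval [start, end]."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("interval outside sequence")
    window = sequence[start - 1:end].upper()
    return window.count("N") / len(window)


def mask_interval(sequence, start, end):
    """Return a copy with the 1-based inclusive interval replaced by N."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("interval outside sequence")
    return sequence[:start - 1] + "N" * (end - start + 1) + sequence[end:]


# Toy stand-ins for a chrY sequence and a PAR interval (not real coordinates).
toy_chr_y = "ACGTACGTACGTACGT"
toy_par = (5, 8)

masked = mask_interval(toy_chr_y, *toy_par)
print(n_fraction(toy_chr_y, *toy_par))  # unmasked toy interval
print(n_fraction(masked, *toy_par))     # same interval after masking
```

Masking keeps the sequence length unchanged, which would be consistent with two chrY files having the same number of characters while still differing in content.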