Hi, I am trying to compare the 1000 Genomes Phase 3 GRCh37 (hg19) VCF file with the GRCh38 (low coverage) VCF file.
I noticed that one sample, NA18498, is present in the GRCh37 file but not in the GRCh38 file. I could not find any documentation or other source that confirms this. Does anyone know why this sample is missing from the GRCh38 VCF file, and where I can find more details about it?
I have also noticed that the GRCh38 file contains 45 additional samples, such as HG00270, HG03398 and HG03393. I am not sure why they were introduced.
Thank you for your help.
Regards,
Zhang
Resources:
GRCh37 (hg19) 1000 Genomes: http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/
GRCh38 1000 Genomes (low coverage): http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20190312_biallelic_SNV_and_INDEL/
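For anyone who wants to reproduce the comparison, here is a minimal sketch of how the sample lists can be diffed. It assumes you have downloaded one VCF from each release locally. The function names and the file names in the usage comment are hypothetical placeholders, not names from the FTP site. It only reads the #CHROM header line, so it does not need to parse any genotype records.

```python
import gzip


def read_vcf_samples(path):
    """Return the sample IDs listed on the #CHROM header line of a VCF."""
    # bgzip-compressed VCFs are gzip-compatible, so gzip.open can read them.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first 9 tab-separated columns are fixed
                # (CHROM ... FORMAT); everything after them is a sample ID.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached data records without seeing a header line
    raise ValueError(f"no #CHROM header line found in {path}")


def compare_samples(path_a, path_b):
    """Return (samples only in A, samples only in B), each sorted."""
    samples_a = set(read_vcf_samples(path_a))
    samples_b = set(read_vcf_samples(path_b))
    return sorted(samples_a - samples_b), sorted(samples_b - samples_a)


# Example usage (hypothetical local file names):
# only_grch37, only_grch38 = compare_samples("grch37.chr22.vcf.gz",
#                                            "grch38.chr22.vcf.gz")
# print(len(only_grch37), only_grch37)
# print(len(only_grch38), only_grch38)
```

Since every chromosome file within a release should carry the same sample columns, comparing a single small chromosome from each release should be enough.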
Yes, but the GRCh37 (hg19) release also seems to contain only unrelated samples, and NA18498 is included both in the GRCh37 release and in the GRCh38 1000 Genomes (high coverage) unrelated panel.