The VCFs in question are here. If I'm not mistaken, they contain basic information for many SNPs in hg38, including rsid, chromosome, position, ref, and alt. My question is just about the formatting of the chromosome (CHR) names. They are formatted as follows:
NC_000001.11 NC_000002.12 NC_000003.12 NC_000004.12 NC_000005.10 NC_000006.12 NC_000007.14 NC_000008.11 NC_000009.12 NC_000010.11 NC_000011.10 NC_000012.12 NC_000013.11 NC_000014.9 NC_000015.10 NC_000016.10 NC_000017.11 NC_000018.10 NC_000019.10 NC_000020.11 NC_000021.9 NC_000022.11 NC_000023.11 NC_000024.10 NC_012920.1 and so on
If I am not mistaken, these simply refer to the 22 autosomes, the 2 sex chromosomes, and the mitochondrial DNA, respectively, correct? What is this format called?
Hi
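You are correct. These are RefSeq accession numbers in accession.version format: the part before the dot identifies the sequence, and the number after the dot is its version. NC_000001 to NC_000022 are the autosomes, NC_000023 is X, NC_000024 is Y, and NC_012920 is the mitochondrial genome. You can find a list of all the latest accession numbers here. You can also look them up here.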
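If it helps, here is a minimal Python sketch of how the accessions from the question could be translated into plain chromosome names. The function name `refseq_to_chrom` and the choice of "MT" as the mitochondrial label are my own assumptions for illustration, not anything defined by the VCFs or by RefSeq. Anything that is not one of the primary chromosome accessions (the "and so on" part, e.g. unplaced contigs) is deliberately left unmapped.

```python
import re

# Matches accessions such as "NC_000001.11": six digits, a dot, then the version.
_NC_PATTERN = re.compile(r"^NC_(\d{6})\.(\d+)$")


def refseq_to_chrom(accession):
    """Return a plain chromosome name for a human RefSeq NC_ accession.

    Returns None for anything that is not a primary chromosome or the
    mitochondrial genome. The version suffix is ignored here, so the
    result does not depend on the assembly release.
    """
    match = _NC_PATTERN.match(accession)
    if match is None:
        return None
    number = int(match.group(1))
    if 1 <= number <= 22:
        return str(number)      # autosomes
    if number == 23:
        return "X"
    if number == 24:
        return "Y"
    if number == 12920:
        return "MT"             # label is an assumption; some tools use "M"
    return None


if __name__ == "__main__":
    # Print "old_name<TAB>new_name" pairs for a few accessions from the question.
    for acc in ["NC_000001.11", "NC_000023.11", "NC_000024.10", "NC_012920.1"]:
        print(f"{acc}\t{refseq_to_chrom(acc)}")
```

The two-column output could serve as a starting point for a rename table. I believe `bcftools annotate --rename-chrs` accepts a file in this "old name, new name" layout, but please check the bcftools documentation before relying on that.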