How to get the SNP statistics for Hg19
I need to perform some mathematical analyses on human SNPs. I need information such as the number of SNPs on each chromosome and the number of SNPs shared between different ethnicities. How can I obtain such information?
Number of SNPs on each chromosome, using the UCSC public MySQL server (hg19, table snp147):
$ mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -D hg19 -e 'select chrom,count(*) from snp147 group by chrom'
+-----------------------+----------+
| chrom | count(*) |
+-----------------------+----------+
| chr1 | 11923447 |
| chr10 | 7118919 |
| chr11 | 7361809 |
| chr11_gl000202_random | 66 |
| chr12 | 7060356 |
| chr13 | 5018535 |
| chr14 | 4787993 |
| chr15 | 4423574 |
| chr16 | 4993158 |
| chr17 | 4456328 |
| chr17_ctg5_hap1 | 54172 |
| chr17_gl000203_random | 278 |
| chr17_gl000204_random | 162 |
| chr17_gl000205_random | 3903 |
| chr17_gl000206_random | 15 |
| chr18 | 3984745 |
| chr18_gl000207_random | 62 |
| chr19 | 3654726 |
| chr19_gl000208_random | 2215 |
| chr19_gl000209_random | 320 |
| chr1_gl000191_random | 35 |
| chr1_gl000192_random | 275 |
| chr2 | 12691716 |
| chr20 | 3361527 |
| chr21 | 2016071 |
| chr21_gl000210_random | 4 |
| chr22 | 2076346 |
| chr3 | 10453846 |
| chr4 | 10046674 |
| chr4_ctg9_hap1 | 34359 |
| chr4_gl000193_random | 2989 |
| chr4_gl000194_random | 3602 |
| chr5 | 9362401 |
| chr6 | 8975248 |
| chr6_apd_hap1 | 146264 |
| chr6_cox_hap2 | 312212 |
| chr6_dbb_hap3 | 269424 |
| chr6_mann_hap4 | 264315 |
| chr6_mcf_hap5 | 256896 |
| chr6_qbl_hap6 | 291958 |
| chr6_ssto_hap7 | 264812 |
| chr7 | 8467782 |
| chr7_gl000195_random | 6219 |
| chr8 | 8105834 |
| chr8_gl000196_random | 2 |
| chr8_gl000197_random | 1 |
| chr9 | 6487426 |
| chr9_gl000198_random | 1995 |
| chr9_gl000199_random | 2891 |
| chr9_gl000200_random | 20 |
| chr9_gl000201_random | 63 |
| chrM | 1579 |
| chrUn_gl000211 | 1536 |
| chrUn_gl000212 | 1793 |
| chrUn_gl000213 | 25 |
| chrUn_gl000214 | 1476 |
| chrUn_gl000215 | 33 |
| chrUn_gl000216 | 567 |
| chrUn_gl000217 | 165 |
| chrUn_gl000218 | 2879 |
| chrUn_gl000219 | 1911 |
| chrUn_gl000220 | 2556 |
| chrUn_gl000221 | 1223 |
| chrUn_gl000222 | 104 |
| chrUn_gl000223 | 12 |
| chrUn_gl000224 | 1793 |
| chrUn_gl000225 | 1437 |
| chrUn_gl000226 | 456 |
| chrUn_gl000227 | 6 |
| chrUn_gl000228 | 29 |
| chrUn_gl000229 | 124 |
| chrUn_gl000230 | 694 |
| chrUn_gl000231 | 171 |
| chrUn_gl000232 | 403 |
| chrUn_gl000233 | 511 |
| chrUn_gl000234 | 373 |
| chrUn_gl000235 | 198 |
| chrUn_gl000236 | 1 |
| chrUn_gl000237 | 140 |
| chrUn_gl000238 | 21 |
| chrUn_gl000239 | 135 |
| chrUn_gl000240 | 218 |
| chrUn_gl000241 | 164 |
| chrUn_gl000242 | 1 |
| chrUn_gl000243 | 128 |
| chrUn_gl000244 | 1 |
| chrUn_gl000245 | 7 |
| chrUn_gl000246 | 3 |
| chrUn_gl000247 | 36 |
| chrUn_gl000248 | 3 |
| chrUn_gl000249 | 1 |
| chrX | 5675522 |
| chrY | 375657 |
+-----------------------+----------+
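The table above also lists alternate haplotypes (`_hap`), unlocalized (`_random`) and unplaced (`chrUn_`) sequences. If only the primary chromosomes matter, the output can be filtered and summed afterwards. The following is a minimal sketch, not part of the original answer: the function name `primary_snp_counts` is hypothetical, and it assumes tab-separated `chrom<TAB>count` lines on stdin, which is what `mysql` prints when the standard `-N` (skip column names) and `-B` (batch) options are added to the command above.

```shell
# Keep only primary chromosomes (chr1-22, chrX, chrY, chrM) from
# tab-separated "chrom<TAB>count" lines on stdin, then print a grand total.
# Example (hypothetical usage, requires network access to UCSC):
#   mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -D hg19 -N -B \
#     -e 'select chrom,count(*) from snp147 group by chrom' | primary_snp_counts
primary_snp_counts() {
  awk -F '\t' '
    # Names with an underscore (haplotypes, random, chrUn) do not match.
    $1 ~ /^chr([0-9]+|X|Y|M)$/ { print; total += $2 }
    END { printf "total\t%d\n", total }
  '
}
```

Filtering after the query keeps the SQL unchanged, so the same helper works on a saved copy of the output.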
Maybe this particular data is already described in the literature. I usually query all of the 1000 Genomes VCF files (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/) locally to get similar metrics. I find bcftools very helpful.

Thank you, I will try this.
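As a rough illustration of the "shared between ethnicities" part of the question, here is a sketch that counts, per chromosome, the sites seen in two populations. It is an assumption-laden example rather than the commenter's actual workflow: the function name `shared_sites_per_chrom` is hypothetical, it reads plain VCF text on stdin (for instance streamed from `bcftools view` or `zcat`), and it assumes per-population allele-frequency INFO tags such as `EUR_AF` and `AFR_AF`, as found in the 1000 Genomes 20130502 release files. Check the header of your own files for the exact tag names.

```shell
# Count, per chromosome, VCF records whose alternate alleles are observed
# (allele frequency > 0) in both of two populations.
# Arguments: the two INFO tag names, e.g. EUR_AF AFR_AF (assumed names).
# Example (hypothetical usage):
#   bcftools view some.sites.vcf.gz | shared_sites_per_chrom EUR_AF AFR_AF
shared_sites_per_chrom() {
  awk -F '\t' -v a="$1" -v b="$2" '
    # Return 1 if any comma-separated value of the given INFO tag is > 0.
    function seen(info, tag,    n, i, kv, m, j, vals) {
      n = split(info, kv, ";")
      for (i = 1; i <= n; i++) {
        if (index(kv[i], tag "=") == 1) {
          m = split(substr(kv[i], length(tag) + 2), vals, ",")
          for (j = 1; j <= m; j++) if (vals[j] + 0 > 0) return 1
        }
      }
      return 0
    }
    /^#/ { next }                               # skip header lines
    seen($8, a) && seen($8, b) { shared[$1]++ } # column 8 is INFO
    END { for (c in shared) printf "%s\t%d\n", c, shared[c] }
  ' | sort
}
```

Note that this counts sites, not alleles: at a multiallelic site, the two populations are treated as sharing it even if they carry different alternate alleles.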