Hello
what is the difference between hg18 and hg19?
Thanks
Sara
These are the names/versions of the human genome reference as used by the UCSC Genome Browser. They correspond to NCBI36 (hg18) and GRCh37 (hg19). The current one is hg19 (Human Genome version 19). http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/
If you are asking about the difference in content: hg18 (NCBI36; hg18 is just UCSC's name for it) is an older version of the human genome from ~2006, and hg19 (GRCh37; again, hg19 is just UCSC's name) is a newer one that I'm almost certain was first released in ~2009, although the work doesn't seem to have ended and sub-versions have been published since then. If you want to check the differences yourself, you can go to the GRC human genome website and play with the assembly combo box, watching how the gaps are closed from older to newer versions.
But as I have read in the comments on other answers, if you are asking this because you are weighing your options for short-read alignment, I would definitely not go for hg18. Don't forget that what you sequence is a real genome, and to obtain the most accurate alignment you should use all the genome knowledge available, i.e. the most accurate template for your alignments. That is hg19.
If you are generally interested in what changed from hg18 (build36) to hg19 (build37), you can also refer to this discussion here.
Another important difference which I believe first appeared in hg19 was the inclusion of alternate haplotype assemblies for chr6 (7 haplotypes), chr4 (1 haplotype), and chr17 (1 haplotype). This is important because if you are doing an alignment against hg19 and your sequences come from one of these regions with alternate haplotype assemblies you can get a new kind of (apparently) ambiguous alignment where your sequence aligns equally well to chr6 but also chr6_apd_hap1, chr6_cox_hap2, etc. This may cause problems in existing scripts that are not aware of this issue.
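To make the haplotype issue concrete, here is a minimal Python sketch of how a script might recognise this kind of apparent ambiguity. It assumes the alternate haplotype contigs follow the naming seen above (`chr6_apd_hap1`, `chr6_cox_hap2`); the function names and the regular expression are my own illustration, not part of any aligner.

```python
import re

# Assumption: alternate haplotype contigs are named "<chrom>_<id>_hap<N>",
# as in chr6_apd_hap1 or chr6_cox_hap2.
ALT_HAP_PATTERN = re.compile(r"^(chr[0-9XYM]+)_\w+_hap\d+$")

def primary_chromosome(contig):
    """Return the primary chromosome for an alt haplotype contig, else None."""
    match = ALT_HAP_PATTERN.match(contig)
    return match.group(1) if match else None

def is_haplotype_ambiguous(hit_contigs):
    """True when several distinct contigs all collapse to one primary chromosome."""
    contigs = set(hit_contigs)
    primaries = {primary_chromosome(c) or c for c in contigs}
    return len(contigs) > 1 and len(primaries) == 1
```

A read hitting `chr6`, `chr6_apd_hap1` and `chr6_cox_hap2` would be flagged as haplotype-ambiguous, while a read hitting `chr6` and `chr7` would still count as a genuine multi-mapper.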
https://lists.soe.ucsc.edu/pipermail/genome-announce/2009-April/000161.html
http://genome.ucsc.edu/cgi-bin/hgGateway?hgsid=242782529&clade=mammal&org=0&db=0
The only difference I know between the GRCh37 (b37) build and hg19 is for chrM. See here:
https://lists.soe.ucsc.edu/pipermail/genome-announce/2009-July/000169.html
@sara I would strongly recommend remapping the reads instead of running liftOver.
Why? And what is the best/most efficient way to do so?
Thanks for the reply. I would like to know: if I map the reads to hg18 and the same reads to hg19, what will change? Thanks
A great number of your genomic coordinates would change between the two mappings. To convert between the two, you can use the liftOver tool.
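For coordinate files (not reads), a liftOver run could be scripted roughly as below. This is a sketch assuming the UCSC `liftOver` binary is on your PATH and takes its usual four arguments (input file, chain file, output file, unmapped file); the Python function names and the file names in the usage note are hypothetical.

```python
import subprocess

def liftover_command(bed_in, chain, bed_out, unmapped):
    """Build the argument list: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", bed_in, chain, bed_out, unmapped]

def run_liftover(bed_in, chain, bed_out, unmapped):
    # Requires the UCSC liftOver binary on PATH; raises if it exits non-zero.
    subprocess.run(liftover_command(bed_in, chain, bed_out, unmapped), check=True)
```

For example, `run_liftover("peaks_hg18.bed", "hg18ToHg19.over.chain.gz", "peaks_hg19.bed", "unmapped.bed")` would write the converted intervals to `peaks_hg19.bed` and the ones that could not be converted to `unmapped.bed`.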
I am not aware of a list of all the changes (a diff) between the two builds. One could parse the two fasta files to compare and contrast the builds. You can also use a tool called liftOver from UCSC to "lift" coordinates between the two builds. As liftOver reports its failures, that gives you a rough feeling for whether a region pretty much stayed the same or changed (e.g. was deleted in the other build).
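As a rough idea of what parsing the two fasta files could look like, here is a small Python sketch that records the length and the number of N (gap) bases per sequence and then lists the sequences that differ. It is only a coarse summary, not a true diff, and the function names are made up for illustration.

```python
def fasta_stats(lines):
    """Map each sequence name to (length, count of N bases) from FASTA lines."""
    stats = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            stats[name] = (0, 0)
        elif name is not None:
            length, n_count = stats[name]
            stats[name] = (length + len(line), n_count + line.upper().count("N"))
    return stats

def compare_builds(old_stats, new_stats):
    """Yield (name, old, new) for sequences that differ or exist in only one build."""
    for name in sorted(set(old_stats) | set(new_stats)):
        old, new = old_stats.get(name), new_stats.get(name)
        if old != new:
            yield name, old, new
```

You would call `fasta_stats` on an open file handle for each build (for example files you have saved as `hg18.fa` and `hg19.fa`, names assumed) and pass the two results to `compare_builds`. Sequences present in only one build, such as the alternate haplotypes, show up with `None` on the other side.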
I agree with @Ih3
it is definitely worth it