Difference In Genome Builds
13.7 years ago
Dataminer ★ 2.8k

Can a difference in gene build influence one's analysis?

Genome builds are updated regularly, so is an analysis performed on an older build still of any significance?

Example: will an analysis performed 2 years ago on build hg16 still bear any significance today?

genome • 5.1k views

Please note that a gene build is not the same as a genome build. A gene build is a set of gene annotations. A genome build is another word for a genome assembly (for human, e.g. GRCh37/hg19, NCBI36/hg18, etc.). As Ian mentioned, it is indeed very important to know which genome assembly your data are annotated on. You can map coordinates between builds with the UCSC liftOver tool or the Ensembl Assembly Converter (http://www.ensembl.org/tools.html).
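To make the idea of mapping coordinates between builds concrete, here is a minimal sketch of what a converter does conceptually: it looks up the aligned block that contains a position and applies that block's offset. This is not the liftOver or Ensembl implementation. The `Block` type, the `lift_position` function and the `TOY_BLOCKS` numbers are all hypothetical and invented for illustration.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    """One ungapped aligned block between an old and a new assembly."""
    old_start: int  # 0-based, inclusive
    old_end: int    # 0-based, exclusive
    new_start: int  # where old_start lands in the new assembly


# Hypothetical blocks for a single chromosome, sorted by old_start.
# The numbers are made up; they are not real hg18/hg19 alignment data.
TOY_BLOCKS = [
    Block(0, 1000, 0),        # region unchanged between builds
    Block(1000, 5000, 1200),  # 200 bp were inserted upstream in the new build
    Block(5500, 9000, 5700),  # old 5000-5500 has no counterpart in the new build
]


def lift_position(pos: int, blocks: List[Block]) -> Optional[int]:
    """Map a 0-based position to the new assembly, or return None if unmapped."""
    starts = [b.old_start for b in blocks]
    i = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if i < 0:
        return None
    block = blocks[i]
    if pos >= block.old_end:
        return None  # position falls in a gap between aligned blocks
    return block.new_start + (pos - block.old_start)


print(lift_position(500, TOY_BLOCKS))   # 500
print(lift_position(1500, TOY_BLOCKS))  # 1700
print(lift_position(5200, TOY_BLOCKS))  # None
```

The `None` case is the practical point: some positions in an old build simply have no equivalent in the new one, so a converted dataset can be smaller than the original.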


Thank you for the correction.

13.7 years ago

The coordinates have changed between the two builds. The following mysql query shows that only 2785 SNPs, all mapped on chr17, have the same coordinates between hg18 and hg19.

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg18

mysql> select A.chrom,count(*) from hg18.snp130 as A, hg19.snp130 as B where A.weight=1 and B.weight=1 and A.name=B.name and A.chrom=B.chrom and A.chromStart=B.chromStart and A.chromEnd=B.chromEnd group by A.chrom;
+-------+----------+
| chrom | count(*) |
+-------+----------+
| chr17 |     2785 | 
+-------+----------+
1 row in set (8 min 29.72 sec)

But the other SNPs have been mapped to different coordinates:

mysql> select A.chrom,B.chrom,count(*) from hg18.snp130 as A, hg19.snp130 as B where A.weight=1 and B.weight=1 and A.name=B.name and NOT(A.chrom=B.chrom and A.chromStart=B.chromStart and A.chromEnd=B.chromEnd) group by A.chrom,B.chrom;
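For readers without access to the UCSC MySQL server, the logic of the two queries can be sketched offline in Python. The dictionaries below are hypothetical placeholders (the names and positions are invented, not real snp130 rows), and the sketch skips the `weight=1` filter used in the SQL.

```python
from collections import Counter

# Hypothetical rows: SNP name -> (chrom, chromStart, chromEnd).
# Invented placeholders, not real snp130 records.
hg18_snps = {
    "rsA": ("chr1", 100, 101),
    "rsB": ("chr1", 250, 251),
    "rsC": ("chr17", 900, 901),
    "rsD": ("chr2", 400, 401),
}
hg19_snps = {
    "rsA": ("chr1", 130, 131),   # shifted on the same chromosome
    "rsB": ("chr3", 250, 251),   # placed on another chromosome
    "rsC": ("chr17", 900, 901),  # identical in both builds
    "rsE": ("chr5", 10, 11),     # present only in the newer table
}


def compare_builds(old, new):
    """Count SNPs with identical coordinates per chromosome, and SNPs whose
    coordinates differ per (old chrom, new chrom) pair."""
    same = Counter()
    moved = Counter()
    # Joining on the SNP name mirrors "A.name=B.name" in the SQL.
    for name in old.keys() & new.keys():
        if old[name] == new[name]:
            same[old[name][0]] += 1
        else:
            moved[(old[name][0], new[name][0])] += 1
    return same, moved


same, moved = compare_builds(hg18_snps, hg19_snps)
print(sorted(same.items()))
print(sorted(moved.items()))
```

SNPs that exist in only one of the two tables (`rsD` and `rsE` here) drop out of the join, just as they do in the SQL.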

13.7 years ago
Ian 6.1k

I may have misread your question, but it is a very important point that genome coordinates can change between genome builds, e.g. from hg17 to hg19. However, you can use the UCSC liftOver tool to convert coordinates between builds.

One should always check what genome build was used for a particular dataset, especially published data.
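As one way of doing that check, some file formats record the reference in their header. The sketch below assumes the dataset is a VCF file, which the thread does not mention, and looks for the `##reference=` line and the `assembly=` field of `##contig=` lines. The function name `find_build_hints` and the example header (including the path and the contig length) are hypothetical.

```python
import re


def find_build_hints(header_lines):
    """Collect strings from VCF-style header lines that hint at the genome build."""
    hints = []
    for line in header_lines:
        if not line.startswith("##"):
            break  # metadata lines end at the first line without a "##" prefix
        if line.startswith("##reference="):
            hints.append(line[len("##reference="):].strip())
        elif line.startswith("##contig="):
            match = re.search(r"assembly=([^,>]+)", line)
            if match:
                hints.append(match.group(1).strip())
    return hints


# Hypothetical header, written by hand for this example.
example_header = [
    "##fileformat=VCFv4.2",
    "##reference=file:///path/to/hg19.fa",
    "##contig=<ID=chr17,length=1000,assembly=hg19>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
print(find_build_hints(example_header))  # ['file:///path/to/hg19.fa', 'hg19']
```

An empty result does not mean the build is unknown, only that the header does not state it; in that case the methods section of the publication is the place to look.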


I would say it is still valid. The only caveat is that there may be regions of the genome that have been refined, removed, etc. in newer builds.


The point I want to raise is whether research carried out on older genomic coordinates is still valid or not.

13.7 years ago

The conclusions of an analysis shouldn't differ much between genome builds if you are dealing with genes, since these are fairly conserved regions that should be well covered from one genome build to the next. You may run into difficulties with intergenic regions, where insertions, deletions, swaps, etc. may be introduced or removed by genome updates.

What definitely would change from one build to another are the annotation coordinates, if those coordinates are genomic. If you were working with cDNA coordinates in your previous experiment, you may be lucky enough not to have them change, although our experience is that updating the genome build almost always implies updating all the annotation we previously had.
