If you have a set of coordinates to a specific genome build, are there any decent tools/approaches for porting those coordinates to a new build?
For UCSC data just follow the suggestions here:
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download28
To be honest, though, it seems more common to stick to one assembly for the duration of the work (that is anecdotal experience). As long as you always state the build you used throughout the data release and publication process, I don't see this being a problem. Of course, YMMV if what you are working on varies greatly between builds.
Use LiftOver http://genome.ucsc.edu/cgi-bin/hgLiftOver
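LiftOver accepts BED-formatted input, which uses 0-based, half-open intervals. If your coordinates are 1-based and inclusive (an assumption about your data, so check first), a minimal sketch of the conversion before pasting or uploading could look like this. The function name and the example region are made up for illustration.

```python
def to_bed_line(chrom, start_1based, end_1based, name="."):
    """Format a 1-based, inclusive interval as a 4-column BED line.

    BED starts are 0-based and BED ends are exclusive, so only the
    start is shifted by one; the end value stays the same.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return f"{chrom}\t{start_1based - 1}\t{end_1based}\t{name}"


# Hypothetical region, for illustration only.
print(to_bed_line("chr1", 1000, 2000, "region1"))
```

If your coordinates are already in BED form, skip the conversion; shifting them a second time would put every start off by one.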
I also remember using the mapping information from dbSNP or UniSTS for both builds to find how a genomic segment 'moved'.
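As a rough sketch of that marker-based idea: take the markers that fall inside the segment on the old build, look up the same markers on the new build, and report the range they span. The marker IDs and positions below are invented; real ones would come from the dbSNP or UniSTS mapping files for each build.

```python
# Hypothetical marker positions (chromosome, position) on two builds.
old_build = {
    "marker_a": ("chr1", 1000),
    "marker_b": ("chr1", 1500),
    "marker_c": ("chr1", 2200),
    "marker_d": ("chr2", 500),
}
new_build = {
    "marker_a": ("chr1", 1120),
    "marker_b": ("chr1", 1620),
    "marker_c": ("chr1", 2320),
    "marker_d": ("chr2", 480),
}


def segment_in_new_build(chrom, start, end, old_positions, new_positions):
    """Estimate where a segment moved, using markers placed on both builds.

    Returns (chrom, min_pos, max_pos, marker_ids) on the new build, or None
    if no shared marker lies in the segment or the markers disagree on the
    chromosome.
    """
    inside = [
        marker
        for marker, (c, pos) in old_positions.items()
        if c == chrom and start <= pos <= end and marker in new_positions
    ]
    if not inside:
        return None
    new_chroms = {new_positions[m][0] for m in inside}
    if len(new_chroms) != 1:
        return None  # markers ended up on different chromosomes
    coords = [new_positions[m][1] for m in inside]
    return (new_chroms.pop(), min(coords), max(coords), sorted(inside))


print(segment_in_new_build("chr1", 900, 1600, old_build, new_build))
```

Note that this only brackets the segment by its markers; it does not give the exact new end points, and it says nothing about inversions or rearrangements inside the segment.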
Moreover, if you store this information in a DB, or in any file, always include a column for the build or else your data will be a mess when you later have a look at it.
Pierre
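To illustrate the point above about always recording the build, here is a minimal sketch using an in-memory SQLite table. The table name, column names, and coordinates are hypothetical; the only point is that the build is part of the key, so the same feature can be stored once per assembly without ambiguity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE feature (
        name        TEXT    NOT NULL,
        build       TEXT    NOT NULL,  -- e.g. 'hg18', 'hg19'
        chrom       TEXT    NOT NULL,
        chrom_start INTEGER NOT NULL,
        chrom_end   INTEGER NOT NULL,
        PRIMARY KEY (name, build)
    )
    """
)

# Invented coordinates for the same feature on two builds.
rows = [
    ("region1", "hg18", "chr1", 1000, 2000),
    ("region1", "hg19", "chr1", 1120, 2120),
]
conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()


def coordinates_for(connection, name, build):
    """Return (chrom, start, end) for a feature on one build, or None."""
    return connection.execute(
        "SELECT chrom, chrom_start, chrom_end FROM feature "
        "WHERE name = ? AND build = ?",
        (name, build),
    ).fetchone()


print(coordinates_for(conn, "region1", "hg19"))
```

The same idea applies to flat files: add a build column (or at least put the build in a header line) rather than relying on the file name.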
The Galaxy platform http://main.g2.bx.psu.edu/ integrates LiftOver and is excellent for working with coordinate-based data, not least because the data itself can be assigned species and build information in the metadata.
A new NCBI service was announced today: "NCBI Genome Remapping Service"
I was just going to mention that! Here's the tweet if anyone wants to re-tweet it: http://twitter.com/NCBI/statuses/26097191403
This post details various options for coordinate conversions between builds: Converting Genome Coordinates From One Genome Version To Another (UCSC LiftOver, NCBI Remap, Ensembl API)