Question

Genomic coordinates conversion tools - can we trust them?

1


8.3 years ago

armarkes ▴ 10

Hi folks,

Could you please help me with an issue?

I have a huge file with CNV data that was obtained from a GWAS study using PennCNV and other tools, and these variants were mapped to hg18.

Now I would like to convert the genomic coordinates of these predicted CNVs from hg18 to hg19 or hg38 to get an updated annotation. However, with different conversion tools we get strange results: one CNV is split into two, or genes that were in a CNV disappear in the new annotation. Taking into account the limitations of such tools, do you believe it makes sense to convert these results? Or are we transforming the data in a way that, in the end, is not true?

Can you comment on this, please? Do you agree that this makes sense?

Thanks a lot.

Best regards, Ana

gene genome Assembly • 2.5k views

updated 8.3 years ago by Brian Bushnell 20k • written 8.3 years ago by armarkes ▴ 10

0


We used Remap as well, and we also tried liftOver from UCSC.

My concern is whether we are distorting the data by doing this.

In Remap, for example, some CNVs are converted into different CNVs; one is split into two (with flags such as "first pass" and "second pass"). Does it make sense to keep only "first pass"?

8.3 years ago by armarkes ▴ 10

0


Please use the "Add a comment" or "add reply" button to post a reply rather than posting a reply as an answer. There's a huge, tempting box at the bottom, but that's only for answers that resolve the initial question.

8.3 years ago by Brian Bushnell 20k

score 2 · Answer 1 · 2017-03-02

I have written tools to translate variants from one genome version to another, using liftover files. They work very well 99% of the time. But if a human has 3 million variants, the 1% that don't work well are still 30,000 variants that you need to chase down individually, only to discover that they are due to fundamental differences between the references that liftover cannot compensate for.

Again, liftover is great 99% of the time. But if you want accurate results, you absolutely need to map all reads to the same reference and call variants against that same reference. Liftover will never reproduce the results of this (optimal) methodology. Liftover is not bad - it's very useful in some circumstances - but it is never optimal.
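To make the failure cases concrete, here is a minimal toy sketch of why an interval such as a CNV can come out of a coordinate conversion split in two, or shorter than it went in. It does not use any real chain file or tool: the `Block` type, the `BLOCKS` data, and the functions `lift_interval` and `lifted_fraction` are all hypothetical names made up for illustration, and everything is assumed to be on one chromosome and one strand.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """One ungapped aligned block between two assemblies (made-up toy data)."""
    old_start: int  # 0-based start in the old assembly
    new_start: int  # 0-based start in the new assembly
    size: int       # block length in bases


# Hypothetical alignment of one chromosome:
#  - 500 bases were inserted in the new assembly after old position 1000
#  - old positions 2000-2300 have no counterpart in the new assembly
BLOCKS = [
    Block(old_start=0, new_start=0, size=1000),
    Block(old_start=1000, new_start=1500, size=1000),
    Block(old_start=2300, new_start=2500, size=700),
]


def lift_interval(start: int, end: int, blocks: List[Block]) -> List[Tuple[int, int]]:
    """Lift the half-open interval [start, end) from old to new coordinates.

    Returns the lifted pieces; pieces that end up adjacent in the new
    assembly are merged into one.
    """
    pieces: List[Tuple[int, int]] = []
    for b in sorted(blocks, key=lambda blk: blk.old_start):
        lo = max(start, b.old_start)
        hi = min(end, b.old_start + b.size)
        if lo >= hi:
            continue  # this block does not overlap the interval
        offset = b.new_start - b.old_start
        new_lo, new_hi = lo + offset, hi + offset
        if pieces and pieces[-1][1] == new_lo:
            pieces[-1] = (pieces[-1][0], new_hi)  # adjacent: extend last piece
        else:
            pieces.append((new_lo, new_hi))
    return pieces


def lifted_fraction(start: int, end: int, pieces: List[Tuple[int, int]]) -> float:
    """Fraction of the original bases that have a position in the new assembly."""
    return sum(hi - lo for lo, hi in pieces) / (end - start)


if __name__ == "__main__":
    for cnv in [(200, 800), (800, 1200), (1900, 2500)]:
        pieces = lift_interval(*cnv, BLOCKS)
        print(cnv, "->", pieces, "fraction lifted:", lifted_fraction(*cnv, pieces))
```

In this toy data the first CNV converts cleanly, the second comes back as two pieces because new sequence was inserted inside it, and the third comes back as a single piece covering only half of its original bases. Checking the number of pieces and the lifted fraction per CNV is one possible way to flag the records that deserve manual review, rather than trusting or discarding the whole conversion.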

score 0 · Answer 2 · 2017-03-02

0


8.3 years ago

sbk ▴ 60

Hi Ana,

I have used NCBI's remapping service (https://www.ncbi.nlm.nih.gov/genome/tools/remap) quite a few times and have had good results each time. If you can mention the tool you used for the conversion, then maybe people can share their experience with it.

8.3 years ago by sbk ▴ 60