I am developing a tool that takes a gene list as input and outputs a VCF containing fusion variants involving the genes of interest.
I am going to use the Mitelman database, which has a comprehensive curation of fusions from the published literature. However, the breakpoints are given as cytobands, e.g., 19p13, rather than as genomic coordinates such as chr1 10099767.
My questions are:
- Are there any other comprehensive databases you would recommend that give precise breakpoints as genomic coordinates? (I tried COSMIC and TCGA, but their fusion files are not as comprehensive; many well-known fusions seem to be missing from them.)
- If Mitelman turns out to be the most comprehensive database, is there an easy way to find the corresponding genomic coordinates for its fusions?
Thanks!
Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer https://ulib.iupui.edu/databases/mitelman-database-chromosome-aberrations-and-gene-fusions-cancer
Thank you, Pierre, for the answer. I have already downloaded the raw data from the Mitelman website, but the question now is how to convert the breakpoints from cytoband format into genomic coordinates, since that is what I need to generate a VCF. I think I can download a cytoband annotation file to convert between cytobands and genomic coordinates, but I wonder whether there is an easier way.
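For reference, here is a minimal sketch of the annotation-file approach I have in mind. It assumes a tab-separated file in the UCSC `cytoBand.txt` layout (chrom, start, end, band name, stain) and breakpoints written like `19p13`. The function names and the toy rows are my own placeholders; the coordinates are made up and are not real values from any assembly.

```python
import re
from collections import defaultdict

# Toy rows in the UCSC cytoBand.txt layout: chrom, start, end, band, stain.
# ASSUMPTION: these coordinates are invented for illustration only;
# load the real cytoBand file for your genome assembly instead.
TOY_CYTOBAND = (
    "chr19\t0\t1000\tp13.3\tgneg\n"
    "chr19\t1000\t2500\tp13.2\tgpos25\n"
    "chr19\t2500\t3000\tp13.13\tgneg\n"
    "chr19\t3000\t4000\tp12\tgvar\n"
    "chr19\t4000\t5000\tq11\tacen\n"
)

# Matches simple breakpoints such as "19p13", "Xq28" or "1p36.1".
BREAKPOINT_RE = re.compile(r"^(\d{1,2}|X|Y)([pq]\d+(?:\.\d+)?)$")


def load_bands(text):
    """Parse cytoBand-style text into {chrom: [(band, start, end), ...]}."""
    bands = defaultdict(list)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name, _stain = line.split("\t")
        bands[chrom].append((name, int(start), int(end)))
    return bands


def band_to_interval(breakpoint, bands):
    """Return (chrom, start, end) covering every sub-band of the breakpoint.

    A coarse band such as 19p13 is resolved by prefix match, so it spans
    p13.3, p13.2, p13.13, and so on. Returns None if nothing matches.
    """
    match = BREAKPOINT_RE.match(breakpoint.strip())
    if match is None:
        return None
    chrom = "chr" + match.group(1)
    band = match.group(2)
    hits = [(s, e) for name, s, e in bands.get(chrom, []) if name.startswith(band)]
    if not hits:
        return None
    return chrom, min(s for s, _ in hits), max(e for _, e in hits)


bands = load_bands(TOY_CYTOBAND)
print(band_to_interval("19p13", bands))
```

Two caveats with this approach: the result is an interval covering the whole band (often several megabases), not a precise breakpoint, and UCSC coordinates are 0-based half-open while VCF positions are 1-based, so the start would need adjusting before writing a record.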
Thanks,
It wasn't an answer, just a hyperlink, because I had no idea what this is:
"...using is Mitelman which has ..."
Sorry about the confusion, and thanks for providing the hyperlink. I have updated the post by inserting the hyperlink for Mitelman.