I'm trying to run the 20/20+ tool using a set of somatic mutations that have been called against the hg38 reference genome. Could someone please let me know whether the pre-computed scores and the trained classifiers (the Rdata) depend on the reference genome version?
Thanks a lot for the reply! It helps a lot. I didn't mention the gene annotation BED file since that's perhaps easier to get. Shouldn't it work to simply reformat any hg38 transcript file to BED12 format, replace the transcript name with the gene name, and keep only the longest transcript by CDS length? It would be great to actually have the hg38 versions of the required files. Looking forward to that. Thanks once again.
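For concreteness, the reformatting described above could be sketched roughly as follows. This is only an illustration under assumptions: the function names and the `tx_to_gene` mapping (transcript ID to gene name) are hypothetical and not part of 20/20+, the input is assumed to be standard tab-separated BED12, and CDS length is taken as the exon bases falling between thickStart and thickEnd.

```python
from typing import Dict, Iterable, List, Tuple


def cds_length(fields: List[str]) -> int:
    """Count coding bases in one BED12 record.

    Exon blocks are clipped to the thickStart..thickEnd interval
    (columns 7 and 8), which BED12 uses to mark the coding region.
    """
    chrom_start = int(fields[1])
    thick_start, thick_end = int(fields[6]), int(fields[7])
    # blockSizes / blockStarts may carry a trailing comma, so drop empties
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    total = 0
    for size, rel_start in zip(sizes, starts):
        exon_start = chrom_start + rel_start  # blockStarts are relative
        exon_end = exon_start + size
        total += max(0, min(exon_end, thick_end) - max(exon_start, thick_start))
    return total


def longest_cds_per_gene(
    bed12_lines: Iterable[str], tx_to_gene: Dict[str, str]
) -> List[str]:
    """Keep one record per gene (longest CDS) and rename it to the gene name.

    tx_to_gene is a hypothetical transcript-ID -> gene-name mapping that
    the caller must build from their own annotation source.
    """
    best: Dict[str, Tuple[int, str]] = {}
    for line in bed12_lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        gene = tx_to_gene.get(fields[3])
        if gene is None:
            continue  # transcript without a known gene name
        length = cds_length(fields)
        if length == 0:
            continue  # skip non-coding transcripts
        if gene not in best or length > best[gene][0]:
            fields[3] = gene  # replace transcript name with gene name
            best[gene] = (length, "\t".join(fields))
    return [record for _, record in best.values()]
```

Ties are resolved in favour of the first transcript seen, which is an arbitrary choice made for this sketch.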
The issue is that the precomputed scores are annotated against the transcript chosen in the BED file I provide. So I really do have to provide updated data files for all of them together for it to work. Our lab has updated some of the score information for hg38, but it hasn't yet been put into a format usable by 20/20+.
Thanks a lot! That's helpful to know.