Hello, has anybody generated PolyPhen-2 scores for structural variants reported by Manta using VEP? I have run VEP but no scores are reported, and there is no error in my output report. Of note, I am able to generate these scores for called somatic variants. So I am wondering: do we need to use a particular VEP option to analyze structural variants? Thank you
Regarding the question of proteins specifically, most SVs that affect a protein-coding gene impact gene function pretty strongly. Think about a glycine-to-tryptophan substitution: that single change could impact protein function. But altering dozens to hundreds of AAs, e.g. by losing an exon, can result in either gain or loss of function, depending on what is lost; duplicating an exon most commonly results in LoF; and changing the location of a gene may abrogate its co-expression with certain other genes, with untoward effect. In other words, you'd expect a stronger effect from an SV than from an SNV, on average.
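For context on the original question: PolyPhen-2 predicts the effect of single amino acid substitutions, so, as far as I know, VEP only has a score to report for missense small variants and not for Manta's SV records. A minimal Python sketch, assuming a tab-delimited VCF in which SVs carry an `SVTYPE` tag in the INFO column (as Manta output does), separates the two classes so you can see which records could receive a PolyPhen score at all. The function names and the example records are hypothetical.

```python
def is_structural_variant(vcf_line: str) -> bool:
    """Return True if a VCF data line has an SVTYPE key in its INFO column."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the 8th VCF column
    return any(entry.split("=", 1)[0] == "SVTYPE" for entry in info.split(";"))


def split_records(vcf_lines):
    """Split VCF data lines into (structural variants, small variants)."""
    svs, small = [], []
    for line in vcf_lines:
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        (svs if is_structural_variant(line) else small).append(line)
    return svs, small


# Made-up records for illustration only, not real calls.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tsv1\tA\t<DEL>\t.\tPASS\tEND=5000;SVTYPE=DEL;SVLEN=-4000",
    "chr1\t2000\t.\tG\tT\t.\tPASS\tDP=50",
]

svs, small = split_records(example)
print(len(svs), len(small))  # prints "1 1"
```

Only the records in `small` are candidates for PolyPhen annotation, and only those that turn out to be missense; the records in `svs` need SV-level consequence annotation instead.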
In a broader sense, genotype-to-phenotype correlation of SVs is an open problem in biology, recently made much more tractable by the advent of third-generation sequencing. Briefly, in the era of NGS/short-read sequencing, we were 'missing' a substantial proportion of many SV types. This made it difficult to characterize SVs comprehensively. After all, we couldn't characterize what we couldn't detect.
Now that third-generation sequencing technology (nanopore, SMRT) has improved SV ascertainment, more comprehensive and more accurate predictive models of the functional effects of SVs can be expected in the future. This will likely take the confluence of well-annotated phenotypic data on gapless, phased human genomes AND improved in silico tools, e.g. deep learning, to infer likely effects for SVs that have never been seen before.