I am using the Variant Effect Predictor (VEP) to annotate my VarScan results, which come from tumor-normal variant calling on real human cancer patient data. One goal is to identify mutations that overlap known protein domains, so we run VEP with the --domains argument. So far, however, we have not obtained any protein domain information from VEP, even though our test data set contains more than 25,000 missense mutations in known genes. These mutations are correctly annotated with PolyPhen and SIFT (using VEP with the corresponding arguments). I find it implausible that so many mutations would not include a single one overlapping a protein domain. I'm running VEP in offline mode and have downloaded the large data sets, including the PolyPhen and SIFT data. Can you think of any reason why VEP is unable to find overlapping protein domains?
You are right, this is a bit strange. I would check a few cases to confirm that the variants actually fall within domains rather than in unassigned regions, and then contact the VEP team.
Thanks for your reply. Could you please suggest a procedure for manually finding variants that fall within protein domains? We obviously have a large amount of data, and we suppose that most of the variation occurs outside domains.
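One way to spot-check a handful of variants by hand is a simple interval lookup. The sketch below is only an illustration under stated assumptions: it assumes you already have the amino acid position of each variant (for example, from the Protein_position field in the VEP output) and a separately obtained table of domain coordinates per protein (for example, exported from a resource such as InterPro or Ensembl BioMart). The function names, protein IDs, domain names, and coordinates are all made up for the example.

```python
from collections import defaultdict


def build_domain_index(domain_rows):
    """Group (protein_id, start, end, domain_name) rows by protein ID."""
    index = defaultdict(list)
    for protein_id, start, end, name in domain_rows:
        index[protein_id].append((start, end, name))
    return index


def domains_hit(index, protein_id, aa_position):
    """Return names of domains whose 1-based, inclusive range contains aa_position."""
    return [
        name
        for start, end, name in index.get(protein_id, [])
        if start <= aa_position <= end
    ]


# Hypothetical domain table: IDs, coordinates, and names are invented.
domain_rows = [
    ("PROT_A", 10, 120, "domain_1"),
    ("PROT_A", 200, 310, "domain_2"),
    ("PROT_B", 5, 60, "domain_3"),
]

# Hypothetical variants as (protein ID, amino acid position).
variants = [("PROT_A", 57), ("PROT_A", 150), ("PROT_B", 60), ("PROT_C", 12)]

index = build_domain_index(domain_rows)
for protein_id, pos in variants:
    hits = domains_hit(index, protein_id, pos)
    print(protein_id, pos, ", ".join(hits) if hits else "no domain")
```

If a few variants that clearly fall inside a domain by this kind of check still get no domain annotation from VEP, that would be a concrete case to report.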