Does anyone have experience filtering somatic variants from SVTyper-genotyped VCF files produced with LUMPY (WGS, 30-50x)? I ran lumpyexpress as instructed in the manual and genotyped the VCF with SVTyper, both on default settings. There are some issues on GitHub about how to discriminate somatic from germline variants, but I still have not made real progress in filtering for the somatic calls. Has anyone done this?
EDIT: Ryan Layer was kind enough to give this response on GitHub:
Select the variants that are non-reference in your tumor and have no evidence in the normal. You can use SnpSift to do this with something like: GEN[0].GT != 0/0 && GEN[1].AO == 0. The syntax is not exactly right. Check here for the exact details.
With proper SnpSift syntax, that would be the following, given that the tumor column comes before the normal column:
In the absence of matched 'normal' DNA, such as that from leukocytes in the plasma buffy coat or buccal swab DNA, why not just build your own 'in house' database of normal DNA by downloading all 1000 Genomes Phase III FASTQ files, processing them, and then creating a simple lookup to filter out all likely germline variants?
From what I've seen so far, the major cancer centers (in the USA) each have their own 'panel of normals', which they use for filtering.
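The panel-of-normals lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how any particular center implements it: each call is reduced to a `(chrom, start, end, svtype)` tuple, and a tumor call counts as likely germline if a panel call of the same type on the same chromosome has both breakpoints within `slop` base pairs. The function names and the default of 200 bp are hypothetical and would need tuning for real data.

```python
from collections import defaultdict


def build_panel(normal_calls):
    """Index normal-sample SV calls by (chromosome, SV type)."""
    panel = defaultdict(list)
    for chrom, start, end, svtype in normal_calls:
        panel[(chrom, svtype)].append((start, end))
    return panel


def in_panel(panel, call, slop=200):
    """True if a panel call of the same type has both breakpoints within slop bp."""
    chrom, start, end, svtype = call
    return any(
        abs(start - p_start) <= slop and abs(end - p_end) <= slop
        for p_start, p_end in panel.get((chrom, svtype), ())
    )


def remove_likely_germline(panel, tumor_calls, slop=200):
    """Keep only the tumor calls that have no match in the panel of normals."""
    return [call for call in tumor_calls if not in_panel(panel, call, slop)]
```

A linear scan per chromosome and type is fine for a small panel; for a panel built from thousands of genomes, an interval tree or sorted breakpoint index would be a more sensible choice.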