Hello everyone,
I ran somatic variant calling on NGS data from tumor and healthy FFPE samples of a particular type of skin cancer. Many variants are found in only one or a few samples, but have good quality and coverage values. How should I interpret this? Even if it is present in only one sample, should a variant be considered valid? Thanks in advance. Sara
Thanks, this answer gives me relief again!
Certainly, different analyses are needed to verify the role of the mutation in the tumor.
The variant search is just a way to get a list of variants to focus on, right?
It depends on what your project is, but in any tumor entity you will hardly ever (if at all) find variants that occur in every specimen, given the heterogeneity of cancer types. Many of the somatic mutations you find will probably not have any function at all. Function has to be shown by experiment, I guess, unless you have a huge cohort to robustly correlate it with a phenotype.
I'd say: just be sure that all these mutations come from tumors (otherwise you have a problem with your pipeline, such as non-stringent filtering of strand-bias variants or artifact regions) and put everything into CGI to understand what's going on: https://www.cancergenomeinterpreter.org/analysis
Is each sample from a different person? If so this is totally normal. The only reason this would be concerning would be if you had multiple samples per subject and there is <del>very</del> poor agreement within subject. (very minor edit!)
Some mutations are causal and experience selection pressure inside the patient, but most are neutral and occur in random places. Given the vast space of possible mutations, the likelihood of a random, neutral variant appearing more than once would be very small. (Some variants will be more common for selectively neutral reasons to do with DNA mechanics, chromatin, different mutagenic processes, etc.)
IDK about all skin cancers, but melanoma at least has a very high mutation burden in general, meaning lots of random mutagenesis and lots of neutral variants.
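To put a rough number on the "very small" likelihood above, here is a back-of-the-envelope sketch in Python. It assumes neutral mutations land uniformly at random across the genome, which is a simplification (as noted above, real mutagenesis is not uniform), and the mutation count, genome size, and cohort size are illustrative placeholders, not values from this dataset. The function name `prob_seen_again` is made up for this example.

```python
def prob_seen_again(mutations_per_tumor: int, genome_size: int, other_samples: int) -> float:
    """Chance that one given neutral position is also mutated in at least
    one of `other_samples` independent tumors, under a uniform model."""
    # Probability that a single other tumor hits this exact position.
    per_sample = mutations_per_tumor / genome_size
    # Complement of "none of the other tumors hit it".
    return 1.0 - (1.0 - per_sample) ** other_samples


# Illustrative numbers only: 30,000 mutations per tumor, a 3 Gb genome,
# and a cohort of 50 samples (so 49 other samples).
p = prob_seen_again(30_000, 3_000_000_000, 49)
print(f"Probability of recurrence by chance: {p:.2e}")
```

Under these assumptions the chance of any particular neutral variant recurring stays well below 1 in 1,000, so private variants are the expected outcome rather than a warning sign.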
I don't suppose you have matching normal tissue samples (usually blood) per subject? Without that, yes, there are lots of germline variants that "pollute" your results and in general FDR goes way up.
I forgot to specify that the tumor samples have a matched normal and that the germline variants should have been directly excluded by the variant callers (I used VarDict and Mutect2).
So the variants found should be tumor-specific, but probably not all of them have to do with the tumor; they could be errors caused by sequencing or by contamination of the tumor sample, right? Even though in this case the purity is quite high (80%).
Yes, false positives will complicate your life. One common way to limit FDR is to ignore variants with minor allele frequency/altfrac below x%, for x ~ 5%.
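A minimal sketch of that alt-fraction cutoff in Python, assuming you have already extracted ref/alt read counts per call (for example from the caller's VCF output). The `Variant` record, its field names, and the example read counts are hypothetical; the 5% default just mirrors the x ~ 5% mentioned above and should be tuned to your depth and purity.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    # Hypothetical record holding read counts taken from the tumor sample.
    chrom: str
    pos: int
    ref_reads: int
    alt_reads: int


def alt_fraction(v: Variant) -> float:
    """Fraction of reads supporting the alternate allele."""
    depth = v.ref_reads + v.alt_reads
    return v.alt_reads / depth if depth else 0.0


def filter_by_alt_fraction(variants: List[Variant], min_frac: float = 0.05) -> List[Variant]:
    """Keep only calls whose alt fraction reaches the threshold."""
    return [v for v in variants if alt_fraction(v) >= min_frac]


# Made-up example calls, for illustration only.
calls = [
    Variant("chr1", 1000, 94, 6),    # 6% alt fraction, kept
    Variant("chr2", 2000, 198, 2),   # 1% alt fraction, dropped
    Variant("chr7", 3000, 60, 40),   # 40% alt fraction, kept
]
kept = filter_by_alt_fraction(calls)
print([(v.chrom, v.pos) for v in kept])
```

A hard cutoff like this trades sensitivity for specificity: it also discards genuine subclonal variants below the threshold, so it is worth checking how many calls it removes before settling on a value.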
Thanks, this answer gives me relief!