Hi,
I would really appreciate it if anyone could provide some advice or strategies to address the following problem. We have performed targeted exome sequencing on 250 patients, and after processing the data (filtering to MAF < 1% and annotating with ANNOVAR) we have around 270 variants. Each sample carries around 8 variants, and any single variant is present in anywhere from 1 to 240 samples. The variants present in a large number of samples have a MAF < 0.01 or no reported MAF, so they have not been excluded from the list.
Is there a way to cluster these variants based on the patient samples they are present in, in order to determine the causal variants? I have patient information such as the type of cancer, receptor status, grade and stage.
Is there an R package that can use this information to cluster the variants? Kindly provide your suggestions.
It seems you want to link variants to cancer type. If so, look for rare variant association tests in the literature (for example, burden tests or SKAT).
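To make the idea concrete, here is a minimal sketch of a burden-style collapsing test, one of the simplest rare variant association approaches: all rare variants in a chosen set (for example, one gene) are collapsed into a single carrier/non-carrier flag per sample, and that flag is tested against a clinical label with Fisher's exact test. The function names, sample IDs, variant IDs and labels below are invented for illustration and do not come from your data. Since your cohort has no controls, a test like this can only compare patient subgroups (e.g. receptor-positive vs. receptor-negative), and it is not a substitute for a dedicated package (in R, the SKAT package may be a reasonable place to start).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two patient groups, columns are carrier / non-carrier.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x carriers in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def burden_test(variants_by_sample, label_by_sample, variant_set, group):
    """Collapse variant_set into one carrier flag per sample and test it against group."""
    a = b = c = d = 0
    for sample, label in label_by_sample.items():
        is_carrier = bool(variants_by_sample.get(sample, set()) & variant_set)
        if label == group:
            if is_carrier:
                a += 1
            else:
                b += 1
        elif is_carrier:
            c += 1
        else:
            d += 1
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    variants_by_sample = {
        "s1": {"v1"}, "s2": {"v2"}, "s3": {"v1", "v3"}, "s4": {"v9"},
        "s5": set(), "s6": {"v9"}, "s7": set(), "s8": set(),
    }
    label_by_sample = {
        "s1": "ER+", "s2": "ER+", "s3": "ER+", "s4": "ER+",
        "s5": "ER-", "s6": "ER-", "s7": "ER-", "s8": "ER-",
    }
    table, p_value = burden_test(
        variants_by_sample, label_by_sample, {"v1", "v2", "v3"}, "ER+"
    )
    print(table, p_value)
```

In practice you would run this once per gene (or other variant set) and correct the resulting p-values for multiple testing. Variants seen in most of the 250 samples are better checked as possible sequencing or annotation artifacts before being included in any set.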