Hello,
I am new to whole-genome sequencing (WGS) analysis and am looking for suggestions on appropriate analysis strategies. I recently discovered that geneX is a predictive biomarker for a rare neurological disorder. We used single-cell RNA sequencing, bulk RNA sequencing, and proteomics data to identify and confirm geneX as a predictor of an adverse event.
I am now working on identifying variants associated with geneX in 15 samples. Because the disease is rare, these 15 samples were the only ones available for sequencing. I processed the WGS data following the GATK Best Practices and created a VCF file representing the variation across all 15 samples. I then annotated the VCF with VEP; the annotations include gnomAD allele frequency, consequence, SIFT, PolyPhen, etc. I have filtered the variants to retain only those with low allele frequency (<0.03).
For geneX, I see a few variants that are present in most samples. They are not consistently found at a single location in geneX but are spread throughout the gene. I would like to know whether there is any further analysis I could perform. Since the sample size is small, I am not sure association testing would get me anywhere.
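To make the filtering step concrete, here is a minimal sketch of that kind of filter, not the exact commands that were run. It assumes an uncompressed VEP-annotated VCF whose CSQ annotation includes `SYMBOL`, `Consequence`, and a gnomAD frequency subfield; the subfield name `gnomAD_AF` is an assumption and depends on the VEP version and options used. The helper names `parse_csq_format` and `rare_variants_in_gene` are hypothetical, and treating variants with no gnomAD frequency as rare is also an assumption.

```python
import re


def parse_csq_format(header_lines):
    """Return the CSQ subfield names declared in the VEP '##INFO=<ID=CSQ' header."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ"):
            match = re.search(r'Format: ([^">]+)', line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no CSQ header line found; was the VCF annotated with VEP?")


def rare_variants_in_gene(vcf_lines, gene, af_field="gnomAD_AF", max_af=0.03):
    """List (chrom, pos, ref, alt, consequence) for rare variants annotated to `gene`.

    af_field is an assumed CSQ subfield name; adjust it to match your header.
    """
    lines = list(vcf_lines)
    fields = parse_csq_format(l for l in lines if l.startswith("##"))
    kept = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        info = dict(kv.split("=", 1) for kv in cols[7].split(";") if "=" in kv)
        for entry in info.get("CSQ", "").split(","):
            ann = dict(zip(fields, entry.split("|")))
            if ann.get("SYMBOL") != gene:
                continue
            # VEP may join several frequencies with '&'; use the largest one.
            freqs = [float(x) for x in ann.get(af_field, "").split("&") if x]
            # Assumption: a variant with no gnomAD frequency is treated as rare.
            if not freqs or max(freqs) < max_af:
                kept.append((cols[0], int(cols[1]), cols[3], cols[4],
                             ann.get("Consequence", "")))
                break  # record each variant once, even with several transcripts
    return kept
```

With a real file this could be called as `rare_variants_in_gene(open("annotated.vcf"), "geneX")`, where the file name is a placeholder.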
I appreciate all comments, suggestions, and references.
Thanks!