Dear all,
We have received several exome sequences from our collaborators for patients with a polygenic disease. Control samples for a supervised analysis are supposed to be provided later, but the collaborators have asked whether we can already provide some unsupervised analyses of the patient samples in the meantime. The only idea I have so far is to cluster the samples using variants pre-filtered at different quality cut-offs, to see whether a robust hierarchical or partition-based sample clustering can be obtained, or to look for cluster patterns at the level of pathways. However, I expect these clusters to be less clear than those obtained from standard microarray transcriptomics data, and I wonder whether there are other unsupervised analyses of sequencing data (beyond simple quality checks) that could provide more insight than a sample clustering alone.
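To make the clustering idea concrete, here is a minimal sketch of what I have in mind. It assumes the filtered variants have already been encoded as alternate-allele counts per sample (0, 1, 2, with None for a missing call). The function names, the distance measure and the toy data are only illustrative assumptions, not an established pipeline. In practice one would probably use a library implementation and repeat the clustering for each quality cut-off to check whether the groups stay stable.

```python
from itertools import combinations

def genotype_distance(a, b):
    """Mean absolute difference in alt-allele count over sites called in both samples."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("samples share no called sites")
    return sum(abs(x - y) for x, y in pairs) / len(pairs)

def average_linkage(genotypes, k):
    """Naive agglomerative clustering: merge the closest pair of clusters until k remain."""
    clusters = [[name] for name in genotypes]
    # Pre-compute all pairwise sample distances once.
    dist = {frozenset((s, t)): genotype_distance(genotypes[s], genotypes[t])
            for s, t in combinations(genotypes, 2)}

    def linkage(c1, c2):
        # Average distance between all cross-cluster sample pairs.
        return sum(dist[frozenset((s, t))] for s in c1 for t in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical toy data: four patients, five variant sites.
genotypes = {
    "P1": [0, 1, 2, 0, 1],
    "P2": [0, 1, 2, 0, 0],
    "P3": [2, 0, 0, 1, 2],
    "P4": [2, 0, 0, 1, None],
}
print(average_linkage(genotypes, 2))
```

Running the same function on the variant sets obtained at each quality cut-off and comparing the resulting groups would give a rough impression of how robust the clustering is.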
Any suggestions would be greatly appreciated.
Perhaps they are asking for summary statistics on the key covariates and on the sequencing results; clustering is of course one option. What about metrics such as coverage and other NGS statistics? Sorry, the request is too broad for a precise answer, but I hope this helps.
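As a rough illustration of the kind of per-sample summary statistics I mean, here is a small sketch. The choice of metrics (call rate, variant count, het/hom-alt ratio, Ti/Tv) is my own assumption about what might be useful, and the input encoding and function name are hypothetical. Coverage would have to come from the alignments rather than from the variant calls, so it is not included here.

```python
# Transitions are purine<->purine or pyrimidine<->pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def sample_summary(sites, calls):
    """Summarise one sample.

    sites: list of (ref, alt) tuples for biallelic SNVs.
    calls: alt-allele count per site (0, 1, 2), or None for a missing call.
    """
    called = [(site, g) for site, g in zip(sites, calls) if g is not None]
    variants = [(site, g) for site, g in called if g > 0]
    het = sum(1 for _, g in variants if g == 1)
    hom_alt = sum(1 for _, g in variants if g == 2)
    ti = sum(1 for site, _ in variants if site in TRANSITIONS)
    tv = len(variants) - ti
    return {
        "call_rate": len(called) / len(sites) if sites else None,
        "n_variants": len(variants),
        "het_hom_ratio": het / hom_alt if hom_alt else None,
        "ti_tv": ti / tv if tv else None,
    }

# Hypothetical toy example: five sites, one missing call.
sites = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T"), ("T", "C")]
calls = [1, 2, 1, 0, None]
print(sample_summary(sites, calls))
```

Samples whose values lie far from the rest of the cohort on any of these metrics are worth a closer look before any clustering is interpreted.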