Unsupervised Analysis Methods for DNA Sequencing Data
11.4 years ago
Rainer ▴ 130

Dear all,

We have received several exome sequences from our collaborators for patients with a polygenic disease. Control samples for a supervised analysis are supposed to follow later, but the collaborators have asked whether we can already provide some unsupervised analyses of the patient samples in the meantime. The only idea I have so far is to cluster the samples using variants pre-filtered at different quality cut-offs, to see whether a robust hierarchical or partition-based sample clustering can be obtained, or to look for cluster patterns at the level of pathways. However, I expect these clustering patterns to be inferior to those obtained from standard microarray transcriptomics data, and I wonder whether there are other unsupervised analyses of sequencing data (beyond simple quality checks) that could provide more insight than a sample clustering alone.

Any suggestions would be greatly appreciated.
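
To make the clustering idea above concrete, here is a minimal sketch of how one might check whether sample-to-sample distances stay stable across variant quality cut-offs. It assumes the calls have already been converted to a samples x variants matrix of alternate allele counts (0/1/2) with one quality score per variant. The toy data, the function names `sample_distances` and `distance_agreement`, and the cut-off values are all hypothetical and only serve as illustration.

```python
import numpy as np

def sample_distances(genotypes, qualities, min_qual):
    """Mean absolute genotype difference between all sample pairs,
    using only variants with quality >= min_qual."""
    keep = qualities >= min_qual
    g = genotypes[:, keep].astype(float)
    # Sum |g_i - g_j| over the retained variants for every sample pair.
    diff = np.abs(g[:, None, :] - g[None, :, :]).sum(axis=2)
    return diff / max(int(keep.sum()), 1)

def distance_agreement(d1, d2):
    """Pearson correlation between the upper triangles of two distance matrices."""
    iu = np.triu_indices_from(d1, k=1)
    return float(np.corrcoef(d1[iu], d2[iu])[0, 1])

# Toy data (assumption): 12 samples, 500 variants, two hidden subgroups
# that differ in allele frequency at the first 100 variants.
rng = np.random.default_rng(0)
n_samples, n_variants = 12, 500
base = rng.uniform(0.05, 0.5, n_variants)
freq = np.tile(base, (n_samples, 1))
freq[n_samples // 2:, :100] = np.clip(freq[n_samples // 2:, :100] + 0.3, 0, 1)
genotypes = rng.binomial(2, freq)
qualities = rng.uniform(10, 60, n_variants)

# One distance matrix per (hypothetical) quality cut-off.
dists = {q: sample_distances(genotypes, qualities, q) for q in (10, 30, 50)}
agreement = distance_agreement(dists[10], dists[50])
print(f"agreement between cut-offs 10 and 50: {agreement:.2f}")
```

If the distance matrices agree well across cut-offs, any hierarchical or partition-based clustering built on them is less likely to be an artefact of the filtering choice.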

clustering sequencing dna analysis statistics • 2.5k views

Perhaps they are asking for summary statistics on important features of the covariates and the sequencing results. Clustering is of course one option. What about metrics such as coverage and other NGS statistics? Sorry, the request is too broad to give a precise answer, but I hope this helps.
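
As a rough illustration of such per-sample summary statistics, the sketch below computes call rate and heterozygous/homozygous-alternate counts from a genotype matrix. It assumes genotypes are coded as alternate allele counts (0/1/2) with -1 for missing calls; the function name `per_sample_summary` and the toy matrix are hypothetical. Coverage itself would have to come from the alignment files and is not covered here.

```python
import numpy as np

def per_sample_summary(genotypes):
    """Per-sample summary for a samples x variants matrix.
    Coding assumption: 0/1/2 = alternate allele count, -1 = missing call."""
    n_called = (genotypes >= 0).sum(axis=1)
    n_het = (genotypes == 1).sum(axis=1)
    n_hom_alt = (genotypes == 2).sum(axis=1)
    return {
        "call_rate": n_called / genotypes.shape[1],
        "n_het": n_het,
        "n_hom_alt": n_hom_alt,
        # Guard against division by zero when a sample has no hom-alt calls.
        "het_hom_ratio": n_het / np.maximum(n_hom_alt, 1),
    }

# Toy matrix (assumption): 3 samples, 4 variants.
toy = np.array([
    [0, 1, 2, -1],
    [1, 1, 0, 2],
    [2, 2, 2, 0],
])
summary = per_sample_summary(toy)
for name, values in summary.items():
    print(name, values)
```

Samples with an unusual call rate or het/hom ratio are worth inspecting before any clustering, since they can dominate the result.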

11.4 years ago
Michael 55k

I have seen PCA applied to SNP and GWAS data, for example in the paper by Paschou et al.; there is also the R package SNPRelate. I am not sure this is exactly what you need, but it may at least provide a starting point for further searches.
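
For orientation, a plain PCA on a genotype matrix can be sketched in a few lines of NumPy. This is not the SNPRelate API and not the specific procedure from Paschou et al.; it is a generic SVD-based PCA under the assumption that genotypes are coded as alternate allele counts (0/1/2) without missing values. The function name `genotype_pca` and the toy data are hypothetical.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """SVD-based PCA on a samples x variants matrix of 0/1/2 allele counts.
    Returns the sample scores and the fraction of variance per component."""
    g = genotypes.astype(float)
    # Monomorphic variants carry no information and cannot be standardised.
    sd = g.std(axis=0)
    g = g[:, sd > 0]
    g = (g - g.mean(axis=0)) / g.std(axis=0)
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / (s**2).sum()
    return scores, explained[:n_components]

# Toy data (assumption): 20 samples, 300 variants, two hidden subgroups
# that differ in allele frequency at the first 60 variants.
rng = np.random.default_rng(1)
n_samples, n_variants = 20, 300
freq = np.tile(rng.uniform(0.05, 0.5, n_variants), (n_samples, 1))
freq[n_samples // 2:, :60] = np.clip(freq[n_samples // 2:, :60] + 0.3, 0, 1)
genotypes = rng.binomial(2, freq)

scores, explained = genotype_pca(genotypes)
print(scores[:3])
print(explained)
```

Plotting the first two columns of `scores` against each other is the usual way to look for subgroups or outlier samples.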
