SNP Selection/Filtering
11.4 years ago
TitoPullo ▴ 190

I have a dataset of 2 million SNPs genotyped in 1000 individuals. For each individual, each SNP is coded as 0, 1 or 2 (the number of minor alleles). I'd like to reduce the number of SNPs so that I can use them as features for an SVM. Is it correct to calculate r^2 (r squared) for each pair of SNPs and then keep only the ones with a correlation smaller than 0.8 (I found this value used in the literature as well)?
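
For reference, r^2 between two SNPs in this 0/1/2 coding can be taken as the squared Pearson correlation of the two genotype vectors. A minimal sketch, assuming NumPy is available; the function name `snp_r2` and the toy vectors are made up for illustration, and returning 0.0 for a monomorphic SNP is just a convention chosen here:

```python
import numpy as np

def snp_r2(a, b):
    """Squared Pearson correlation between two SNPs coded as 0/1/2."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # A monomorphic SNP has zero variance, so r^2 is undefined;
    # this sketch returns 0.0 in that case (an arbitrary convention).
    if a.std() == 0 or b.std() == 0:
        return 0.0
    r = np.corrcoef(a, b)[0, 1]
    return float(r * r)

# Toy genotype vectors for six individuals (hypothetical data).
snp_a = [0, 1, 2, 1, 0, 2]
snp_b = [0, 1, 2, 1, 0, 2]   # identical to snp_a
snp_c = [2, 0, 1, 0, 2, 1]   # only weakly related to snp_a

r2_ab = snp_r2(snp_a, snp_b)
r2_ac = snp_r2(snp_a, snp_c)
```

With 2 million SNPs, computing this for every pair is about 2 x 10^12 comparisons, so in practice the comparison would probably have to be restricted to nearby SNPs.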

snp correlation selection filtering • 2.3k views
11.4 years ago
Fabio Marroni ★ 3.0k

I think it's the contrary. Usually, people reduce their dataset by REMOVING SNPs with r^2 > 0.8, i.e. if two SNPs have r^2 > 0.8 you remove one of them.
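
A rough sketch of what this kind of pruning could look like in Python, assuming NumPy and a genotype matrix with individuals in rows and SNPs in columns. The function name `prune_by_r2`, the `window` parameter (how many preceding columns to compare against) and the toy data are assumptions for illustration, not an established implementation:

```python
import numpy as np

def prune_by_r2(genotypes, threshold=0.8, window=50):
    """Greedy pruning: keep a SNP only if its r^2 with every already
    kept SNP within the preceding `window` columns is <= threshold.

    genotypes: array of shape (n_individuals, n_snps), values 0/1/2.
    Returns the list of kept column indices.
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)
    std = centered.std(axis=0)
    kept = []
    for j in range(g.shape[1]):
        if std[j] == 0:
            continue  # monomorphic SNP: no variation, so drop it
        redundant = False
        # Walk back through the kept SNPs, nearest first.
        for i in reversed(kept):
            if j - i > window:
                break  # all remaining kept SNPs are outside the window
            r = (centered[:, i] * centered[:, j]).mean() / (std[i] * std[j])
            if r * r > threshold:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return kept

# Toy data: 6 individuals x 4 SNPs (hypothetical).
toy = np.array([
    [0, 0, 2, 1],
    [1, 1, 0, 1],
    [2, 2, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 2, 1],
    [2, 2, 1, 1],
])
kept_columns = prune_by_r2(toy, threshold=0.8, window=50)
```

Here column 1 duplicates column 0 (r^2 = 1) and column 3 is monomorphic, so only columns 0 and 2 are kept. Which SNP of a correlated pair gets dropped depends on column order in this sketch; limiting comparisons to a window avoids the all-pairs computation over 2 million SNPs.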


You're right, I made a mistake when writing the post! Anyway, is it a biologically sound approach?


I have never used this approach myself, but if you need to reduce the number of SNPs I think it is a reasonable one. This is just my personal opinion, though.


Have you ever faced this problem (reducing the number of SNPs)? If so, which approach did you use?


No, I have never had to reduce the number of SNPs. But I think the most common approach is the one you described, so my suggestion is to try it that way.
