Question

How to know important SNPs?

0

6.9 years ago

khn ▴ 140

Hello,

I have merged two datasets using PLINK. One of the datasets has far fewer SNPs (dataset 1: 470,000 SNPs per chromosome; dataset 2: 170,000 SNPs per chromosome), so I ran the analysis using only the SNPs common to both datasets.

After merging, the results differed from those before merging. Some SNPs that had been statistically significant were no longer significant, and some others became significant. I think this is because merging added some patients, and also because I analyzed only the SNPs common to both datasets.

So can I simply say that the results after merging are fine? Or should I check whether I am missing important SNPs that were statistically significant before merging but are no longer significant afterwards? If so, what should I do?

Thank you in advance!

SNP snp gene • 1.8k views

ADD COMMENT • link 6.9 years ago by khn ▴ 140

1

You haven't told us anything about the number of individuals, the purpose of your analysis, the statistical test you performed and the technology used to obtain the data, which are potentially important parameters. So please elaborate.

ADD REPLY • link 6.9 years ago by WouterDeCoster 48k

0

Thank you for the reply.

Number of individuals: Dataset 1, 2000 (cases and controls); Dataset 2, 5 (cases only)
Purpose of the analysis: case-control analysis
Statistical test performed: Fisher's exact test
Technology used to obtain the data: PLINK

Thank you!
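
For reference, the Fisher test mentioned above can be sketched on a 2x2 table of allele counts for a single SNP. This is a minimal pure-Python illustration, not PLINK's implementation; the function name and all counts are hypothetical, chosen only to show how the p-value for one SNP could be recomputed after five cases (ten alleles) are added.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with the row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance to absorb floating-point ties)
    p = sum(q for q in probs if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical allele counts: (minor, major) in cases, (minor, major) in controls
before = fisher_exact_two_sided(120, 1880, 90, 1910)
# Same SNP after adding 5 cases carrying 2 minor and 8 major alleles (assumed)
after = fisher_exact_two_sided(122, 1888, 90, 1910)
print("p before merge:", before)
print("p after merge: ", after)
```

Running both calls side by side for a SNP of interest shows how much of a change in significance comes from the extra cases alone, separately from any change caused by restricting the SNP set.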

ADD REPLY • link 6.9 years ago by khn ▴ 140

1

Well, then you are removing an awful lot of variants to just add 5 more individuals. I would probably just test in Dataset 1, and then see if those variants were called in Dataset 2.
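
A rough sketch of that check, assuming the significant SNPs from Dataset 1 are available as a list of IDs and Dataset 2 has a PLINK .bim file (whose second column is the variant ID). The helper names and the sample data are hypothetical.

```python
def read_bim_ids(lines):
    """Collect variant IDs (second column) from the lines of a PLINK .bim file."""
    ids = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            ids.add(fields[1])
    return ids


def split_hits_by_presence(hits, other_ids):
    """Split hit IDs into those present in and absent from the other dataset."""
    present = [snp for snp in hits if snp in other_ids]
    absent = [snp for snp in hits if snp not in other_ids]
    return present, absent


# Hypothetical data: two lines of Dataset 2's .bim file and three Dataset 1 hits.
# With a real file, pass an open file handle instead of this list.
dataset2_bim = [
    "1\trs0001\t0\t10500\tA\tG",
    "1\trs0003\t0\t20400\tC\tT",
]
dataset1_hits = ["rs0001", "rs0002", "rs0003"]

present, absent = split_hits_by_presence(dataset1_hits, read_bim_ids(dataset2_bim))
print("called in Dataset 2:    ", present)
print("not called in Dataset 2:", absent)
```

Matching on IDs assumes both datasets name their variants the same way; if they do not, matching on chromosome and position (columns 1 and 4 of the .bim file) may be safer.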

ADD REPLY • link 6.9 years ago by WouterDeCoster 48k

0

Thank you - I will compare the two results! But actually I need to add the five cases...

ADD REPLY • link 6.9 years ago by khn ▴ 140

1

If we have to add those 5 cases, then run your tests with PCs for the merged data.

ADD REPLY • link 6.9 years ago by zx8754 12k

0

Thank you so much. You mean to test with population controls?

ADD REPLY • link 6.9 years ago by khn ▴ 140

1

I meant principal components; here is some good information: In genome wide association studies, what are principal components?
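
To give a rough idea of what a principal component is here, the sketch below computes one PC1 score per sample from a tiny, made-up genotype matrix using power iteration. It is only an illustration under stated assumptions: the function name and data are hypothetical, and for real data one would more likely use PLINK's `--pca` and then pass the resulting components as covariates to a regression-based test, since Fisher's exact test cannot take covariates.

```python
def top_principal_component(genotypes, n_iter=200):
    """Return one PC1 score per sample (unit-length vector, sign arbitrary).

    genotypes: list of per-sample lists of minor-allele counts (0, 1 or 2).
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    # Center each SNP (column) at its mean allele count
    means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
    centered = [[row[j] - means[j] for j in range(n_snps)] for row in genotypes]
    # Sample-by-sample Gram matrix; its leading eigenvector holds the PC1 scores
    gram = [[sum(x * y for x, y in zip(r1, r2)) for r2 in centered] for r1 in centered]
    # Power iteration from a start vector that is not constant across samples
    vec = [float(i + 1) for i in range(n_samples)]
    for _ in range(n_iter):
        new = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = sum(x * x for x in new) ** 0.5
        if norm == 0:
            return [0.0] * n_samples  # no variation in the data
        vec = [x / norm for x in new]
    return vec


# Hypothetical genotypes: rows are samples, columns are SNPs
toy = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [2, 2, 1, 2],
    [2, 1, 2, 2],
]
for sample, score in enumerate(top_principal_component(toy), start=1):
    print(f"sample {sample}: PC1 score {score:+.3f}")
```

Samples with similar genotype profiles get similar scores, so a few cases from a different source that sit apart from the rest on the leading components would be a warning sign when analyzing the merged data.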

ADD REPLY • link 6.9 years ago by zx8754 12k

0

Thank you so much for your reply. Actually, all cases and controls in the datasets have been confirmed to be from the same population. So if the 300,000 SNPs were reported to be very rare in the population, maybe we would not need to care about them, but in fact they include more than just rare variants.

ADD REPLY • link 6.9 years ago by khn ▴ 140