How to know important SNPs?
6.5 years ago
khn ▴ 130

Hello,

I have merged two datasets using plink. One of the datasets has far fewer SNPs than the other (dataset 1: 470,000 SNPs per chromosome; dataset 2: 170,000 SNPs per chromosome), so I ran the analysis using only the SNPs common to both datasets.
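To see exactly which SNPs survive a merge like this, one option is to intersect the variant IDs of the two datasets. Below is a minimal sketch, assuming PLINK `.bim` files (whitespace-separated, variant ID in the second column) and assuming both datasets name their variants the same way. The function names and the toy rows are made up for illustration.

```python
def read_bim_ids(lines):
    """Return the set of variant IDs (second column) from .bim-style lines."""
    ids = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            ids.add(fields[1])
    return ids


def compare_snp_sets(bim1_lines, bim2_lines):
    """Split dataset 1's variants into those shared with dataset 2 and those lost."""
    ids1 = read_bim_ids(bim1_lines)
    ids2 = read_bim_ids(bim2_lines)
    common = ids1 & ids2  # kept when analysing only shared SNPs
    lost = ids1 - ids2    # present in dataset 1 but dropped by the merge
    return common, lost


# Toy .bim rows: chromosome, variant ID, cM position, bp position, allele 1, allele 2
bim1 = ["1 rs1 0 1000 A G", "1 rs2 0 2000 C T", "1 rs3 0 3000 G A"]
bim2 = ["1 rs2 0 2000 C T", "1 rs4 0 4000 T C"]

common, lost = compare_snp_sets(bim1, bim2)
print("common:", sorted(common))
print("lost from dataset 1:", sorted(lost))
```

With real data, open file handles can be passed instead of the toy lists, since the functions only iterate over lines.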

The results after merging differ from the results before merging. Some SNPs that were statistically significant are no longer significant, and some SNPs that were not significant became significant. I think this is because we added some patients when merging, and also because I analyzed only the SNPs common to both datasets.

Can I simply treat the results after merging as fine? Or should I check whether I am missing some important SNPs that were statistically significant before merging but are no longer significant afterwards? If so, what do I need to do?

Thank you in advance!

SNP snp gene • 1.6k views

You haven't told us anything about the number of individuals, the purpose of your analysis, the statistical test you performed and the technology used to obtain the data, which are potentially important parameters. So please elaborate.


Thank you for the reply.

  1. The number of individuals: Dataset 1 has 2000 (cases and controls); Dataset 2 has 5 (cases only).

  2. The purpose of the analysis: case-control analysis.

  3. The statistical test performed: Fisher's exact test.

  4. The technology used to obtain the data: Plink.

Thank you!


Well, then you are removing an awful lot of variants to just add 5 more individuals. I would probably just test in Dataset 1, and then see if those variants were called in Dataset 2.
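That suggestion can be sketched roughly as follows: test every SNP in Dataset 1 alone, then look up the significant ones in Dataset 2. This is only an illustration under stated assumptions: the two-sided Fisher exact test is a small standard-library implementation, the carrier counts and SNP IDs are invented, and the 0.05 threshold is an assumed cut-off with no multiple-testing correction.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of each possible top-left cell, margins fixed
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum over all tables at least as unlikely as the observed one
    p = sum(pr for pr in probs.values() if pr <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


# Hypothetical counts per SNP in dataset 1:
# (case carriers, case non-carriers, control carriers, control non-carriers)
dataset1_counts = {
    "rs1": (30, 10, 10, 30),
    "rs2": (20, 20, 21, 19),
    "rs3": (3, 1, 1, 3),
}
dataset2_ids = {"rs2", "rs3"}  # hypothetical variants called in dataset 2
ALPHA = 0.05                   # assumed threshold, no multiple-testing correction

results = []
for snp, table in dataset1_counts.items():
    p_value = fisher_exact_two_sided(*table)
    if p_value < ALPHA:
        in_dataset2 = snp in dataset2_ids
        results.append((snp, in_dataset2))
        note = "called in dataset 2" if in_dataset2 else "not called in dataset 2"
        print(f"{snp}: p = {p_value:.3g} ({note})")
```

Significant SNPs flagged as not called in Dataset 2 are the ones that would silently disappear if only the common SNPs were analysed.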


Thank you - I will compare the two results! But actually I need to add the five cases...


If we have to add those 5 cases, then run your tests with PCs for the merged data.


Thank you so much. Do you mean testing with population controls?


I meant principal components. Here is a good explanation: "In genome wide association studies, what are principal components?"
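For a concrete picture of what a principal component is here, the sketch below computes the first PC of a tiny genotype matrix (individuals by SNPs, coded 0/1/2) using power iteration in plain Python. The genotypes and the function name are invented for illustration; real analyses would use a dedicated tool on the full merged data, and the PCs would normally enter a regression-based test as covariates, since a Fisher exact test cannot take covariates.

```python
def top_principal_component(genotypes, n_iter=200):
    """Return each individual's loading on the first principal component."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Center each SNP (column) at its mean genotype
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Similarity (Gram) matrix between individuals
    k = [[sum(x[i][s] * x[j][s] for s in range(m)) for j in range(n)]
         for i in range(n)]
    # Power iteration converges to the leading eigenvector of k
    v = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        w = [sum(k[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(val * val for val in w) ** 0.5
        if norm == 0:
            break  # no genetic variation at all
        v = [val / norm for val in w]
    return v


# Hypothetical genotypes: the first three and last three individuals
# come from two genetically distinct groups
genotypes = [
    [0, 0, 2, 2, 0],
    [0, 1, 2, 2, 0],
    [0, 0, 2, 1, 0],
    [2, 2, 0, 0, 0],
    [2, 1, 0, 0, 1],
    [2, 2, 0, 1, 0],
]

pc1 = top_principal_component(genotypes)
for idx, value in enumerate(pc1):
    print(f"individual {idx}: PC1 = {value:+.3f}")
```

In this toy example the sign of PC1 separates the two groups, which is the kind of structure that PCs are meant to capture in merged data.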


Thank you so much for your reply. Actually, all cases and controls in the datasets have been confirmed to belong to the same population. If the 300,000 SNPs missing from dataset 2 were all reported to be very rare in the population, maybe we would not need to worry about them, but they include more than just rare variants.
