How does the "missing" column (3rd column) of the SNPTEST sample file affect the results?
10.6 years ago

I have a question about the "missing" column (the 3rd column) of the sample file used by SNPTEST.

The SNPTEST webpage says:

The sample file has three parts (a) a header line detailing the names of the columns in the file, (b) a line detailing the types of variables stored in each column, and (c) a line for each individual detailing the information for that individual. Here is an example of the start of a sample file for reference

ID_1 ID_2 missing cov_1 cov_2 cov_3 cov_4 pheno1 bin1
0 0 0 D D C C P B
1 1 0.007 1 2 0.0019 -0.008 1.233 1
2 2 0.009 1 2 0.0022 -0.001 6.234 0
3 3 0.005 1 2 0.0025 0.0028 6.121 1
4 4 0.007 2 1 0.0017 -0.011 3.234 1
5 5 0.004 3 2 -0.012 0.0236 2.786 0
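To make the three-part layout concrete, here is a minimal Python sketch that reads a sample file like the one above and pulls out the "missing" column. The function name `parse_sample_file` is hypothetical, and the check that the first three columns are ID_1, ID_2 and missing is an assumption based on the example shown; this is not part of SNPTEST itself.

```python
import io

def parse_sample_file(handle):
    """Split a SNPTEST-style sample file into header names, column types, and rows.

    Hypothetical helper for illustration; assumes whitespace-separated columns.
    """
    lines = [ln.split() for ln in handle if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a column-type line")
    names, types, records = lines[0], lines[1], lines[2:]
    # Assumption: the first three columns are ID_1, ID_2, missing, as in the example.
    if names[:3] != ["ID_1", "ID_2", "missing"]:
        raise ValueError("first three columns should be ID_1 ID_2 missing")
    if len(types) != len(names):
        raise ValueError("type line and header line have different lengths")
    rows = []
    for rec in records:
        if len(rec) != len(names):
            raise ValueError("row has wrong number of fields: %r" % (rec,))
        rows.append(dict(zip(names, rec)))
    return names, types, rows

EXAMPLE = """\
ID_1 ID_2 missing cov_1 cov_2 cov_3 cov_4 pheno1 bin1
0 0 0 D D C C P B
1 1 0.007 1 2 0.0019 -0.008 1.233 1
2 2 0.009 1 2 0.0022 -0.001 6.234 0
3 3 0.005 1 2 0.0025 0.0028 6.121 1
4 4 0.007 2 1 0.0017 -0.011 3.234 1
5 5 0.004 3 2 -0.012 0.0236 2.786 0
"""

names, types, rows = parse_sample_file(io.StringIO(EXAMPLE))
missing = [float(r["missing"]) for r in rows]  # one value per individual
print(missing)
```

The second line of the file (the `0 0 0 D D C C P B` row) is returned separately as `types`, so it is never mistaken for an individual.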

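The "missing" column holds the proportion of missing genotype data for each sample (that is, one minus the sample call rate), computed over some set of SNPs.

How does "missing" affect the association results?

When handling large datasets, you often split the data by chromosome into 22 files, and each sample's missing value then varies from chromosome to chromosome.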
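If a single value per sample is wanted despite the per-chromosome split, one option is to pool the counts rather than average the per-chromosome proportions. The sketch below assumes you already have, for each sample, the number of missing genotypes and the number of SNPs on each chromosome; the function name `combine_missing` and the example counts are hypothetical. Whether SNPTEST expects a genome-wide or per-chromosome value is not stated in this thread, so treat this only as arithmetic for combining the numbers.

```python
def combine_missing(per_chrom_counts):
    """Pool per-chromosome counts into one missing proportion for a sample.

    per_chrom_counts: iterable of (n_missing_genotypes, n_snps) pairs,
    one pair per chromosome. Hypothetical helper for illustration.
    """
    total_missing = 0
    total_snps = 0
    for n_missing, n_snps in per_chrom_counts:
        if n_missing < 0 or n_snps < 0 or n_missing > n_snps:
            raise ValueError("invalid counts: %r" % ((n_missing, n_snps),))
        total_missing += n_missing
        total_snps += n_snps
    if total_snps == 0:
        raise ValueError("no SNPs supplied")
    return total_missing / total_snps

# Made-up counts for one sample on two chromosomes:
# 7 of 1000 genotypes missing on the first, 1 of 500 on the second.
pooled = combine_missing([(7, 1000), (1, 500)])
naive_mean = (7 / 1000 + 1 / 500) / 2
print(pooled, naive_mean)
```

Pooling weights each chromosome by its SNP count, so the result differs from the plain mean of the per-chromosome proportions whenever the chromosomes carry different numbers of SNPs.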

If "missing" does affect results, what should we use?

If "missing" does not affect the results, why does SNPTEST require it for the analysis?

sample-file snptest missing imputation gwas • 3.3k views

Hi, did you manage to calculate this? I don't know how to calculate the "missing" values when creating a sample file.
