3.0 years ago
RocketLeague
GWAS: I only have information on a single allele in every sample. Example: at position 657, sample 1 has T, sample 2 has C, sample 3 has T, and so on. To conduct a GWAS, should I treat each call as homozygous? E.g. C would be treated as CC at position 657 in sample 2, and T as TT in samples 1 and 3.
How should I approach analyzing this kind of data? So far I have substituted homozygous genotypes (C as CC) and then run a normal GWAS. If anyone has alternative ideas, I would be grateful.
Why do you only have a single allele for each SNP? You can't just make up the other allele by guessing what it is. There's not much point going ahead unless you have the full data; your results will be bunk. Go back to whoever gave you this data and ask for both alleles.
What if it was a haploid organism?
You can use PLINK for haploid samples. Use
--chr-set -X
where X is the number of chromosomes of your haploid organism (the negative sign tells PLINK that the chromosome set is haploid). As for doing the regression: pick one of the alleles as the reference (e.g. T), then code each sample as carrying either the reference allele (0) or the alternative allele (1 for C). You can then regress the phenotype of interest on that 0/1 encoding.
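To make the 0/1 coding concrete, here is a minimal Python sketch for a single SNP. It is only an illustration, not what PLINK does internally: the allele calls and phenotype values are made up, and `encode_haploid` and `simple_linear_regression` are hypothetical helper names. It assumes a quantitative phenotype and a plain least-squares fit with no covariates.

```python
from statistics import mean


def encode_haploid(alleles, ref):
    """Code each haploid call as 0 (reference allele) or 1 (any other allele)."""
    return [0 if allele == ref else 1 for allele in alleles]


def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # All samples carry the same allele, so no effect can be estimated.
        raise ValueError("SNP is monomorphic in these samples")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up example data for one SNP (hypothetical values, not from the thread).
alleles = ["T", "C", "T", "C", "T", "C"]
phenotype = [1.0, 2.0, 1.5, 2.5, 1.25, 2.25]

genotype = encode_haploid(alleles, ref="T")
intercept, slope = simple_linear_regression(genotype, phenotype)
print(f"genotype coding: {genotype}")
print(f"intercept = {intercept}, effect of C vs T = {slope}")
```

With a 0/1 predictor the slope is simply the difference between the mean phenotype of the C carriers and that of the T carriers. A real analysis would also need a test statistic and some correction for population structure, which this sketch leaves out.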
Is it haploid or diploid? If it's haploid, then you're fine and Sam's comment will help you. If it's diploid, then you will have problems.