SNP association with a different data format
8.1 years ago
forever ▴ 80

Hi everyone, I have SNP data in the format below and need to run a SNP association analysis. I do not know how to use an R package or PLINK with this data format.

SNP Name          Sample ID    Allele1 - Top    Allele2 - Top    GC Score    SNP
chr1:109457160    2            C                C                0.8609      [T/G]
chr1:109457233    2            C                C                0.7725      [T/G]

I have little experience converting data files, so I would appreciate your reply.

snp • 1.5k views

I assume this is array data (although it would help if you gave more detail in your question). First, you should know which genome assembly (build) the positions come from. For every position, find the major and minor allele and encode the genotypes as 0, 1, or 2: 0 for homozygous major allele, 1 for heterozygous, 2 for homozygous minor allele. Then combine the different samples into one larger file for PLINK association analysis.
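The 0/1/2 encoding described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `encode_genotypes` and the `MISSING` set of no-call symbols are hypothetical, the minor allele is estimated from the samples supplied, and each SNP is assumed to be biallelic.

```python
from collections import Counter

# Assumed no-call symbols; adjust to whatever the export actually uses.
MISSING = {"-", "0", ""}

def encode_genotypes(genotypes):
    """Encode one SNP's genotypes as minor-allele counts.

    genotypes: list of (allele1, allele2) tuples, one per sample.
    Returns (minor_allele, codes), where each code is 0, 1, 2, or None
    for a missing call. Assumes the SNP is biallelic.
    """
    # Count every called allele across all samples.
    counts = Counter(a for pair in genotypes for a in pair if a not in MISSING)
    # Rank alleles by frequency (ties broken alphabetically for determinism).
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    minor = ranked[1] if len(ranked) > 1 else None  # None if monomorphic

    codes = []
    for a1, a2 in genotypes:
        if a1 in MISSING or a2 in MISSING:
            codes.append(None)
        else:
            # Number of copies of the minor allele: 0, 1, or 2.
            codes.append((a1 == minor) + (a2 == minor))
    return minor, codes

example = [("C", "C"), ("C", "T"), ("T", "T"), ("C", "C"), ("-", "-")]
print(encode_genotypes(example))
```

Note that PLINK does this recoding itself when given allele calls, so a manual step like this is mainly useful for R-based workflows that expect a numeric genotype matrix.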


Thank you for your reply. Actually, it is a SNP association study. The data is extracted from Illumina. PLINK can read this layout as a long-format fileset, but I need to create the map and fam files. So do I need lgen, fam, and map files to use PLINK? The map file should contain the chromosome and position of every SNP.
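A minimal sketch of building the three long-format files from rows like those in the question might look like this. Several things here are assumptions rather than facts from the thread: the helper names `parse_snp_name` and `long_to_plink` are hypothetical, the SNP names are assumed to always follow the `chr<N>:<position>` pattern seen in the example, the sample ID is reused as the family ID, `-` is assumed to be the no-call symbol, and sex and phenotype are written as missing (`0` and `-9`) because the export does not contain them.

```python
# Assumed no-call symbols in the export; PLINK expects "0" for a missing allele.
NO_CALL = {"-", ""}

def parse_snp_name(name):
    """Split a name like 'chr1:109457160' into ('1', 109457160)."""
    chrom, pos = name.split(":")
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return chrom, int(pos)

def long_to_plink(rows):
    """Build LGEN, MAP, and FAM lines from long-format genotype rows.

    rows: iterable of (snp_name, sample_id, allele1, allele2) tuples.
    Returns three lists of text lines: (lgen, map_lines, fam_lines).
    """
    lgen, snps, samples = [], {}, {}
    for snp, sample, a1, a2 in rows:
        if a1 in NO_CALL or a2 in NO_CALL:
            a1 = a2 = "0"
        # LGEN columns: family ID, individual ID, SNP ID, allele 1, allele 2.
        lgen.append(f"{sample} {sample} {snp} {a1} {a2}")
        snps.setdefault(snp, parse_snp_name(snp))
        samples.setdefault(sample, None)
    # MAP columns: chromosome, SNP ID, genetic distance (0 = unknown), bp position.
    map_lines = [f"{c} {snp} 0 {p}" for snp, (c, p) in snps.items()]
    # FAM columns: FID, IID, father, mother, sex (0 = unknown), phenotype (-9 = missing).
    fam_lines = [f"{s} {s} 0 0 0 -9" for s in samples]
    return lgen, map_lines, fam_lines

rows = [
    ("chr1:109457160", "2", "C", "C"),
    ("chr1:109457233", "2", "C", "C"),
]
for lines in long_to_plink(rows):
    print("\n".join(lines))
    print()
```

Writing the three lists to files that share a prefix (for example `study.lgen`, `study.map`, `study.fam`, a hypothetical name) lets PLINK load them with `--lfile study`. The phenotype column in the fam file has to be filled with real values before any association test is meaningful.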


Please use ADD REPLY to respond to earlier comments. "Data is extracted from Illumina" is not informative on its own: Illumina is a company with several technology platforms, including sequencing and arrays.
