Hello,
I have two sets of SNP data that were aligned, one in AGCT format and one in A/B format. For the A/B set I know the alleles. However, the allele designations differ between the two formats. Something like this:
marker chr pos alleles_set1 snp1_set1 snp2_set1 snp3_set1 alleles_set2 snp1_set2 snp2_set2 snp3_set2
m1 1 0 A/G G A A A/G A A B
m2 1 0 A/G A A G T/C A B B
m3 1 0 G/C G G C C/G A A B
I need to produce a HapMap file to run association analyses.
So, my questions are: 1) How do I convert set2 from A/B format to AGCT? 2) When the alleles differ between the two sets, as in marker 2 (m2), how should I treat the data? Especially C/G versus G/C (m3)?
Hi, how did you solve the problem?
I ran a file through PLINK in AB format and am now trying to do imputation with Beagle 4, but Beagle 4 requires ACGT coding instead of AB. I would like to convert the AB calls to ACGT. Your help would be highly appreciated.
Hi, do you have information about the alleles? If yes, you can apply the rules above, but you will also need to know the TOP/BOT information for the allelic variants. I had to request an additional file from the company. Hope this helps. Here is a link you might find helpful as well: http://www.illumina.com/documents/products/technotes/technote_topbot.pdf
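For illustration, here is a minimal Python sketch of how the conversion could look once the alleles are known. It is only a sketch under assumptions that are not confirmed anywhere in this thread: that A corresponds to the first allele listed for the marker and B to the second, and that a pair such as A/G versus T/C differs only by strand. The function names are made up for this example. For real Illumina data the A/B assignment follows the TOP/BOT rules, so check it against the manifest file before trusting the output.

```python
# Hypothetical helpers, not part of PLINK, Beagle, or any Illumina tool.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def ab_to_nucleotide(call, alleles):
    """Translate one A/B call using an 'X/Y' allele string.

    ASSUMPTION: A = first listed allele, B = second listed allele.
    Anything other than 'A' or 'B' is returned as 'N' (missing).
    """
    first, second = alleles.split("/")
    return {"A": first, "B": second}.get(call, "N")


def classify_alleles(alleles_set1, alleles_set2):
    """Compare the allele pairs of one marker between the two sets."""
    a = set(alleles_set1.split("/"))
    b = set(alleles_set2.split("/"))
    # A/T and C/G SNPs are their own complement, so the strand cannot
    # be inferred from the allele letters alone.
    if a == {COMPLEMENT[x] for x in a}:
        return "ambiguous"
    if a == b:
        return "same"
    if a == {COMPLEMENT[x] for x in b}:
        return "flipped"
    return "mismatch"


def convert_marker(calls, alleles_set1, alleles_set2):
    """Return (status, nucleotide calls on the set1 strand or None)."""
    status = classify_alleles(alleles_set1, alleles_set2)
    if status in ("ambiguous", "mismatch"):
        return status, None  # needs TOP/BOT or manifest information
    bases = [ab_to_nucleotide(c, alleles_set2) for c in calls]
    if status == "flipped":
        bases = [COMPLEMENT.get(x, "N") for x in bases]
    return status, bases


if __name__ == "__main__":
    # Rows taken from the example table in the question:
    # (marker, alleles_set1, alleles_set2, set2 calls)
    rows = [
        ("m1", "A/G", "A/G", ["A", "A", "B"]),
        ("m2", "A/G", "T/C", ["A", "B", "B"]),
        ("m3", "G/C", "C/G", ["A", "A", "B"]),
    ]
    for marker, set1, set2, calls in rows:
        print(marker, *convert_marker(calls, set1, set2))
```

With the example rows, m1 is classified as the same allele pair, m2 as a strand flip (so its calls are complemented onto the set1 strand), and m3 as ambiguous. The sketch deliberately refuses to convert markers like m3, because for C/G and A/T SNPs the letters alone cannot tell you which strand was reported; that is where the TOP/BOT file is needed.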
Hi, sorry for the late reply. I ended up using the old Beagle version, which accepts the AB format. Thanks for the file; I already had it.
Hi Malomane, this is probably not related to the topic, but do you know how the genotype input file for Beagle should be ordered for unphased trio data when a parent has many offspring? Is it
Male1, Female1, Off1, Male1, Female1, Off2
or simply Male1, Female1, Off1, Off2, Off3, ...?
Thanks.