Inconsistent bim files after converting via PLINK plain-text format
4.8 years ago
yiqiangz ▴ 10

Hi there,

I was trying to split a VCF into two sub-files and convert both to PLINK binary format, but I ran into a problem after the conversion.

Here is what I did.

I first split the VCF file with this awk command:

awk 'BEGIN{while(getline<"part1.id")list[$0]=1}NR==1,/#CHROM/{if(x!="")print x;x=$0}/#CHROM/,EOF{printf $1;for(n=2;n<=9;n++){printf ("\t"$n)};for(m=10;m<=NF;m++){if(list[$m]||listA[m]){listA[m]=1;printf ("\t"$m)}}print ""}' all.vcf > part1.vcf

part1.id lists a subset of the sample IDs in the VCF file. part2.id and part2.vcf were produced the same way.

For the split VCFs (part1.vcf and part2.vcf), I used vcftools version 0.1.13 to convert them to plain-text PLINK format, e.g. "vcftools --vcf part1.vcf --plink --out part1", and then PLINK version 1.9 to convert them to PLINK binary format, e.g. "plink --file part1 --make-bed --out part1".

The whole process ran smoothly with no errors. However, the two bim files generated by PLINK 1.9 differ: for some variants, the alleles in the last two columns are switched, even though part1 and part2 were extracted from the same VCF file. For example:

part1.bim: 7 AX-272507051 0 99933869 G A
part2.bim: 7 AX-272507051 0 99933869 A G
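To see how many variants are affected, the two bim files can be compared by variant ID. The following is a minimal sketch, assuming the standard six whitespace-separated bim columns (chromosome, variant ID, cM position, bp position, allele 1, allele 2); the function name swapped_alleles is hypothetical.

```shell
# List variants whose two allele columns are swapped between two .bim files.
# Assumes six whitespace-separated columns per line:
# chromosome, variant ID, cM position, bp position, allele 1, allele 2.
swapped_alleles() {
    awk '
        # First file: remember both alleles, keyed by variant ID.
        NR == FNR { a1[$2] = $5; a2[$2] = $6; next }
        # Second file: report IDs whose alleles appear in the opposite order.
        ($2 in a1) && $5 != $6 && a1[$2] == $6 && a2[$2] == $5 {
            print $2, a1[$2] "/" a2[$2], $5 "/" $6
        }
    ' "$1" "$2"
}

# Hypothetical usage with the file names from this post:
# swapped_alleles part1.bim part2.bim
```

Each output line gives the variant ID, then the allele pair from the first file, then the pair from the second file.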

Has anybody experienced this? I'm not sure whether the problem is caused by vcftools or PLINK. How can I check the bed file?
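One way to inspect a bed file directly is to decode its 2-bit genotype codes. The sketch below assumes the variant-major PLINK 1 binary layout: three magic bytes (0x6c 0x1b 0x01), then ceil(N/4) bytes per variant for N samples, with the low-order bits holding the first sample and the codes 00 = homozygous for the first bim allele, 01 = missing, 10 = heterozygous, 11 = homozygous for the second bim allele. The function name bed_genotype_counts is hypothetical, and the sketch assumes a non-empty fam file.

```shell
# Print per-variant genotype counts from a variant-major PLINK 1 fileset.
# Argument: the fileset prefix (reads prefix.bed, prefix.bim, prefix.fam).
bed_genotype_counts() {
    prefix=$1
    n=$(wc -l < "$prefix.fam")     # number of samples = lines in the .fam file
    # od prints every byte of the .bed file as an unsigned decimal number.
    od -An -v -tu1 "$prefix.bed" |
    awk -v n="$n" -v bim="$prefix.bim" '
        BEGIN { n += 0; bpv = int((n + 3) / 4) }     # bytes per variant
        {
            for (i = 1; i <= NF; i++) {
                pos++
                if (pos <= 3) {                      # magic bytes 108 27 1
                    magic = magic $i " "
                    if (pos == 3 && magic != "108 27 1 ") {
                        print "unexpected magic bytes: " magic | "cat 1>&2"
                        exit 1
                    }
                    continue
                }
                k = (pos - 4) % bpv                  # byte index within the variant
                if (k == 0) c[0] = c[1] = c[2] = c[3] = 0
                b = $i
                for (j = 0; j < 4; j++) {            # four 2-bit codes per byte
                    if (4 * k + j < n) c[b % 4]++    # ignore padding bits
                    b = int(b / 4)
                }
                if (k == bpv - 1) {                  # last byte of this variant
                    getline line < bim
                    split(line, f, " ")
                    print f[2], f[5], f[6], "homA1=" c[0], "missing=" c[1], "het=" c[2], "homA2=" c[3]
                }
            }
        }
    '
}

# Hypothetical usage with the file names from this post:
# bed_genotype_counts part1
```

Because the codes are defined relative to the two allele columns of the bim file, a variant with swapped alleles in the bim is still consistent as long as the homozygous counts are swapped in the same way.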

Thanks,

4.8 years ago

See https://www.cog-genomics.org/plink/1.9/data#ax_allele . You should use plink 2.0 when you need to preserve REF/ALT allele order.
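One way to follow this advice is to let PLINK 2.0 read each VCF directly, which also skips the vcftools plain-text step. This is a minimal sketch, assuming the PLINK 2.0 binary is installed under the name plink2; the wrapper name vcf_to_bed is hypothetical, and the file names are the ones from the question.

```shell
# Convert a VCF straight to a PLINK binary fileset with PLINK 2.0.
# Assumes the PLINK 2.0 executable is available on PATH as "plink2".
# Arguments: input VCF path, output fileset prefix.
vcf_to_bed() {
    plink2 --vcf "$1" --make-bed --out "$2"
}

# Hypothetical usage with the file names from the question:
# vcf_to_bed part1.vcf part1
# vcf_to_bed part2.vcf part2
```

With the same REF/ALT alleles in both input VCFs, the two resulting bim files should then list the alleles in the same order, which can be confirmed by comparing them line by line.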

4.8 years ago
yiqiangz ▴ 10

Thanks, plink 2.0 works just fine. The fam files produced by plink 1.9 and plink 2.0 differ in column 1, but both work for subsequent analysis.
