plink2 cannot make bed file
2.4 years ago
dec986 ▴ 380

Hello,

I'm trying to make a PLINK binary fileset with plink2, following the advice in "Converting VCF to PLINK .bed binary fileset to check for pedigree errors with KING: How do conversion tools make the PLINK .fam file, without asking for family relationships a priori?"

So I ran the command plink2 --vcf 56001801066929_WGZ.snp.vcf.gz --make-bed --out ex and got the following output:

PLINK v2.00a3.1LM 64-bit Intel (19 May 2022)   www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ex.log.
Options in effect:
  --make-bed
  --out ex
  --vcf 56001801066929_WGZ.snp.vcf.gz

Start time: Wed Jul 20 10:22:04 2022
70358 MiB RAM detected; reserving 35179 MiB for main workspace.
Using up to 10 threads (change this with --threads).
--vcf: 3499678 variants scanned.
--vcf: ex-temporary.pgen + ex-temporary.pvar.zst + ex-temporary.psam written.
1 sample (0 females, 0 males, 1 ambiguous; 1 founder) loaded from
ex-temporary.psam.
3499678 variants loaded from ex-temporary.pvar.zst.
Note: No phenotype data present.
Writing ex.fam ... done.
Writing ex.bim ... 
Error: ex.bim cannot contain multiallelic variants.
End time: Wed Jul 20 10:22:07 2022

This creates only ex.log and ex.fam, so there is no .bed output. I can make the file with plink 1.9, but there I run into other problems (a "split chromosome" error after sorting with bcftools), which is why I'm trying plink2.

How can I make a bed file with plink2?


The error message from plink2 says that your VCF contains multiallelic variants, which a .bim file cannot represent. Have you tried filtering your VCF to keep only biallelic SNPs?

See this discussion for one way to do that:

how to remove multiallelic from VCF
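
For illustration, the filtering step could look roughly like the sketch below. It assumes bcftools is installed; keep_biallelic_snps is a hypothetical helper name, and the file names passed to it are placeholders, not anything from the original post.

```shell
# Sketch only: keep biallelic SNPs from a VCF (assumes bcftools is on PATH).
# keep_biallelic_snps is a hypothetical helper name, not part of bcftools.
keep_biallelic_snps() {
  in_vcf="$1"    # input VCF, e.g. the poster's .snp.vcf.gz
  out_vcf="$2"   # output VCF (placeholder name chosen by the caller)
  # -m2 -M2: require exactly two alleles (REF plus one ALT)
  # -v snps: keep SNP records only
  # -Oz -o:  write bgzip-compressed VCF to the given path
  bcftools view -m2 -M2 -v snps -Oz -o "$out_vcf" "$in_vcf"
}
```

A call such as keep_biallelic_snps 56001801066929_WGZ.snp.vcf.gz biallelic.vcf.gz, followed by plink2 --vcf biallelic.vcf.gz --make-bed --out ex, should then get past the .bim error. Note that this discards every multiallelic site.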


If retaining all the called alleles is crucial for your work, use bcftools norm (https://samtools.github.io/bcftools/bcftools.html#norm) to split multiallelic sites into multiple biallelic records. Then run plink2 --make-bed with --set-all-var-ids to name all of your variants. Note that the template @:# uses only chromosome and position, so records split from the same site would share an ID; a template that also includes the alleles, such as @:#:$r:$a, distinguishes multiple alleles at the same position.
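
A minimal sketch of that two-step approach, assuming bcftools and plink2 are both on PATH. split_and_convert is a hypothetical helper name and the intermediate file name is a placeholder. The ID template here includes the REF and ALT alleles ($r and $a), which is my assumption about what is needed, since records split from one site share the same chromosome and position.

```shell
# Sketch only: split multiallelic sites, then build a .bed fileset.
# split_and_convert is a hypothetical helper name.
split_and_convert() {
  in_vcf="$1"      # original VCF
  split_vcf="$2"   # intermediate VCF with one ALT allele per record
  out_prefix="$3"  # plink2 output prefix, e.g. ex
  # norm -m -any: split multiallelic records of any type into biallelic ones
  bcftools norm -m -any -Oz -o "$split_vcf" "$in_vcf" &&
  # '@:#:$r:$a' expands to chrom:pos:ref:alt; single quotes stop the shell
  # from expanding $r and $a before plink2 sees them
  plink2 --vcf "$split_vcf" --set-all-var-ids '@:#:$r:$a' \
         --make-bed --out "$out_prefix"
}
```

Example call: split_and_convert 56001801066929_WGZ.snp.vcf.gz split.vcf.gz ex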

2.4 years ago

Answered on plink2-users: https://groups.google.com/g/plink2-users/c/IOUxdtZqFOU/m/dbrCCJ2HBQAJ

Summary: given what you're trying to do, you should be creating a .pgen file, not a .bed file. (There's a narrower direct answer to your question involving the --max-alleles flag, but it would result in unnecessarily throwing away your multiallelic variants when you could keep them with a .pgen.)
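
The two options mentioned in the summary might look like the sketch below. The function names are hypothetical, and the exact commands are my reading of the summary rather than a copy of the linked plink2-users answer.

```shell
# Sketch only: both helper names are hypothetical.

# Option 1: write a .pgen/.pvar/.psam fileset, which can hold multiallelic variants.
vcf_to_pgen() {
  plink2 --vcf "$1" --make-pgen --out "$2"
}

# Option 2: the narrower answer. Drop variants with more than two alleles
# so that plink2 can write a .bed/.bim/.fam fileset.
vcf_to_bed_biallelic_only() {
  plink2 --vcf "$1" --max-alleles 2 --make-bed --out "$2"
}
```

For example, vcf_to_pgen 56001801066929_WGZ.snp.vcf.gz ex would produce ex.pgen, ex.pvar and ex.psam, while vcf_to_bed_biallelic_only with the same arguments would produce ex.bed, ex.bim and ex.fam without the multiallelic variants.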
