Question

Is there a way to force a specific order for variants in a PLINK file that have the same chromosome and bp position?


5.0 years ago

curious ▴ 820

I have a VCF that I am reading into PLINK.

This VCF has many variants that are redundant in position (same chrom + base pair position).

I read this VCF into PLINK to do some data manipulations, then convert PLINK to VCF using the internal PLINK functionality.

I am noticing that variants with redundant positions are not always written to the PLINK-produced VCF in the same order as in the original VCF (I am talking about the order of the variants themselves, not the flipping of alleles).

For example:

The order in the original VCF:

variant A (redundant position with variant B)

variant B (redundant position with variant A)

might be written out from PLINK in the reverse order:

variant B (redundant position with variant A)

variant A (redundant position with variant B)

All the variants with unique positions seem fine.

Is there a way to force a variant order for these redundant variants by using a reference file, or is there a way to have PLINK output a file that lists variants in the exact order in which they will be written to a VCF? I thought about using the .bim file as a reference for the order that PLINK will use, but I am not sure if that is accurate. Thank you.

plink vcf • 2.4k views

updated 5.0 years ago by chrchang523 11k • written 5.0 years ago by curious ▴ 820

score 0 · Answer 1 · 2020-01-01


5.0 years ago

chrchang523 11k

(edit: oops, this answers the wrong question. Leaving this up to provide context for the OP’s response.)

With plink 1.9, --keep-allele-order preserves allele order in the current run, and --a2-allele can be used to re-import allele order.

plink 2.0 defaults to preserving allele order.

5.0 years ago by chrchang523 11k


But this is for the major/minor allele order within a single variant, correct? I am looking for the order of the variants themselves (i.e., the order of rsIDs).

5.0 years ago by curious ▴ 820


Yeah, I misread your question.

One way to enforce a specific ID order is:

1. Create a temporary plink fileset (prefix temp) with one sample (give it a new sample ID) and the desired variant ID order. All genotypes can be missing.
2. plink --bfile temp --bmerge <real fileset> --out merged
3. plink --bfile merged --remove temp.fam --make-bed --out sorted

This should work because --bmerge uses the variant ID order in the base fileset when possible.
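
For the first step, here is a rough sketch of one way to build the one-sample temporary fileset from the original VCF. It is not from the answer above: the helper names (make_order_fileset, same_variant_order), the ORDER_DUMMY sample ID, and the file names are all made up for illustration. It assumes the VCF ID column holds unique IDs (not ".") that match the variant IDs in the real fileset, and that the VCF is already sorted by position so that plink has no reason to re-sort the .map on import.

```shell
# Hypothetical helper: write <prefix>.map and <prefix>.ped for a single dummy
# sample, with variants listed in the same order as the data lines of the VCF.
make_order_fileset() {
  vcf="$1"
  prefix="$2"
  # .map columns: chromosome, variant ID, genetic distance (0), bp position
  awk -F'\t' 'BEGIN { OFS = "\t" } !/^#/ { print $1, $3, 0, $2 }' "$vcf" > "${prefix}.map"
  # .ped: FID IID PAT MAT SEX PHENO, then two missing alleles ("0 0") per variant
  awk 'END {
    printf "ORDER_DUMMY ORDER_DUMMY 0 0 0 -9"
    for (i = 0; i < NR; i++) printf " 0 0"
    printf "\n"
  }' "${prefix}.map" > "${prefix}.ped"
}

# Hypothetical check: succeeds if the variant IDs in a .bim file (column 2)
# appear in the same order as the IDs in the VCF (column 3).
same_variant_order() {
  vcf="$1"
  bim="$2"
  [ "$(awk -F'\t' '!/^#/ { print $3 }' "$vcf")" = "$(awk '{ print $2 }' "$bim")" ]
}

# Intended use (not run here; requires plink 1.9 on PATH):
#   make_order_fileset original.vcf temp_text
#   plink --file temp_text --make-bed --out temp
#   plink --bfile temp --bmerge <real fileset> --out merged
#   plink --bfile merged --remove temp.fam --make-bed --out sorted
#   same_variant_order original.vcf sorted.bim && echo "order matches"
```

The last check only compares ID sequences, so it is worth running on sorted.bim before converting back to VCF. If plink reports that it re-sorted the temporary fileset on import, the order in temp.bim should be inspected before merging.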