Is there a way to force a specific order for variants in a PLINK file that have the same chromosome and bp position?
5.0 years ago
curious ▴ 820

I have a VCF that I am reading into PLINK.

This VCF has many variants that are redundant in position (same chrom + base pair position).

I read this VCF into PLINK to do some data manipulations, then convert PLINK to VCF using the internal PLINK functionality.

I am noticing that variants with redundant positions are not always written to the PLINK-produced VCF in the same order as in the original VCF (I'm talking about the order of the variants themselves, not the flipping of alleles).

i.e.

The order in the original VCF:

variant A (redundant position with variant B)

variant B (redundant position with variant A)

might be written out from PLINK in the reverse order:

variant B (redundant position with variant A)

variant A (redundant position with variant B)

All the variants with unique positions seem fine.

Is there a way to force a variant order for these redundant variants by using a reference file, or is there a way to have PLINK output a file that lists variants in the exact order in which they will be written to a VCF? I thought about using the .bim file as a reference for the order that PLINK will use, but I am not sure if that is accurate. Thank you.

plink vcf • 2.4k views
5.0 years ago

(edit: oops, this answers the wrong question. Leaving this up to provide context for the OP’s response.)

With plink 1.9, --keep-allele-order preserves allele order in the current run, and --a2-allele can be used to re-import allele order.

plink 2.0 defaults to preserving allele order.


But this is for the major/minor allele order of a single variant, correct? I am looking for the order of the variants themselves (i.e. the order of rsIDs).


Yeah, I misread your question.

One way to enforce a specific ID order is:

  1. Create a temporary plink fileset with one sample (give it a new sample ID) and the desired variant ID order. All genotypes can be missing.

  2. plink --bfile temp --bmerge <real fileset> --out merged

  3. plink --bfile merged --remove temp.fam --make-bed --out sorted

This should work because --bmerge uses the variant ID order in the base fileset when possible.
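
For step 1, one possible way to build the temporary fileset is to write its three files directly. This is only a sketch under assumptions that are not from the thread: the function name `make_order_template` and all file names are hypothetical, variant IDs are assumed to be unique, and the ID list is assumed to be already sorted by chromosome and position apart from the chosen order among same-position variants. It relies on the PLINK 1 .bed layout (three magic bytes, then one byte per variant when there is a single sample, with the 2-bit code 01 meaning a missing genotype).

```shell
#!/bin/sh
# Sketch: build a one-sample, all-missing PLINK 1 binary fileset whose .bim
# lists variants in a chosen order. All names here are hypothetical.
set -eu

# Usage: make_order_template REAL_BIM ORDER_TXT OUT_PREFIX
make_order_template() {
    real_bim=$1; order=$2; out=$3

    # .bim: copy each variant's row from the real .bim, in the order given
    # by ORDER_TXT (one variant ID per line). IDs are assumed to be unique.
    awk 'NR == FNR { row[$2] = $0; next }
         ($1 in row) { print row[$1] }' "$real_bim" "$order" > "$out.bim"

    # .fam: a single placeholder sample; its ID must not occur in the real data.
    printf 'TEMP TEMP 0 0 0 -9\n' > "$out.fam"

    # .bed: 3 magic bytes (variant-major mode), then one byte per variant.
    # With a single sample, the byte 0x01 encodes one missing genotype.
    n=$(awk 'END { print NR }' "$out.bim")
    {
        printf '\154\033\001'
        i=0
        while [ "$i" -lt "$n" ]; do
            printf '\001'
            i=$((i + 1))
        done
    } > "$out.bed"
}

# Toy demonstration: rs_B precedes rs_A in the "real" .bim; we want rs_A first.
printf '1\trs_B\t0\t1000\tA\tG\n1\trs_A\t0\t1000\tC\tT\n1\trs_C\t0\t2000\tG\tA\n' > toy_real.bim
printf 'rs_A\nrs_B\nrs_C\n' > toy_order.txt
make_order_template toy_real.bim toy_order.txt toy_temp
cut -f2 toy_temp.bim

# With real data, the template would then feed steps 2 and 3, e.g.:
#   plink --bfile temp --bmerge real --out merged
#   plink --bfile merged --remove temp.fam --make-bed --out sorted
```

It is probably worth comparing the ID column of the final .bim against the ID list before relying on the result.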


Thank you so much and PLINK is amazing!
