PLINK .ped file from vcf file with 2 alleles
2.7 years ago
Liubov ▴ 20

Hi!

I got my .vcf files after doing variant calling with GATK HaplotypeCaller.

I am new to PLINK and would like to know how to get a set of PLINK files (.ped, .map) from the VCF file for somatic cells. So far I have used the following:

plink --vcf file.vcf --recode --out PLINKfile

But then the .ped file seems to contain information about only one of the alleles:

person1  person1  0  0  0  -9  G  A  G  C  C  C ...
person2  person2  0  0  0  -9  G  A  G  C  T  C ...
person3  person3  0  0  0  -9  G  T  C  C  C  C ...

As I understand, for SNPs it should have 2 letters at each position, one for each allele, so it should look like this:

person1  person1  0  0  0  -9  GA  AA  GG  CT  CC  CC ...
person2  person2  0  0  0  -9  GA  AA  GG  CC  TC  CC ...
person3  person3  0  0  0  -9  GA  TA  CG  CT  CC  CC ...

How do I do that? Also, is there a way to encode deletions and insertions, especially if they are longer than 1 nucleotide?

Thank you

PLINK PED VCF • 1.1k views

ped/map is a very outdated and generally poor, memory-inefficient way to store data. I would recommend using another format if you can.

2.7 years ago

PLINK has had better support for VCF than .ped files for the last 7 years. I could answer this question, but I would probably be doing you a disservice; you probably shouldn't bother to learn how this format works at all.

The PLINK formats that are actually worth learning about are .bed and .pgen; those are much more efficient than VCF for their purposes.


Thank you!

Actually, I think I found the answer myself: the two alleles are separated by a space, so the first output does include both alleles; it just prints them in 2 consecutive columns.
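To illustrate that layout, here is a minimal Python sketch that pairs up the allele columns of one .ped line. It assumes the standard six leading columns (family ID, individual ID, father, mother, sex, phenotype) followed by two whitespace-separated allele columns per variant. The function name `parse_ped_line` is made up for this example, and the sample line is the first row from the question with the trailing "..." dropped.

```python
# Sketch: read one PLINK .ped line and group the allele columns into pairs.
# Assumes 6 leading columns, then two allele columns per variant.

def parse_ped_line(line):
    """Return (leading_columns, genotypes) for one .ped line."""
    fields = line.split()
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError(
            "expected 6 leading columns plus an even number of allele columns"
        )
    meta = fields[:6]        # FID, IID, father, mother, sex, phenotype
    alleles = fields[6:]     # flat list: allele1, allele2, allele1, allele2, ...
    genotypes = [
        (alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)
    ]
    return meta, genotypes


# First row from the question (hypothetical use; "..." removed).
line = "person1  person1  0  0  0  -9  G  A  G  C  C  C"
meta, genotypes = parse_ped_line(line)
print(meta[1], genotypes)
```

Read this way, the six allele columns in that row are three genotypes, G/A, G/C and C/C, rather than six single-allele sites.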

My plan was to create .bed, .bim and .fam from .ped and .map afterwards, and run DFAM (family-based association for disease traits). But now that I think of it, there should be a way to create those files directly from the .vcf.

UPD: --make-bed should do it


Yes, use --make-bed to produce the .bed/.bim/.fam files directly; there is no need to make intermediate .map/.ped files.
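As a rough sketch of what that looks like in a script, the snippet below builds the PLINK command with the same input and output names used in the question (`file.vcf`, `PLINKfile`). The helper name `make_bed_command` is made up for this example. It only runs PLINK if a `plink` executable is on the PATH and the input file exists; otherwise it just prints the command.

```python
import os
import shutil
import subprocess


def make_bed_command(vcf_path, out_prefix, plink="plink"):
    """Build the argument list for a direct VCF -> .bed/.bim/.fam conversion."""
    return [plink, "--vcf", vcf_path, "--make-bed", "--out", out_prefix]


cmd = make_bed_command("file.vcf", "PLINKfile")
print(" ".join(cmd))

# Run only when PLINK is installed and the input exists (both are assumptions
# about the local setup, so the call is guarded).
if shutil.which(cmd[0]) and os.path.exists("file.vcf"):
    subprocess.run(cmd, check=True)
```

This swaps --recode from the original command for --make-bed, so the binary fileset is written without going through .ped/.map.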
