Hi everyone,
I am a newbie to the whole bioinformatics world and I need to analyse WGS data from several case samples. I now have several individual .vcf files and would like to use PLINK for quality control, population stratification analysis and ultimately GWAS.
As PLINK requires .bim, .fam and .bed files to do such analysis, I need to create a single .ped file from the multiple .vcf files I have.
So, I tried using a for loop:
for file in /path-to-folder-with-all-vcf-files/*.vcf
do
plink --vcf ${file} --allow-extra-chr --recode --out /path-where-I-want-to-save-the-new-ped-file/${file}.ped
done
but my job keeps giving me back the same error:
Exited with exit code 5.
and at the end it simply says:
Error: Missing --vcf parameter.
For more information, try "plink --help <flag name>" or "plink --help | more"
I already read the PLINK documentation but I cannot find the mistake.
From what I understand, I am telling --vcf to use ${file} to execute the rest of the commands (--allow-extra-chr, --recode and --out), so why is it telling me there is no parameter?
Or is there another way to convert the multiple .vcf files into a single .ped file that you could recommend?
Thank you in advance for your help!
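One way to see what a loop like the one above actually iterates over is to run it against a scratch folder and print the commands instead of executing them. The sketch below is only an illustration: the temporary folder and the two file names are made up, and it does not call plink at all. It also shows one possible way to derive an --out prefix from the file name, since --out expects an output prefix and ${file} already contains the full input path. Whether an unmatched pattern explains the exact error message quoted above is not something this sketch can confirm.

```shell
#!/bin/sh
# Hypothetical demo folder with two compressed VCFs (names are made up).
demo_dir=$(mktemp -d)
touch "$demo_dir/X21-001_a.vcf.gz" "$demo_dir/X21-002_b.vcf.gz"
out_dir=/path-where-I-want-to-save-the-new-ped-file   # placeholder, never written to

# A pattern that matches nothing is handed to the loop as literal text.
for file in "$demo_dir"/*.vcf; do
    echo "unguarded loop sees: $file"
done

# Guarded loop: accept both extensions and skip patterns that matched nothing.
matched=0
for file in "$demo_dir"/*.vcf "$demo_dir"/*.vcf.gz; do
    [ -e "$file" ] || continue
    name=$(basename "$file")   # drop the input directory
    prefix=${name%.gz}         # drop an optional .gz suffix
    prefix=${prefix%.vcf}      # drop the .vcf suffix
    # Print the command only; plink adds the file extensions to the prefix itself.
    echo "would run: plink --vcf $file --allow-extra-chr --recode --out $out_dir/$prefix"
    matched=$((matched + 1))
done
echo "files matched: $matched"

rm -rf "$demo_dir"
```

If the printed commands look right for the real folder, the echo can be removed so that plink is called directly.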
What is the output of the following command?
ls /path-to-folder-with-all-vcf-files/*.vcf*
If your .vcf files are actually .vcf.gz files then your script won't capture those files. Also you might want to consider using plink2, as it's faster and better in most respects.
Thank you for the fast response! This was indeed a problem. I changed the command to:
However, this gave me 50 individual .ped, .log, .map and .nosex files. I tried combining the 50 .ped files into one, but I am having issues with my sample names. They contain an underscore (X21-001_a or X21-002_b) to indicate the type of sample used. The problem is that this underscore shifts the content of the columns to the right (the first part is recognized as FamilyID=X21-001 and the second part as SampleID=a).
So I wanted to ask you about plink2. I tried searching Google but I am lost. Could you please point me in the right direction to find the command I need to convert my 50 .vcf.gz files into a single .ped, or give me an example? Thank you in advance.
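A possible route, sketched here under several assumptions, is to merge the single-sample files into one multi-sample VCF with bcftools first and then convert that file once with PLINK 1.9. The sketch assumes bcftools is installed, that the .vcf.gz files are bgzip-compressed, and that each file holds one sample. The paths, the DRY_RUN switch and the run helper are made up for illustration. With --double-id alone, PLINK 1.9 uses the full VCF sample ID as both family ID and individual ID, so X21-001_a is not split at the underscore.

```shell
#!/bin/sh
# Sketch only. Paths are placeholders; DRY_RUN=1 (the default) prints commands instead of running them.
vcf_dir=${VCF_DIR:-/path-to-folder-with-all-vcf-files}
out_prefix=${OUT_PREFIX:-merged}
DRY_RUN=${DRY_RUN:-1}

run() {
    # Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# 1. bcftools merge needs indexed inputs; collect the file names in a list.
list_file=$(mktemp)
for file in "$vcf_dir"/*.vcf.gz; do
    [ -e "$file" ] || continue          # skip the pattern if nothing matched
    run bcftools index --tbi "$file"
    echo "$file" >> "$list_file"
done

# 2. Merge the single-sample VCFs into one multi-sample VCF.
run bcftools merge --file-list "$list_file" -Oz -o "$out_prefix.vcf.gz"

# 3. Convert once. --make-bed writes .bed/.bim/.fam; use --recode instead for .ped/.map.
run plink --vcf "$out_prefix.vcf.gz" --double-id --allow-extra-chr --make-bed --out "$out_prefix"
```

Setting DRY_RUN=0 executes the commands. One caveat: after merging variant-only VCFs, a site that is absent from a sample's file is written as a missing genotype for that sample, which can distort missingness-based QC, so it is worth checking how your VCFs were called before relying on the merged file. plink2 also reads VCF through --vcf, but its sample ID handling should be checked in its own documentation.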