Hello,
I ran imputation using TOPMED and received vcf.gz files as output, one for each of the 22 chromosomes. I would now like to run a PRS analysis using PRSice-2.
How can I properly convert my vcf.gz files to a concatenated plink binary file or a concatenated .bgen file and retain the SNP ID, p-value, and allele columns?
According to the PRSice tutorial as well as other forums, PRSice does not accept vcf.gz files, only plink bed files or .bgen files.
Thus, I attempted to make a concatenated .bgen file using this:
vcf-concat *.vcf.gz | gzip -c > imputedtopmedresults.concat.ALLchrs.vcf.gz
ml qctool
qctool -g imputedtopmedresults.concat.ALLchrs.vcf.gz -vcf-genotype-field GP -og imputedtopmedresults.concat.ALLchrs.converted.bgen
I then fed this imputedtopmedresults.concat.ALLchrs.converted.bgen file in as my base data for the PRSice command:
Rscript PRSice.R \
--prsice ./PRSice_linux \
--base imputedtopmedresults.concat.ALLchrs.converted.bgen \
--target MDD.QC.gz \
--thread 1 \
--stat BETA \
--beta \
--binary-target F
This error was returned:
Error: Column for the effective allele must be provided!
Error: Column for the SNP ID must be provided!
Error: Column for the P-value must be provided!
It seemed clear that my SNP IDs, p-values, and alleles were not retained during the conversion from vcf.gz to .bgen. I then tried another method, converting my vcf.gz files to plink binary files:
for i in {1..22}; do
bcftools norm -Ob -m-any chr$i.dose.vcf.gz > chr$i.dose.bcf
done
for i in {1..22}; do
bcftools index chr$i.dose.bcf
done
ml plink
for i in {1..22}; do
# ${i} needs braces: in chr$i_ped the shell would expand the unset variable i_ped instead
plink --bcf chr$i.dose.bcf --const-fid 0 --make-bed --out chr${i}_ped
done
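Since the loop above produces one fileset per chromosome, a concatenated fileset would still need a merge step. A minimal sketch of that step, assuming the per-chromosome filesets are named chr1_ped through chr22_ped (the intended output names) and using allchrs_ped as a hypothetical output prefix. The plink call only runs if plink and the chr1 fileset are actually present:

```shell
# Build a merge list with the fileset prefixes for chromosomes 2..22
# (chromosome 1 is passed separately via --bfile).
merge_list=$(mktemp)
for i in $(seq 2 22); do
  echo "chr${i}_ped"
done > "$merge_list"

# Merge into a single fileset; skipped when plink or the input files are missing.
# allchrs_ped is a hypothetical output prefix.
if command -v plink >/dev/null 2>&1 && [ -f chr1_ped.bed ]; then
  plink --bfile chr1_ped --merge-list "$merge_list" --make-bed --out allchrs_ped
fi
```

Each line of the merge list holds a single prefix, which plink 1.9 reads as a binary fileset (.bed/.bim/.fam).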
I fed the plink binary file into PRSice and the same error occurred.
I went back to check the vcf.gz file and these headers are there:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
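A header check of this kind can be sketched as follows. This is only an illustration on a toy file: the file name toy.vcf.gz, the sample S1, and the list of statistic column names searched for (P, PVAL, BETA, OR) are assumptions, not taken from my real data.

```shell
# Create a tiny gzipped VCF standing in for one imputed chromosome file.
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\nchr1\t100\tchr1:100:A:G\tA\tG\t.\tPASS\tAF=0.1;R2=0.9\tGT:DS\t0|1:1.0\n' \
  | gzip -c > "$tmpdir/toy.vcf.gz"

# Print the column header line, one column name per line.
header_cols=$(gzip -dc "$tmpdir/toy.vcf.gz" | grep -m1 '^#CHROM' | tr '\t' '\n')
echo "$header_cols"

# Count header columns whose name looks like a p-value or effect size.
# grep -c exits non-zero on zero matches, hence the "|| true".
stat_cols=$(echo "$header_cols" | grep -c -i -E '^(P|PVAL|BETA|OR)$' || true)
echo "summary-statistic columns found: $stat_cols"
```

In this toy file the header holds only the fixed VCF columns plus one sample column, so none of the searched statistic names appear.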
So, again: how can I properly convert my vcf.gz files to a concatenated plink binary file or a concatenated .bgen file and retain the SNP ID, p-value, and allele columns?
Or perhaps TOPMED doesn't provide pvalues, etc., and I am missing something here...?
Thank you