I have a large SNP data set in PLINK format. I used SHAPEIT2 for the phasing step, and the output files, gwas.phased.haps and gwas.phased.sample, look normal.
.haps file:
22 rs35416799 16869887 A G
22 rs115144709 17000277 A G
22 rs5746642 17055818 A G
I then used gwas.phased.haps as input to IMPUTE2 for the imputation step. It generated the .impute2 file along with other files, but the output looks like this:
.impute2 file:
--- 22:16050075:A:G 16050075 A G
--- 22:16050115:G:A 16050115 G A
--- 22:16050213:C:T 16050213 C T
--- rs367963583:16050922:T:G 16050922 T G
Is this normal output? The IMPUTE2 example files include some .impute2 files, but they do not look like this. Thank you for your advice.
Desired output:
22 16050075 16050075 A G
22 16050115 16050115 G A
22 16050213 16050213 C T
22 rs367963583 16050922 T G
Typically the next step after imputation is additional QC, followed by association testing in a program like SNPTEST.
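The output you show above should be OK: IMPUTE2 writes "---" in the first column for SNPs that were imputed rather than typed in your study data. Whether you need to reformat it depends on the next step you intend, so we need to know that to really be able to answer this.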
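If the next tool does turn out to need the chromosome in the first column, the desired layout can be produced with a small script. The following is only a minimal sketch in Python, not part of IMPUTE2: the function names are made up for illustration, the chromosome is assumed to be supplied by you (IMPUTE2 is run one chromosome at a time), and the ID rule is inferred from the four example rows in the question (keep the rs ID when there is one, otherwise use the position). All columns from the position onward, including the genotype probabilities, are copied unchanged.

```python
def relabel_impute2_line(line, chrom):
    """Rewrite the first two columns of one .impute2 line.

    Column 1 ("---" for imputed SNPs) becomes the chromosome.
    Column 2 becomes the rs ID if the original ID starts with one,
    otherwise the position (column 3).
    """
    fields = line.split()
    snp_id, position = fields[1], fields[2]
    first_token = snp_id.split(":")[0]
    new_id = first_token if first_token.startswith("rs") else position
    # Keep position, alleles and genotype probabilities as they are.
    return " ".join([chrom, new_id] + fields[2:])


def relabel_impute2_file(in_path, out_path, chrom):
    """Apply relabel_impute2_line to every non-empty line of a file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(relabel_impute2_line(line, chrom) + "\n")
```

For example, relabel_impute2_file("gwas.impute2", "gwas.relabelled.impute2", "22") would write a relabelled copy for chromosome 22; both file names here are placeholders.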
Also, when you ran shapeit2, did you conduct strand flipping?
Good luck!