How to convert PLINK files to HapMap format
8 months ago
Sofia • 0

Is there a way to convert PLINK .bed files into HapMap genotype format? I have bed/bim/fam PLINK files that I want to analyse with a program that requires HapMap genotype format.

GWAS Plink • 2.1k views
8 months ago
marco.barr ▴ 150

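Hi Sofia, you can use PLINK's recode options. For example:

plink --bfile your_file --recode tab --out your_file

Both --bfile and --out take the file prefix without an extension; this writes your_file.ped and your_file.map. Run the same command for each of your filesets.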

Regards, Marco
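
As far as I know, PLINK 1.9 has no direct HapMap export, so a second step is usually needed after recoding. The following is only a minimal sketch of one possible route, assuming a transposed fileset (.tped/.tfam) and single-character alleles. The toy files, the `+` strand, the `NA` placeholders, and the allele order (order of first appearance) are all assumptions for illustration, so check them against what your downstream program expects.

```shell
# Toy transposed fileset standing in for the output of
#   plink --bfile your_file --recode transpose --out your_file
# (sample and variant IDs are made up for illustration)
cat > toy.tfam <<'EOF'
F1 S1 0 0 0 2
F2 S2 0 0 0 2
EOF
cat > toy.tped <<'EOF'
1 rs1 0 1000 A A A G
1 rs2 0 2000 0 0 C C
EOF

awk '
  # Record each allele the first time it is seen at the current variant
  function note(x) {
    if (!(x in seen)) { seen[x] = 1; alleles = (alleles == "" ? x : alleles "/" x) }
  }
  # First file (.tfam): collect the individual IDs for the header
  NR == FNR { ids[++n] = $2; next }
  # First .tped row: print the 11 fixed HapMap columns, then one column per sample
  FNR == 1 {
    printf "rs#\talleles\tchrom\tpos\tstrand\tassembly#\tcenter\tprotLSID\tassayLSID\tpanelLSID\tQCcode"
    for (i = 1; i <= n; i++) printf "\t%s", ids[i]
    printf "\n"
  }
  # Every .tped row: join each allele pair into one call, NN when missing
  {
    split("", seen); alleles = ""; calls = ""
    for (i = 5; i < NF; i += 2) {
      a = $i; b = $(i + 1)
      if (a == "0" || b == "0") { calls = calls "\tNN"; continue }
      note(a); note(b)
      calls = calls "\t" a b
    }
    if (alleles == "") alleles = "NA"   # assumed placeholder when every call is missing
    printf "%s\t%s\t%s\t%s\t+\tNA\tNA\tNA\tNA\tNA\tNA%s\n", $2, alleles, $1, $4, calls
  }
' toy.tfam toy.tped > toy.hmp.txt

cat toy.hmp.txt
```

On real data, replace the toy files with the .tfam and .tped written by PLINK. Indels and other multi-character alleles are not handled by this sketch.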


Thanks Marco. By the way, do you have any idea why I get "Warning: Skipping --assoc/--model since less than two phenotypes are present." after running an association analysis?


What type of phenotype is in your data? It looks like it does not have affection status. In the 6th column of a PLINK .ped or .fam file, the default coding for affection status is 1 = unaffected, 2 = affected, 0 = missing, -9 = missing. Is it a quantitative trait? If so, you can run something like this in PLINK:

plink --bfile your_data --assoc --pheno qt.phe --out your_data_assoc
# qt.phe should contain FID, IID, and the quantitative trait value for each sample.
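
Before running the association, it can help to see which codes actually occur in the 6th column. This is a small sketch on a made-up toy .fam file (the file name and IDs are assumptions); point the awk command at your own .fam file instead.

```shell
# Toy .fam file: FID IID father mother sex phenotype (IDs are made up)
cat > toy_check.fam <<'EOF'
F1 S1 0 0 0 -9
F2 S2 0 0 0 -9
F3 S3 0 0 1 2
F4 S4 0 0 2 1
EOF

# Count how often each phenotype code appears in column 6
pheno_counts=$(awk '{ count[$6]++ } END { for (v in count) print v, count[v] }' toy_check.fam)
echo "$pheno_counts"
```

If only -9 or 0 shows up, PLINK treats every phenotype as missing, which matches the "No phenotypes present" note.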

This is what my .fam file looks like (screenshot not preserved).

And I ran the following command:

plink --bfile GWAS/samples_delivery --assoc --out GWAS/assoctest

and got this:

PLINK v1.90b7.2 64-bit (11 Dec 2023)           www.cog-genomics.org/plink/1.9/
(C) 2005-2023 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to GWAS/trying.log.
Options in effect:
  --assoc
  --bfile GWAS/samples_delivery
  --out GWAS/trying

7818 MB RAM detected; reserving 3909 MB for main workspace.
712189 variants loaded from .bim file.
133 people (0 males, 0 females, 133 ambiguous) loaded from .fam.
Ambiguous sex IDs written to GWAS/trying.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 133 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands treat these as missing.
Total genotyping rate is 0.9931.
712189 variants and 133 people pass filters and QC.
Note: No phenotypes present.
Warning: Skipping --assoc/--model since less than two phenotypes are present.


Please answer my question above: what type of data is it, binary (cases and controls) or quantitative? You need to update the phenotype in the 6th column and the sex in the 5th column of your PLINK files. If the data is binary and associated with a disease, the 6th column needs to be coded as 1 = control, 2 = case, 0 = missing, -9 = missing.


The data is binary and associated with a disease, but we're studying only the cases, and there are other phenotypes describing severity.


OK. First update your phenotypes, because all the values in the 6th column are -9 (missing), and then run --assoc. You can do the following:

# Update phenotypes in your data. Here 2 = case
awk '{print $1, $2, "2"}' your_data.fam > pheno.txt

plink --bfile your_data --pheno pheno.txt --make-bed --out your_data_phenoUP

# Then run --assoc
plink --bfile your_data_phenoUP --assoc --out your_data_phenoUP_assoc
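
With every sample coded as 2 there is no unaffected group to compare against. If a two-level severity grouping among the cases is what you want to test, the phenotype file could be built from a severity table in the same way. This is only a sketch under that assumption: `severity.txt`, its labels (`mild`, `severe`), and the toy IDs are hypothetical, and the 1/2 coding means PLINK will treat mild as "control" and severe as "case".

```shell
# Toy .fam file; the IDs are made up for illustration
cat > toy_cases.fam <<'EOF'
F1 S1 0 0 0 -9
F2 S2 0 0 0 -9
F3 S3 0 0 0 -9
F4 S4 0 0 0 -9
EOF

# Hypothetical severity table: IID and a severity label (S4 is absent on purpose)
cat > severity.txt <<'EOF'
S1 mild
S2 severe
S3 mild
EOF

# Code mild as 1 and severe as 2; samples without a known label become -9 (missing)
awk 'NR == FNR { sev[$1] = $2; next }
     {
       code = -9
       if (sev[$2] == "mild") code = 1
       else if (sev[$2] == "severe") code = 2
       print $1, $2, code
     }' severity.txt toy_cases.fam > pheno_severity.txt

cat pheno_severity.txt
```

The resulting file has the same three-column layout (FID, IID, phenotype) as pheno.txt, so it could be passed to --pheno in the same way.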

Thank you so much for helping me out! I followed the steps you mentioned but still got the same issue, as shown in the snapshot (screenshot not preserved).


Hey, can you please show me a few lines of your phenotype-updated .fam file? I see 7 columns in it, which might be the problem.


Yes, sure! (screenshot not preserved)


Please show me a few lines of the phenotype-updated .fam file. You can run head your_data_phenoUP.fam and paste the result here. Please do not use a screenshot.
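
A quick way to check the column count without eyeballing the file is to let awk count fields per line. A small sketch on a toy file (the file name and IDs are made up, and the third line has a deliberate extra field):

```shell
# Toy .fam file with one malformed line (7 fields instead of 6)
cat > toy_phenoUP.fam <<'EOF'
F1 S1 0 0 0 2
F2 S2 0 0 0 2
F3 S3 0 0 0 2 extra
EOF

# How many lines have each number of fields
awk '{ n[NF]++ } END { for (k in n) print k, n[k] }' toy_phenoUP.fam | sort -n > field_counts.txt
cat field_counts.txt

# Which lines do not have exactly 6 fields
awk 'NF != 6 { print FNR ": " NF " fields" }' toy_phenoUP.fam > bad_lines.txt
cat bad_lines.txt
```

A valid .fam file should report 6 fields on every line, so bad_lines.txt should come out empty.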


SC899359_PC75420_A05 SC899359_PC75420_A05 0 0 0 2
SC899360_PC75415_D01 SC899360_PC75415_D01 0 0 0 2
SC899361_PC75425_H05 SC899361_PC75425_H05 0 0 0 2
SC899363_PC75435_H07 SC899363_PC75435_H07 0 0 0 2
SC899365_PC75417_A04 SC899365_PC75417_A04 0 0 0 2
SC899366_PC75423_G10 SC899366_PC75423_G10 0 0 0 2
SC899367_PC75422_A05 SC899367_PC75422_A05 0 0 0 2
SC899368_PC75431_D08 SC899368_PC75431_D08 0 0 0 2
SC899369_PC75433_A02 SC899369_PC75433_A02 0 0 0 2
SC899370_PC75434_F03 SC899370_PC75434_F03 0 0 0 2


This looks good. Please try using the --allow-no-sex flag:

plink --bfile your_data_phenoUP --assoc --allow-no-sex --out your_data_phenoUP_assoc

Alternatively, if you have chrX in your data and do not want to ignore sex, you can first impute (or update) sex in your PLINK files and then run --assoc:

plink --bfile your_data_phenoUP --impute-sex --make-bed --out your_data_phenoSexUP
plink --bfile your_data_phenoSexUP --assoc --allow-no-sex --out your_data_phenoSexUP_assoc

Thank you so much, it actually worked!


These are the first lines of the output. Is it normal to have NA in the P column?

 CHR                                             SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR 
   0                          22:24301858_CNV_GSTT2B          0    0       NA       NA    0           NA           NA           NA 
   0                 22:24301858_CNV_GSTT2B_Ilmndup1          0    0        0       NA    C           NA           NA           NA 
   0                 22:24301858_CNV_GSTT2B_Ilmndup2          0    0        0       NA    C           NA           NA           NA 
   0                 22:24301858_CNV_GSTT2B_Ilmndup3          0    0        0       NA    C           NA           NA           NA 
   0                                      rs12097550          0    A  0.03759       NA    G            0            1           NA 
   0                                       rs2229595          0    A 0.003759       NA    G            0            1           NA 
   0                                seq-rs61762504.1          0    0        0       NA    G           NA           NA           NA 
   0                                seq-rs61762504.2          0    0       NA       NA    0           NA           NA           NA 
   1                                       rs9701055     565433    A        0       NA    G           NA           NA           NA 

If you run your data with PLINK 2.0 (--glm), the result file will include an ERRCODE column showing the reason behind each "NA" p-value. See also the thread "PLINK p value returning NA".
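
To get a feel for how many variants are affected in a PLINK 1.9 .assoc file, the P column can be tallied with awk. A minimal sketch on a toy file whose rows are copied from the output posted above (the file name toy.assoc is an assumption; use your own .assoc file instead):

```shell
# Toy .assoc file with three of the rows shown in the thread
cat > toy.assoc <<'EOF'
 CHR         SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR
   0  rs12097550          0    A  0.03759       NA    G            0            1           NA
   0   rs2229595          0    A 0.003759       NA    G            0            1           NA
   1   rs9701055     565433    A        0       NA    G           NA           NA           NA
EOF

# Find the P column from the header, then count NA and non-NA rows
summary=$(awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "P") p = i; next }
               { if ($p == "NA") na++; else ok++ }
               END { print "NA:", na + 0, "non-NA:", ok + 0 }' toy.assoc)
echo "$summary"
```

Looking the column up by name rather than by position keeps the command usable if the column layout differs between output files.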
