13 months ago
Hello,
I am trying to update the sex information for my cohort's VCF data using PLINK.
This is the command I am running:
plink --bed input_bed --bim input_bim --fam input_fam --update-sex input_sex.txt --make-bed --out output_name > stdout.out
For some reason, I get this error in stdout.out: "--update-sex: 0 people updated, 1066 IDs not present". I suspect a mapping issue between the .fam file I am trying to update and my sex.txt file. Here is what each of them looks like:
.fam file (produced from --make-bed and an input vcf file)
0001 1 0 0 0 -9
0002 1 0 0 0 -9
0003 1 0 0 0 -9
0004 1 0 0 0 -9
0005 1 0 0 0 -9
0006 1 0 0 0 -9
0007 1 0 0 0 -9
0008 1 0 0 0 -9
0009 1 0 0 0 -9
0010 1 0 0 0 -9
sex.txt file (produced from the cohort's pedigree file)
FID IID Sex
0001 0001_1 1
0002 0004_1 2
0003 0007_1 1
0004 0010_1 1
0005 0015_1 1
0006 0016_1 2
0008 0022_1 2
0010 0028_1 2
What is going wrong here? And how could I fix it?
Thank you, I really appreciate any help.
I also think the problem may lie in how the .fam file was first created with --make-bed from the original VCF file.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 0001_1 0002_1 0003_1 0004_1 0005_1
This is the header line with the five sample IDs from the VCF file. Once all the bfiles are made, this is the fam file:
I am unsure why the individual IDs are not kept. Does anyone know?
For reference, the columns of a .fam file are Family ID, Individual ID, Paternal ID, Maternal ID, Sex, and Phenotype. Clearly the individual IDs here are wrong (all of them are 1).
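One way to confirm the mismatch is to compare the (FID, IID) pairs in the two files directly. The following is a minimal Python sketch using the first few rows shown above. The helper name `id_pairs` is hypothetical, and the sketch assumes that --update-sex matches rows on the exact FID and IID strings in the first two columns.

```python
def id_pairs(lines, skip_header=False):
    """Return the set of (FID, IID) pairs from whitespace-delimited lines."""
    rows = [ln.split() for ln in lines if ln.strip()]
    if skip_header:
        rows = rows[1:]
    return {(r[0], r[1]) for r in rows}


# First rows of the .fam file from the post
fam_lines = [
    "0001 1 0 0 0 -9",
    "0002 1 0 0 0 -9",
    "0003 1 0 0 0 -9",
]

# First rows of sex.txt from the post (with its header line)
sex_lines = [
    "FID IID Sex",
    "0001 0001_1 1",
    "0002 0004_1 2",
    "0003 0007_1 1",
]

fam_ids = id_pairs(fam_lines)
sex_ids = id_pairs(sex_lines, skip_header=True)

matched = fam_ids & sex_ids      # pairs that an exact match could update
unmatched = sex_ids - fam_ids    # pairs with no counterpart in the .fam

print(len(matched), "matched;", len(unmatched), "not present in .fam")
# prints: 0 matched; 3 not present in .fam
```

No pair matches because the .fam has IID 1 where sex.txt has IIDs such as 0001_1, which is consistent with the "0 people updated" message.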
Thank you again.
I believe PLINK treats the underscore "_" as a separator between FID and IID when it reads sample IDs from a VCF. Your original IDs are all a number followed by _1, so the number becomes the FID and every sample ends up with the same IID (1). That is probably why you see identical IIDs. So I believe fixing this only requires replacing xxxx_1 with 1 throughout the sex.txt file (the FID column also has to match the FIDs in the .fam file).
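Here is a minimal Python sketch of that rewrite, under a few assumptions: the second column of sex.txt holds the original VCF sample IDs, PLINK split each of them at its single underscore, and the helper names `split_sample_id` and `rewrite_sex_rows` are hypothetical. Note that in the posted sex.txt the pedigree FID does not always equal the sample ID prefix (for example `0002 0004_1`), so the sketch derives both FID and IID from the sample ID rather than keeping the pedigree FID.

```python
def split_sample_id(sample_id, delim="_"):
    """Split a VCF sample ID into (FID, IID) at the delimiter."""
    fid, sep, iid = sample_id.partition(delim)
    if not sep:
        # Assumption: with no delimiter, FID and IID are both the sample ID.
        return sample_id, sample_id
    return fid, iid


def rewrite_sex_rows(lines):
    """Rebuild 'FID IID Sex' rows keyed by the split sample ID."""
    out = []
    for ln in lines:
        fields = ln.split()
        # Skip blank lines, short lines, and the header row
        if len(fields) < 3 or fields[0] == "FID":
            continue
        _pedigree_fid, sample_id, sex = fields[:3]
        fid, iid = split_sample_id(sample_id)
        out.append(f"{fid} {iid} {sex}")
    return out


sex_lines = [
    "FID IID Sex",
    "0001 0001_1 1",
    "0002 0004_1 2",
    "0003 0007_1 1",
]

for row in rewrite_sex_rows(sex_lines):
    print(row)
# prints:
# 0001 1 1
# 0004 1 2
# 0007 1 1
```

The rewritten rows can be saved to a new file and passed to --update-sex. An alternative would be to re-import the VCF with --double-id or --const-fid so that PLINK does not split the sample IDs at all; the FID column of sex.txt would then need to match whatever that flag produces.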
P.S. I know this is a very old question (I just signed up here), but I hope others can benefit from this small detail.