Hi there,
My goal is to perform a GWAS on RILs, which have all the genotypic information available online in tgeno, bed, bim, fam and vcf formats. I am trying to reduce the size of these files so that they no longer relate to a panel of 205, but a subset (~50) of these RILs.
I created a ped file and then ran this on the command line: plink --file dgrp --keep AllLinesExp.txt
The AllLinesExp.txt file is a list of all the RILs (Family ID/Sample ID) that I will be running an experiment on. About an hour into the analysis I got output with these two annoying lines midway through:
Reading individuals to keep [ AllLinesExp.txt ] ... 0 read
205 individuals removed with --keep option
I tried a txt file that just had Family IDs, e.g.:
line_21
line_40
line_65
and then I tried a txt file with Family ID/Sample ID, e.g.:
line_21/line_21
line_40/line_40
line_65/line_65
Both of these txt files yield 0 individuals read, so all are removed and the analysis stops.
Please help :)
In your Family ID/Sample ID file, did you have a literal '/' between the IDs, or did you have a space or tab? (Space or tab should work; I'd need more information to figure out what went wrong in that case.)
I had a slash... Am going to try a space or tab now :)
A space only read 1 RIL, but a tab worked and read in my subset of 55 RILs! Thank you!!
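For anyone who lands here later, here is a minimal Python sketch of building the --keep file with one tab-separated Family ID and Sample ID per line. It assumes, as in the example above, that the Family ID and the Sample ID are the same string for each RIL; the function name write_keep_file is made up for illustration.

```python
def write_keep_file(line_ids, path):
    """Write one 'FID<TAB>IID' row per line ID, assuming FID == IID."""
    # Plain ASCII and Unix newlines avoid encoding surprises in the ID list.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        for line_id in line_ids:
            line_id = line_id.strip()
            if not line_id:
                continue  # skip blank entries
            handle.write(f"{line_id}\t{line_id}\n")
```

Calling write_keep_file(["line_21", "line_40", "line_65"], "AllLinesExp.txt") should give a file that the plink --file dgrp --keep AllLinesExp.txt command from the question can read.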
Hi Cinnie83, I am trying to remove some patient IDs from my data, using the original PED file. I created a .txt file with the family IDs and patient IDs that I want to remove, in two columns, but it still doesn't work. The analysis seems to run to the end of the process (creating temporary files), and then a message appears saying: Error: duplicates ID.
My command is: $ ./plink --file name --remove IDlist.txt --out subset2 --make-bed
And my IDlist.txt is:
1 2204
2 1146
So I know I have a few duplicates, but I don't understand why the presence of duplicates prevents the removal.
How did you sort out your problem? Do you mind explaining here?
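I can't say for certain why PLINK stops here, but a first step might be to find out exactly which IDs are duplicated. Below is a rough Python sketch that lists every Family ID/Individual ID pair appearing on more than one row. It assumes a whitespace-delimited .ped or .fam file whose first two columns are the Family ID and Individual ID; find_duplicate_ids is just an illustrative name.

```python
from collections import Counter


def find_duplicate_ids(path):
    """Return the (FID, IID) pairs that appear on more than one row."""
    counts = Counter()
    with open(path) as handle:
        for row in handle:
            # Only the first two columns are needed, so stop splitting early.
            fields = row.split(None, 2)
            if len(fields) >= 2:
                counts[(fields[0], fields[1])] += 1
    return sorted(pair for pair, n in counts.items() if n > 1)
```

With the command above, the input would presumably be name.ped, e.g. find_duplicate_ids("name.ped").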
Hi sorry apparently I replied in a new convo down there ˯˯˯
In addition to my reply below, I had a quick look around Biostars, and it seems that if there are any hyphen characters in the VCF file, they must match the text file exactly or it won't work. Also, if you're on a Mac, you can choose which encoding the text file is saved with (unsure about PC): change the encoding from the default when you save the file and try Western - Windows, as that's worked for someone else on here.
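If it helps, here is a small Python sketch for checking an ID list against a .fam file before running PLINK. It flags a byte-order mark, Windows line endings and non-ASCII bytes, and lists the rows of the ID list that do not match any Family ID/Individual ID pair in the .fam file. The function name and report keys are my own, and it assumes both files are whitespace-delimited with the IDs in the first two columns.

```python
def check_id_list(list_path, fam_path):
    """Report common reasons an ID list fails to match a .fam file."""
    with open(list_path, "rb") as handle:
        raw = handle.read()
    report = {
        "has_bom": raw.startswith(b"\xef\xbb\xbf"),
        "has_carriage_returns": b"\r" in raw,
        "has_non_ascii": any(byte > 127 for byte in raw),
    }
    # Collect the (FID, IID) pairs that the .fam file actually contains.
    with open(fam_path) as handle:
        known = {tuple(row.split()[:2]) for row in handle if row.strip()}
    text = raw.decode("utf-8", errors="replace")
    wanted = [tuple(row.split()[:2]) for row in text.splitlines() if row.strip()]
    # A row using '/' instead of a tab shows up here as a single-field tuple.
    report["unmatched"] = [pair for pair in wanted if pair not in known]
    return report
```

Anything listed under "unmatched" is worth comparing character by character with the .fam file, since a different hyphen or a stray invisible character is enough to stop the IDs from matching.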