9.9 years ago
maroisf
Hello Biostars,
I recently imputed a genotyped dataset using SHAPEIT and IMPUTE2. The resulting files are in the Oxford gen/sample format. Because I am using the latest 1000 Genomes reference dataset, I end up with massive gen files: chr is ~130 Gb. I would like to reorder the IDs in the resulting gen/sample files. I looked in software such as GTOOL and QCTOOL for such an option but could not find one.
Is anyone aware of software that can do this, or do I have to code something myself?
Thank you
François
Thank you for your answer.
I'm sorry, I think my question was not clear.
Here is an explanation of the Gen file format, taken from the Oxford website:
Suppose you want to create a genotype for 2 individuals at 5 SNPs whose genotypes are
The correct gen file would be
The Gen file is accompanied by a SAMPLE file:
The order of the IDs (persons) in the sample file corresponds to the column order in the Gen file, and each ID in the sample file is associated with 3 columns in the Gen file. Therefore, in this example, columns 6, 7 and 8 of the Gen file correspond to ID 1, and columns 9, 10 and 11 correspond to ID 2.
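To make the column mapping concrete, here is a small sketch in Python. It assumes the Gen layout described above, with 5 leading columns before the first individual; the function name `gen_columns_for_sample` is made up for illustration.

```python
N_LEADING = 5  # assumption: 5 leading columns precede the first individual


def gen_columns_for_sample(index):
    """Return the 1-based Gen columns for the sample at 0-based position `index`."""
    start = N_LEADING + 3 * index + 1
    return (start, start + 1, start + 2)


print(gen_columns_for_sample(0))  # (6, 7, 8)
print(gen_columns_for_sample(1))  # (9, 10, 11)
```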
My problem is that I would like to change the order of the IDs in the sample file, and hence the column order in the Gen file, keeping in mind that I have over 3000 IDs in my sample file and over 6 000 000 lines × 10 000 columns in my gen file.
Thank you
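For what it is worth, the reordering described above could be sketched as a streaming pass in Python, which only ever holds one line of the gen file in memory. This is a rough sketch under stated assumptions, not a tested tool: it assumes a whitespace-separated gen file with 5 leading columns, a sample file with two header lines and the ID in the first column, and the function names `read_sample_ids` and `reorder_gen` are made up for illustration. The toy data at the bottom is invented purely to show the call.

```python
import io

N_LEADING = 5  # assumption: 5 leading columns before the first individual


def read_sample_ids(sample_lines):
    """Return the first-column IDs, skipping the two header lines of a sample file."""
    rows = [line.split() for line in sample_lines if line.strip()]
    return [row[0] for row in rows[2:]]


def reorder_gen(gen_in, gen_out, old_ids, new_ids):
    """Stream gen_in to gen_out, rearranging each 3-column block into new_ids order."""
    position = {sample_id: i for i, sample_id in enumerate(old_ids)}
    if len(position) != len(old_ids) or sorted(new_ids) != sorted(old_ids):
        raise ValueError("new_ids must be a permutation of unique old_ids")
    order = [position[sample_id] for sample_id in new_ids]
    expected = N_LEADING + 3 * len(old_ids)
    for line in gen_in:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != expected:
            raise ValueError(f"expected {expected} columns, got {len(fields)}")
        out = fields[:N_LEADING]
        for i in order:
            start = N_LEADING + 3 * i
            out.extend(fields[start:start + 3])
        gen_out.write(" ".join(out) + "\n")


# Toy data, invented for illustration only: 2 individuals, 1 SNP.
sample = ["ID_1 ID_2 missing\n", "0 0 0\n", "1 1 0\n", "2 2 0\n"]
old_ids = read_sample_ids(sample)
gen = io.StringIO("SNP1 rs1 1000 A C 1 0 0 0 1 0\n")
out = io.StringIO()
reorder_gen(gen, out, old_ids, ["2", "1"])
print(out.getvalue(), end="")  # SNP1 rs1 1000 A C 0 1 0 1 0 0
```

With real files, the `io.StringIO` objects would be replaced by file handles from `open(...)` (or `gzip.open(..., "rt")` for compressed files), and the rows of the sample file below its two header lines would need to be rewritten in the same new order so that the two files stay consistent.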