I am new to PLINK and have some mouse data from the GigaMUGA genotyping platform, which I have combined using the R package argyle.
When I try to load these files in PLINK, it reports an error in line 2 of the .bim file.
The .bim file starts as follows:
1 UNC6 0.000931264046131699 3010274 T C haplotype_discrimination TRUE TRUE TRUE 1 rs31443144 CTGTTTAAAACCTGAAAATGGAAATAGAAGCAATAAAGACATCACAAACT NA
1 UNCJPD000001 0.00583195724431707 3064340 C A wild_alleles FALSE TRUE TRUE 3 NA GTCTATCTATCTATCTAAATATCTATCGATCTATCTATCTATCTATCAAT NA
1 UNCHS000001 0.0112903755423514 3124559 T C recomb_hotspot FALSE TRUE TRUE 3 rs31289549 AATACATCTCATCATAGAAGCTGTAGTATGTCACAGTGTGTCACACAGTA NA
I have noticed that other .bim files do not have the extra columns after the two allele columns. When I remove them, PLINK gets further, but when it reaches the .bed file it says the .bim has been edited and stops again.
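For reference, a standard .bim file has exactly six whitespace-separated columns: chromosome, variant ID, genetic position, base-pair position, allele 1, allele 2. Below is a minimal Python sketch of the column removal described above. It assumes the first six fields of each line are already in that order (which appears to be the case in the lines shown), and the function and file names are hypothetical. It keeps one output line per variant in the original order, since the .bed file is matched to the .bim purely by line order and count. Whether this avoids the .bed error depends on how the .bed was produced.

```python
def trim_bim_line(line: str) -> str:
    """Return the first six fields of a .bim line, tab-separated."""
    fields = line.split()  # split on any run of spaces or tabs
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}: {line!r}")
    return "\t".join(fields[:6])


def trim_bim(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path, keeping only the six standard columns.

    Variant order is preserved and no variant line is dropped.
    Returns the number of variant lines written.
    """
    count = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if not line.strip():
                continue  # ignore empty lines; they are not variants
            dst.write(trim_bim_line(line) + "\n")
            count += 1
    return count
```

A call such as `trim_bim("combined.bim", "combined.trimmed.bim")` (hypothetical file names) writes the trimmed copy and leaves the original untouched. The returned count should equal the number of markers in the original data.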
Any help with formatting these files would be most appreciated!
As the formatting of that is unclear I will re-enter it. Here are the first three lines of the .bim file (each starting with 1).
Could you please provide a link to where the original files and your intermediate combined PLINK files can be downloaded? From your post and the comment, I am very confused about the whitespace formatting, as it is very different between the two.