2.4 years ago
dongxinyu510
My command is as follows. It ends with an error and I don't know how to fix it:
shapeit \
-M genetic_map_hg38_withX.txt \
--input-bed chr22.bed chr22.bim chr22.fam \
--input-ref RefPanel.hap.gz RefPanel.legend.gz sample.txt \
--exclude-snp chr22.alignments.snp.strand.exclude \
-O chr22.phased
Segmented HAPlotype Estimation & Imputation Tool
* Authors : Olivier Delaneau, Jared O'Connell, Jean-François Zagury, Jonathan Marchini
* Contact : send an email to the OXSTATGEN mail list https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=OXSTATGEN
* Webpage : https://mathgen.stats.ox.ac.uk/shapeit
* Version : v2.r904
* Date : 03/08/2022 17:23:38
* LOGfile : [shapeit_03082022_17h23m38s_89c6ccf7-de3b-47ca-a258-5cda5331d54b.log]
MODE -phase : PHASING GENOTYPE DATA
* Autosome (chr1 ... chr22)
* Window-based model (SHAPEIT v2)
* Reference panel of haplotypes used
* MCMC iteration
Parameters :
* Seed : 1659518618
* Parallelisation: 1 threads
* Ref allele is NOT aligned on the reference genome
* MCMC: 35 iterations [7 B + 1 runs of 8 P + 20 M]
* Model: 100 states per window [100 H + 0 PM + 0 R + 0 COV ] / Windows of ~2.0 Mb / Ne = 15000
Reading SNPs to exclude from input file in [chr22.alignments.snp.strand.exclude]
* 277 snps found in the exclude list
Reading site list in [chr22.bim]
* 9486 sites included
* 277 sites excluded
Reading sample list in [chr22.fam]
* 306 samples included
* 306 unrelateds / 0 duos / 0 trios in 306 different families
Reading genotypes in [chr22.bed]
* Plink binary file SNP-major mode
Reading sample list [/gpfs/lab/liangmeng/members/dongxinyu/SHAPEIT/sample.txt]
* 5094 reference haplotypes included
Reading SNPs in [/gpfs/lab/liangmeng/members/dongxinyu/SHAPEIT/RefPanel.legend.gz]
* 9486 reference panel sites included
* 1049593 reference panel sites excluded
Reading reference haplotypes in [/gpfs/lab/liangmeng/members/dongxinyu/SHAPEIT/RefPanel.hap.gz]
ERROR: Line=9799 found=5096 fields, expected=5094
Your reference file seems to be truncated or otherwise malformed (the error says line 9799 has 5096 fields where 5094 were expected), but it's impossible to know how unless we can see those lines of the file. Is there any reason you are using a piece of software that is many versions out of date? You should strongly consider using a more recent version here. It'll also mean you won't have to work with antiquated file formats like .hap.
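If it helps, here is a rough way to look at those lines yourself. This is only a sketch: it assumes the .hap.gz file is gzip-compressed text with whitespace-separated alleles, one site per line, and the helper names count_fields and fields_on_line are made up for illustration (they are not part of SHAPEIT).

```shell
#!/usr/bin/env bash
# Sketch only: inspect the number of fields per line in a gzipped .hap file.
# count_fields and fields_on_line are hypothetical helper names.

# Print "<number of lines> <fields per line>" for each distinct field count,
# sorted by field count. A consistent file gives exactly one output line.
count_fields() {
    gzip -dc "$1" | awk '{ n[NF]++ } END { for (k in n) print n[k], k }' | sort -k2,2n
}

# Print the field count of a single line, e.g. the line named in the error.
fields_on_line() {
    gzip -dc "$1" | awk -v line="$2" 'NR == line { print NF; exit }'
}
```

For the files in the question you would call count_fields RefPanel.hap.gz and fields_on_line RefPanel.hap.gz 9799. If every line has 5096 fields, the .hap file is probably consistent and it is the sample list (which gave 5094 haplotypes) that does not match it; if only some lines differ, the .hap file itself is likely damaged.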
Do you mean that this version of SHAPEIT2 is too old?
It's very old compared to the new version: shapeit4 is much, much faster (saving lots of energy in the process!), more accurate, and uses more standard file formats (VCF). I'm not related to shapeit at all, I just like to encourage people to use the most up-to-date software where they can. shapeit2 won't give you 'bad' results though.