Thanks in advance to anyone who can help!
1) When I ran
$ `java -jar /hpf/tools/centos6/gatk/3.6.0/GenomeAnalysisTK.jar -R genome.fa -T SelectVariants --variant Newsnp.vcf -o testSNP.vcf -sn AKRJ -sn AJ &`
2) it shows
ERROR variant contigs = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 3, 4, 5, 6, 7, 8, 9, X]
ERROR sequence contigs = [chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chrM, chrX, chrY]
ERROR ------------------------------------------------------------------------------------------
3) the VCF header of Newsnp.vcf is
contig=<ID=1,length=195471971>
contig=<ID=10,length=130694993>
contig=<ID=11,length=122082543>
contig=<ID=12,length=120129022>
contig=<ID=13,length=120421639>
contig=<ID=14,length=124902244>
contig=<ID=15,length=104043685>
contig=<ID=16,length=98207768>
contig=<ID=17,length=94987271>
contig=<ID=18,length=90702639>
contig=<ID=19,length=61431566>
contig=<ID=2,length=182113224>
contig=<ID=3,length=160039680>
contig=<ID=4,length=156508116>
contig=<ID=5,length=151834684>
contig=<ID=6,length=149736546>
contig=<ID=7,length=145441459>
contig=<ID=8,length=129401213>
contig=<ID=9,length=124595110>
contig=<ID=X,length=171031299>
However, the sequence contigs are
chr10,
chr11,
chr12,
chr13,
chr14,
chr15,
chr16,
chr17,
chr18,
chr19,
chr1,
chr2,
chr3,
chr4,
chr5,
chr6,
chr7,
chr8,
chr9,
chrM,
chrX,
chrY
How can I change the VCF header above into the following, so that it is compatible with the reference?
This is what the VCF header should look like after modification:
contig=<ID=chr10,length=130694993>
contig=<ID=chr11,length=122082543>
contig=<ID=chr12,length=120129022>
contig=<ID=chr13,length=120421639>
contig=<ID=chr14,length=124902244>
contig=<ID=chr15,length=104043685>
contig=<ID=chr16,length=98207768>
contig=<ID=chr17,length=94987271>
contig=<ID=chr18,length=90702639>
contig=<ID=chr19,length=61431566>
contig=<ID=chr1,length=195471971>
contig=<ID=chr2,length=182113224>
contig=<ID=chr3,length=160039680>
contig=<ID=chr4,length=156508116>
contig=<ID=chr5,length=151834684>
contig=<ID=chr6,length=149736546>
contig=<ID=chr7,length=145441459>
contig=<ID=chr8,length=129401213>
contig=<ID=chr9,length=124595110>
contig=<ID=chrX,length=171031299>
It would be safer to just use the reference genome which was used for the mapping of the reads.
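To check whether a candidate reference actually matches the VCF before running GATK, something like the sketch below could list the contig names that appear in the variant records but not in the reference. It assumes the reference has a samtools-style index (`genome.fa.fai`, contig name in the first tab-separated column); the function name `missing_contigs` is made up for illustration.

```shell
# missing_contigs VCF FAI
# Prints each contig name used in the VCF records (CHROM column)
# that is absent from the first column of the .fai index.
missing_contigs() {
    awk -F'\t' '
        FNR == NR { ref[$1] = 1; next }                        # first file: the .fai index
        !/^#/ && !($1 in ref) && !seen[$1]++ { print $1 }      # second file: VCF records
    ' "$2" "$1"
}

# Example call (file names taken from the question, .fai name assumed):
# missing_contigs Newsnp.vcf genome.fa.fai
```

An empty result would mean every contig in the VCF is known to that reference.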
If the VCF file is not too big, I would do it manually with a text editor. If a text editor doesn't work, I would try `sed`.
But doing it one contig at a time means 20 substitutions for this header, and characters like "=" or "<" in the pattern may cause problems.
Not only the header needs modification. The CHROM column of every variant record would need the same change.
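A minimal sketch of doing both changes in one pass, using awk rather than sed or a text editor. It assumes the real header lines start with `##contig=<ID=` (the `##` is not shown in the listing above), that the file is tab-separated, and that adding a plain `chr` prefix is all that is needed (there is no `MT` contig here, which would have to become `chrM`). The function name `add_chr_prefix` and the output file name are made up for illustration.

```shell
# add_chr_prefix: read a VCF on stdin, write the renamed VCF to stdout.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
        # contig header lines: ##contig=<ID=1,...> becomes ##contig=<ID=chr1,...>
        /^##contig=<ID=/ && !/^##contig=<ID=chr/ { sub(/<ID=/, "<ID=chr") }
        # every header line (changed or not) is printed as-is
        /^#/ { print; next }
        # variant records: prefix the CHROM column unless it already has one
        $1 !~ /^chr/ { $1 = "chr" $1 }
        { print }'
}

# Example call (input name from the question, output name assumed):
# add_chr_prefix < Newsnp.vcf > Newsnp.chr.vcf
```

This only renames contigs. GATK may also complain if the contig order in the VCF differs from the order in the reference, in which case the records would need re-sorting as well, which this sketch does not do.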