Hi,
I am having the following issue. I used Stacks to produce a .vcf file, and it does not contain header lines for the contigs. I need these in order to edit the ID column, which is all 0s at the moment (see below; the columns are a bit mixed up). This is what the VCF file looks like (the header and the start of the first data row):
##fileformat=VCFv4.2
##fileDate=20191104
##source="Stacks v2.41"
##INFO=<ID=AD,Number=R,Type=Integer,Description="Total Depth for Each Allele">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allele Depth">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
##FORMAT=<ID=GL,Number=G,Type=Float,Description="Genotype Likelihood">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=loc_strand,Number=1,Type=Character,Description="Genomic strand the corresponding Stacks locus aligns on">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Lib1_3A Lib1_3B Lib1_3C Lib1_3D Lib1_3E Lib1_3F Lib1_4A Lib1_4B Lib1_4C Lib1_4D Lib1_4E Lib1_4F Lib1_7A Lib1_7A2 Lib1_7D Lib1_7D2 Lib1_7E Lib1_7F
scaffold_45 10 0 A C 0
And when I use this command to edit the IDs (I want IDs like scaffold_45_10, i.e. CHROM_POS):
bcftools annotate --set-id +'%CHROM\_%POS' myfilename.vcf > new.vcf
I get this error:
[W::vcf_parse] Contig 'scaffold_45' is not defined in the header. (Quick workaround: index the file with tabix.)
Encountered error, cannot proceed. Please check the error output above.
So, as the error says, I would need to define the contigs in the header. To do that, I tried what is explained in the last message of this post: bcftools_Issue#766. Following exactly what is said there, I got a VCF file that looks like this:
##contig=<ID=scaffold_45, length=120>
##contig=<ID=scaffold_62, length=125>
##contig=<ID=scaffold_69, length=132>
##contig=<ID=scaffold_98, length=172>
##contig=<ID=scaffold_154, length=149>
##contig=<ID=scaffold_210, length=184>
(there are more rows)
**VIM - Vi IMproved 8.0 (2016 Sep 12, compiled Jun 21 2019 04:10:35)**
scaffold_45 10 0 A C 0 PASS
This is wrong. The header that was there has been replaced by the list of contig IDs and lengths (which is what I want to include in the header, but without losing the header that was already there). There is also one extra row (VIM - Vi IMproved 8.0 (2016 Sep 12, compiled Jun 21 2019 04:10:35)) that should not be there, and I don't know how it appeared.
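To illustrate the result I am after, here is a minimal Python sketch that keeps the original header and inserts the contig lines just before the #CHROM line. The function name `add_contig_lines` and the (name, length) input are only for illustration; the lengths would have to come from somewhere else (for example the reference index), since the VCF itself does not store them. It is a sketch of the intended outcome, not a tested fix for my file.

```python
def add_contig_lines(vcf_lines, contigs):
    """Return the VCF lines with ##contig entries inserted before #CHROM.

    vcf_lines: lines of the original VCF, without trailing newlines.
    contigs:   (name, length) pairs; a hypothetical input for this sketch.
    """
    contig_lines = [
        f"##contig=<ID={name},length={length}>" for name, length in contigs
    ]
    out = []
    inserted = False
    for line in vcf_lines:
        # All '##' meta lines come before the '#CHROM' column header line,
        # so inserting right before it keeps the existing header intact.
        if line.startswith("#CHROM") and not inserted:
            out.extend(contig_lines)
            inserted = True
        out.append(line)
    if not inserted:
        raise ValueError("no #CHROM line found; is this really a VCF?")
    return out


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        '##source="Stacks v2.41"',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL",
        "scaffold_45\t10\t0\tA\tC\t0",
    ]
    for line in add_contig_lines(demo, [("scaffold_45", 120)]):
        print(line)
```

The point is that the existing ## lines stay where they are and only the new ##contig lines are added.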
In short, I need to be able to edit the IDs in the VCF file, and for that I need to add the contig info to its header. The reason I want the IDs is to filter SNPs by ID with vcftools. Does anyone know how to solve this issue?
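For clarity, the ID edit itself amounts to something like the following Python sketch, assuming a well-formed, tab-separated VCF. `set_ids` is a hypothetical helper written for this post, not part of bcftools or vcftools; it simply overwrites the third column with CHROM_POS and leaves the header lines untouched.

```python
def set_ids(vcf_lines):
    """Return the VCF lines with the ID column set to CHROM_POS.

    Assumes tab-separated data lines with at least three columns
    (CHROM, POS, ID); header lines starting with '#' are copied as-is.
    """
    out = []
    for line in vcf_lines:
        if not line or line.startswith("#"):
            out.append(line)
            continue
        fields = line.split("\t")
        # Column 1 is CHROM, column 2 is POS, column 3 is ID.
        fields[2] = f"{fields[0]}_{fields[1]}"
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL",
        "scaffold_45\t10\t0\tA\tC\t0",
    ]
    for line in set_ids(demo):
        print(line)
```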
Thanks in advance,
Angela
PS: I may try to include the header of my initial VCF file in header.txt and see what happens...