How to add contig names and lengths to the header of a VCF file?
23 months ago
mohsamir2016 ▴ 30

Dear all,

I am trying to add contig information to the header of a VCF file, because when I try to rename the chromosomes in the VCF using

bcftools annotate --rename-chrs Names.txt gallus_gallus.vcf -Ov -o gallus_gallusrenamed.vcf

it gives me the following error:

[W::vcf_parse] Contig '1' is not defined in the header. (Quick workaround: index the file with tabix.)
Encountered an error, cannot proceed. Please check the error output above.
If feeling adventurous, use the --force option. (At your own risk!)

When I checked the unique contig names in this VCF file using

bcftools query -f '%CHROM\n' gallus_gallus.vcf|uniq

I got the output shown in Image 1.

So I think I first need to add the contig lines to the VCF header.

Could anyone suggest some code for that?

Thanks
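
One way to picture what "adding the contig lines" means is the minimal sketch below. It assumes you have a FASTA index (.fai) for the reference, where column 1 is the contig name and column 2 its length, and that the names in the .fai match the CHROM values in the VCF. The file names and the toy data are hypothetical and only there to make the example self-contained; this is a plain awk approach, not the method of any particular tool.

```shell
# Toy inputs (hypothetical; replace with your own reference index and VCF).
printf '1\t1000\n2\t500\n' > toy.fa.fai
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t10\t.\tA\tG\t.\t.\t.\n' > toy.vcf

# Build one ##contig header line per reference sequence listed in the .fai.
awk -F'\t' '{ printf "##contig=<ID=%s,length=%s>\n", $1, $2 }' toy.fa.fai > contigs.txt

# Copy the VCF, inserting the contig lines just before the #CHROM line.
awk -v contigs="contigs.txt" '
    /^#CHROM/ { while ((getline line < contigs) > 0) print line }
    { print }
' toy.vcf > toy.with_contigs.vcf
```

The result keeps every original line and only gains the `##contig` entries, so it can be inspected with `head` before being passed to the rename step.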

SNP RNA-seq GATK • 3.1k views

Thanks, could you explain a bit more? I think the first option (i.e. UpdateVcfSequenceDictionary, https://gatk.broadinstitute.org/hc/en-us/articles/360042477832-UpdateVcfSequenceDictionary-Picard-) is more convenient.

Does this mean that my command would look like this:

gatk UpdateVcfSequenceDictionary -I oldfile.vcf -O newfile.vcf -SD referencegenome.dict

In that case, the input is the old VCF, the output is the new file, and -SD is the sequence dictionary (.dict) created from the reference genome FASTA?

Thanks
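
Before running a command like the one above, it may help to check that the contig names used in the VCF actually appear in the dictionary, since a mismatch (for example `1` versus `chr1`) would leave the records pointing at contigs the new header does not define. The sketch below assumes the usual .dict layout, with one `@SQ` line per sequence and the name in an `SN:` field. The file names and toy contents are hypothetical.

```shell
# Toy inputs (hypothetical; replace with your own .dict and VCF).
printf '@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:500\n' > toy.dict
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\n1\t10\nchr2\t20\n' > toy_names.vcf

# Contig names declared in the dictionary (the SN: field of each @SQ line).
awk -F'\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
}' toy.dict | sort -u > dict_names.txt

# Contig names actually used by the VCF records (first column, header skipped).
grep -v '^#' toy_names.vcf | cut -f1 | sort -u > vcf_names.txt

# Names used in the VCF but absent from the dictionary.
comm -23 vcf_names.txt dict_names.txt > missing_names.txt
```

An empty `missing_names.txt` suggests the names agree. Any name listed there would need to be renamed, or the dictionary rebuilt from a matching reference, before the header update.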
