Hi everyone, I am a college student trying to run a case-control GWAS on TCGA VCF data. I used gdc-client to download the TCGA VCFs and got many single-sample VCF files, so I am trying to merge them into one multi-sample VCF with bcftools. I ran into the problem below and searched online for a solution without success. Please help, or suggest how I could solve this. This is my command:
bcftools merge *.vcf.gz -Oz -o merge.vcf.gz
The error is: Duplicate sample names (NORMAL), use --force-samples to proceed anyway. So I ran:
bcftools merge --force-samples *.vcf.gz -Oz -o merge.vcf.gz
But most of the chromosome information seems to be missing from merge.vcf.gz. I then looked at the header line:
zgrep -n '^#CHROM' merge.vcf.gz |cat
The output looks like this:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR 2:NORMAL 2:TUMOR 3:NORMAL 3:TUMOR 4:NORMAL 4:TUMOR 5:NORMAL 5:TUMOR
Thanks in advance, and sorry for my poor English.
Thank you sincerely for your answer! It helps me a lot. I tried it just now, but there is an error, so I am going through it step by step.
The error seems to be that bcftools reheader -s ${file/.vcf.gz} does not load the VCF header. I will try to learn more and adjust it. The message suggests using bcftools view -h old.bcf > header.txt and bcftools reheader -h header.txt.
Thanks again, have a nice day.
Sorry, I didn't test the reheader command before. It doesn't accept a sample name but a file containing sample names, nor does it accept the common bcftools option -Oz, so I've edited the code and it now works flawlessly.
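For readers following along, here is a minimal sketch of what the renaming step could look like. It is not necessarily the exact code from the answer. It assumes every input VCF has the two sample columns NORMAL and TUMOR, that the file name without .vcf.gz is an acceptable unique prefix, and that the bcftools version in use accepts "old_name new_name" pairs in the file passed to reheader -s. The helper make_sample_map and the renamed/ and merged/ directories are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: give every NORMAL/TUMOR column a unique name, then merge.
set -eu

# Hypothetical helper: print "old_name new_name" pairs for one file prefix,
# one pair per line, as read by `bcftools reheader -s FILE`.
make_sample_map() {
    printf 'NORMAL %s_NORMAL\nTUMOR %s_TUMOR\n' "$1" "$1"
}

if command -v bcftools >/dev/null 2>&1; then
    mkdir -p renamed merged
    for file in *.vcf.gz; do
        [ -e "$file" ] || continue            # no matching files: skip
        prefix="${file%.vcf.gz}"
        make_sample_map "$prefix" > "renamed/$prefix.samples.txt"
        # reheader takes a FILE of sample names and has no -O option;
        # it rewrites only the header and keeps the input compression.
        bcftools reheader -s "renamed/$prefix.samples.txt" \
            -o "renamed/$file" "$file"
        bcftools index -t "renamed/$file"     # merge needs indexed inputs
    done
    set -- renamed/*.vcf.gz
    if [ "$#" -gt 1 ]; then                   # merge expects 2+ files
        bcftools merge -Oz -o merged/merge.vcf.gz "$@"
    fi
fi
```

Afterwards, bcftools query -l merged/merge.vcf.gz should list one uniquely named NORMAL/TUMOR pair per input file instead of the 2:NORMAL, 3:NORMAL names produced by --force-samples.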