Problem merging TCGA VCF files with bcftools merge
4.2 years ago
l66081129 • 0

Hi everyone. I am a college student trying to do a case-control GWAS with TCGA VCF data. I used gdc-client to download the TCGA VCFs and got many single-sample VCF files, so I am trying to merge them into one multi-sample VCF with bcftools. I ran into the problem below and searched online for solutions without success. Any help or ideas on how to solve this would be appreciated. This is my command:

bcftools merge  *.vcf.gz -Oz -o merge.vcf.gz

The error is: Duplicate sample names (NORMAL), use --force-samples to proceed anyway. So I ran:

bcftools merge --force-samples *.vcf.gz -Oz -o merge.vcf.gz

But I lost most of the information on the #CHROM line of merge.vcf.gz. I looked at the header with:

zgrep -n '^#CHROM'  merge.vcf.gz |cat

The output looks like:

CHROM   POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  NORMAL  TUMOR   2:NORMAL    2:TUMOR 3:NORMAL    3:TUMOR 4:NORMAL    4:TUMOR 5:NORMAL    5:TUMOR
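
The `2:NORMAL`, `3:TUMOR` names come from `--force-samples`: it resolves duplicate sample names by prepending the index of the input file, so the link back to the original file or patient is lost. A minimal sketch for checking which sample names a file carries before merging, using only standard tools (the helper name `list_samples` is made up for illustration; `bcftools query -l file.vcf.gz` reports the same list):

```shell
# list_samples FILE: print the sample columns (10th onward) of the #CHROM line.
# Hypothetical helper; accepts plain or gzip/bgzip-compressed VCF.
list_samples() {
  gzip -cdf -- "$1" | awk -F'\t' '
    /^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Demo on a tiny hand-made header (made-up content, for illustration only).
demo=$(mktemp)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tTUMOR\n' > "$demo"
list_samples "$demo"
rm -f "$demo"
```

If every input file prints the same names, as here, `bcftools merge` cannot tell the samples apart without renaming them first.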

Thanks in advance, and sorry for my poor English.

SNP merge • 2.0k views
4.2 years ago

If your vcf samples are all internally named "NORMAL" and "TUMOR", my first suggestion would be to rename them before merging them:

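for file in *.vcf.gz; do
 case "$file" in *renamed*|*merge*) continue ;; esac   # skip outputs of earlier runs
 prefix="${file%.vcf.gz}"
 # reheader -s takes a FILE of names; "old new" pairs rename each sample (assumes NORMAL and TUMOR in every file)
 bcftools reheader -s <(printf 'NORMAL %s_NORMAL\nTUMOR %s_TUMOR\n' "$prefix" "$prefix") -o "$prefix.renamed.vcf.gz" "$file"
 tabix -fp vcf "$prefix.renamed.vcf.gz"
done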

After that, merging them should no longer trigger the duplicate-name error:

bcftools merge -Oz -o merge.vcf.gz *.renamed.vcf.gz

You should see all the correct sample names on the last header line:

zgrep '^#CHROM' merge.vcf.gz
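
One way to confirm that no `--force-samples`-style names remain is to scan the sample list for repeats or for a numeric `N:` prefix. A sketch, assuming the names arrive one per line (for example from `bcftools query -l merge.vcf.gz`); `check_sample_names` is a hypothetical helper, not a bcftools feature:

```shell
# check_sample_names: read sample names on stdin, one per line.
# Reports repeated names and names starting with a file-index prefix like "2:",
# and returns a non-zero status if any were found.
check_sample_names() {
  awk '
    seen[$0]++ { print "duplicate: " $0; bad = 1 }
    /^[0-9]+:/ { print "index-prefixed: " $0; bad = 1 }
    END        { exit (bad ? 1 : 0) }'
}

# Intended use (requires bcftools):
#   bcftools query -l merge.vcf.gz | check_sample_names && echo "sample names look unique"
```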

Thank you sincerely for your answer, it helps me a lot! I've just tried it but got an error, so I went through it step by step:

for file in $(ls *.vcf.gz |grep -v rename |grep -v merge);do bcftools reheader -s ${file/.vcf.gz} -Oz -o ${file/.vcf.gz/.renamed.vcf.gz} $file ;done  2>>error

The error suggests that bcftools reheader -s ${file/.vcf.gz} does not load the VCF header. I will try to learn more and adjust it. The message suggests using bcftools view -h old.bcf > header.txt and then bcftools reheader -h header.txt.
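
The `-h` route mentioned here replaces the whole header rather than just the sample names. A rough sketch of that workflow, assuming each file contains only the samples shown in the question (`NORMAL` and `TUMOR`); the helper `prefix_header_samples` and the file names are made up for illustration, and only the `#CHROM` line is changed:

```shell
# prefix_header_samples PREFIX: read a VCF header on stdin and write it back
# with every sample column on the #CHROM line renamed to PREFIX_<old name>.
# All other header lines pass through unchanged.
prefix_header_samples() {
  awk -F'\t' -v OFS='\t' -v p="$1" '
    /^#CHROM/ { for (i = 10; i <= NF; i++) $i = p "_" $i }
    { print }'
}

# Intended use (requires bcftools; old.vcf.gz is a placeholder file name):
#   bcftools view -h old.vcf.gz | prefix_header_samples old > header.txt
#   bcftools reheader -h header.txt -o old.renamed.vcf.gz old.vcf.gz
```

Editing the header by hand like this is more error-prone than `reheader -s` with a names file, since the new header must still match the records in every other respect.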

Thanks again, have a nice day.


Sorry, I hadn't tested the reheader command before posting. Its -s option expects a file containing sample names rather than a sample name itself, and reheader does not accept the common bcftools option -Oz either, so I've edited the code in my answer and it now works.
