5.3 years ago
evelyn
Hello everyone,
I wanted to merge vcf files using
bcftools merge --file-list sample_list.txt -O v -o merge.vcf
But it gives an error for sample16.vcf.gz
Error: Duplicate sample names (sample16.vcf.gz), use --force-samples to proceed anyway.
I do not have any other VCF file with the same name in the same directory, but I still tried:
bcftools merge --force-samples -m none --file-list sample_list.txt -O v -o merge1.vcf
Now it gives a weird name to that particular sample in the output file:
15:sample15.vcf.gz
I am not sure whether it is extracting the right information from the sample16.vcf file. I compared this sample's column in the merged file with the individual VCF file, and they are not the same.
I would appreciate any help figuring out this problem with duplicate names. Thank you!
The sample name is not derived from the filename. The sample name is stored inside the VCF file, in the header line, and is the sample name used in the BAM file.
So I guess the 10th column in your VCF files has the same header in more than one file.
Check the output of:
bcftools query -l
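To run that check over every file in the merge list at once, something like the following sketch may help. It assumes bcftools is on the PATH and that sample_list.txt holds one VCF path per line; the function names list_samples and find_dup_samples are made up for this illustration and are not part of bcftools.

```shell
#!/usr/bin/env bash
# Sketch: find sample names that occur in more than one VCF of a merge list.
# list_samples and find_dup_samples are hypothetical helper names.

# Print "file<TAB>sample" for every sample in every file named in the list.
# Assumes bcftools is installed; `bcftools query -l FILE` lists FILE's samples.
list_samples() {
    local list="$1" f
    while IFS= read -r f || [ -n "$f" ]; do
        [ -n "$f" ] || continue          # skip empty lines
        bcftools query -l "$f" | awk -v f="$f" '{ print f "\t" $0 }'
    done < "$list"
}

# Read "file<TAB>sample" lines on stdin and print every sample name that
# appears on more than one line, followed by the files that carry it.
find_dup_samples() {
    awk -F '\t' '
        { n[$2]++; files[$2] = (files[$2] == "" ? $1 : files[$2] "," $1) }
        END { for (s in n) if (n[s] > 1) print s "\t" files[s] }
    ' | sort
}

# Usage (file name taken from the question):
#   list_samples sample_list.txt | find_dup_samples
```

Any line this prints names a sample that is shared by several files, which is what bcftools merge complains about. If that turns out to be the cause, renaming the sample in one of the files before merging (for example with bcftools reheader and its -s/--samples option) is one possible way out.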
does the same thing :-)
I was quite sure bcftools can do this, but I was too lazy to look up the man page :P
Even if it is not always true, as a rule of thumb: if there is something you cannot do with your VCF file using bcftools, then you probably don't need it (or at least you should rethink your problem).
Is sample_list.txt a list of unique file names? Does it contain exactly one column, with entries separated by newlines? Can you show us the output of head sample_list.txt?
Where does the .bam suffix even come from? The files in the file list should be VCF files, not BAM files.
Yes, sample_list.txt contains one column with unique file names:
These vcf.gz files contain only SNP information. There are no other types of variants. Thanks for pointing that out. I have edited my question.
Please try this command:
and paste the output here.