I want to use bcftools (Version: 1.0 (using htslib 1.0)) to edit a VCF file and then export an updated VCF file or BED file (BED is preferred).
There are several things that I want to do, and I found some relevant options in the documentation at https://samtools.github.io/bcftools/bcftools.html, but I don't know how to put them together. In particular, I do not even know how to load the original VCF file.
I also found a previous post (Extract subset of samples from multigenome vcf file) on a similar topic, but I still do not understand the command there:
bcftools view -Oz -S sample.txt $file > /get/inthis/dir/output_"${i##*/}"_.vcf.gz
1) Subset the samples using a text file. This text file lists the samples I want to keep in the VCF.
-S, --samples-file FILE
2) Keep only the SNPs
-v, --types snps
3) Keep only SNPs with MAF > 0.05
I did not find a relevant option for this.
4) Remove duplicate SNPs
-d, --rm-dup snps
or
-c, --collapse snps
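For step 3, one possibility is the -q/--min-af option of bcftools view with the :minor suffix, which filters on the minor allele frequency derived from allele counts. The sketch below is a hedged illustration, not a tested recipe for version 1.0: the function name filter_maf and the file names in the usage comment are placeholders, and it assumes the input has already been subset to the samples of interest. If a site at exactly 0.05 matters, check how the boundary is treated in your version.

```shell
# Sketch only: keep sites whose minor allele frequency passes a 0.05 threshold.
# filter_maf is a hypothetical helper name, not part of bcftools.
#   $1 = input VCF (already subset to the wanted samples)
#   $2 = output VCF (bgzip-compressed because of -Oz)
filter_maf() {
    bcftools view -q 0.05:minor -Oz -o "$2" "$1"
}

# Example call (file names are placeholders):
# filter_maf EUR.genotypes.vcf.gz EUR.maf05.vcf.gz
```

Running this as a separate step after sample subsetting is the cautious choice, because combining subsetting and frequency filters in a single bcftools view call can depend on the order in which the tool applies them.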
So what's the problem? Are you getting an error?
bcftools filter is the command you'll need to filter by MAF, assuming it's one of your INFO fields. Any particular reason you're using such an old version of the tools? The current version is 1.5.
You won't be able to output in BED format with bcftools; you'll need to use something like BEDOPS' vcf2bed tool to make that conversion.
I used this command to subset the European samples from the all-sample VCF (ALL.genotypes.vcf.gz) and export the VCF of the European samples (EUR.genotypes.vcf.gz).
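A minimal sketch of a subsetting command of this kind, not necessarily the exact one used here. The helper name subset_samples and the sample list EUR.samples.txt are assumptions for illustration; the two VCF names are the ones mentioned above. The input file is simply given as the last argument, and -v snps is included only because step 2 of the question asks for it.

```shell
# Sketch only: keep the samples listed in a text file (one sample ID per line)
# and restrict the output to SNP records.
# subset_samples is a hypothetical helper name, not part of bcftools.
#   $1 = sample list, $2 = input VCF, $3 = output VCF (bgzip-compressed)
subset_samples() {
    bcftools view -S "$1" -v snps -Oz -o "$3" "$2"
}

# Example call (EUR.samples.txt is an assumed file name):
# subset_samples EUR.samples.txt ALL.genotypes.vcf.gz EUR.genotypes.vcf.gz
```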
But I also want to remove duplicate snps. So I used the below command. However, bcftools did not return anything.
So to clarify, after the first command, you still have output, but lose it after the second? What is snps doing in that command? The --remove-duplicates parameter doesn't require you to specify the type of record, if I remember correctly. You could also use the vcfuniq command from vcflib to do this.
Thank you. You are right, I should not put snps after --remove-duplicates. The below command works.
I now use bcftools view to export a VCF file for the European population, then use bcftools norm to remove duplicates and export a second VCF file. But can I use bcftools view and bcftools norm in the same command? I do not actually need the first VCF file.
Yes, you can do both commands in one line with UNIX piping:
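A hedged sketch of such a pipe, under assumptions: the helper name subset_and_dedup and the file names in the usage comment are placeholders, and --remove-duplicates is used without an argument as discussed earlier in the thread. Newer releases document -d/--rm-dup with a record type (for example snps) as the way to do this, so check bcftools norm on your version.

```shell
# Sketch only: subset samples, then drop duplicate records, with no
# intermediate file. subset_and_dedup is a hypothetical helper name.
#   $1 = sample list, $2 = input VCF, $3 = output VCF (bgzip-compressed)
subset_and_dedup() {
    # -Ou streams uncompressed BCF between the two tools;
    # the trailing "-" tells bcftools norm to read from standard input.
    bcftools view -S "$1" -Ou "$2" |
        bcftools norm --remove-duplicates -Oz -o "$3" -
}

# Example call (EUR.samples.txt and the output name are assumed):
# subset_and_dedup EUR.samples.txt ALL.genotypes.vcf.gz EUR.dedup.vcf.gz
```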
That should work and remove the need for the intermediate file. Glad you got it working.
I see. Thank you so much!