I have a large VCF file named "common_known_variants.vcf", which contains all known human variants. I downloaded it from https://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/00-common_all.vcf.gz as common_known_variants.vcf.gz.
I'm trying to extract the known variants from chromosomes 1, 2, 3, 9, 22, and X only, and write them to a new VCF file named targeted_known_variants.vcf.
To do this, I created a text file named targeted_chromosomes.txt.
Each line holds the number of one of the chromosomes I need to extract. I then ran this command in a Jupyter notebook:
!vcftools --vcf common_known_variants.vcf --positions targeted_chromosomes.txt --recode --recode-INFO-all --out targeted_known_variant
Unfortunately, it created a new VCF file that contains only the header lines. It seems the command cannot read the text file. Any help with this? I would be grateful.
Welcome Phoebe. vcftools is very outdated now and is not under maintenance. Please use bcftools instead. Am I right in thinking you just want to subset common_known_variants.vcf to contain only chromosomes 1, 2, 3, 9, 22, and X?
Yes, that is exactly what I want to do.
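In that case, something along these lines may work. As far as I know, vcftools --positions expects a chromosome and a position on each line, which is probably why a file of bare chromosome names matched nothing. The sketch below uses bcftools view with --targets (-t), which streams the file and should not need an index. It assumes the chromosome names in your VCF have no "chr" prefix (check with a quick look at the first column). The function name subset_vcf and the awk fallback are my own additions for illustration, not part of bcftools.

```shell
# Sketch: keep the header plus records on chromosomes 1, 2, 3, 9, 22 and X.
# Assumption: CHROM values are written as "1", "X", ... (no "chr" prefix).
# subset_vcf is a hypothetical helper name, not a bcftools command.
subset_vcf() {
    in_vcf="$1"
    out_vcf="$2"
    if command -v bcftools >/dev/null 2>&1; then
        # -t/--targets filters while streaming; -O v writes uncompressed VCF
        bcftools view -t 1,2,3,9,22,X -O v -o "$out_vcf" "$in_vcf"
    else
        # Fallback without bcftools: header lines start with '#',
        # and column 1 of every record is the chromosome name
        awk -F'\t' '/^#/ || $1 ~ /^(1|2|3|9|22|X)$/' "$in_vcf" > "$out_vcf"
    fi
}
```

You would then call it as `subset_vcf common_known_variants.vcf targeted_known_variants.vcf`. If you would rather stay with vcftools, repeating --chr once per chromosome (--chr 1 --chr 2 and so on) in place of --positions should give a similar result.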