5.6 years ago
RNAseqer
Is there a command in VCFtools to pull a subset of SNPs out of a VCF file based on their rs IDs?
Input list:
cat snps.list
rs575563330
rs572898889
rs141149254
Now filter with BCFtools:
bcftools-1.9/bcftools view --include ID==@snps.list 1000Genomes.Norm.bcf | cut -f 1-4
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=20150218
##reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
##source=1000GenomesPhase3Pipeline
##contig=<ID=1,assembly=b37,length=249250621>
##contig=<ID=2,assembly=b37,length=243199373>
##contig=<ID=3,assembly=b37,length=198022430>
##contig=<ID=4,assembly=b37,length=191154276>
... ...
##bcftools_viewVersion=1.9+htslib-1.9
##bcftools_viewCommand=view --include ID==@snps.list 1000Genomes.Norm.bcf; Date=Tue Apr 9 00:04:38 2019
#CHROM POS ID REF
1 54490 rs141149254 G
1 54531 rs572898889 C
1 54566 rs575563330 G
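If BCFtools is not at hand and the VCF is plain text (uncompressed), the same ID lookup can be roughly sketched with awk. This is only an illustration and not part of the command above: the file names demo_snps.list, demo.vcf and demo.subset.vcf are hypothetical and are created here so the example is self-contained. It assumes tab-separated records, exactly one ID per record in column 3, and a non-empty ID list.

```shell
# Hypothetical demo files, created only so the sketch is self-contained.
printf 'rs575563330\nrs141149254\n' > demo_snps.list
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\n1\t54490\trs141149254\tG\n1\t54531\trs572898889\tC\n1\t54566\trs575563330\tG\n' > demo.vcf

# First file (NR==FNR): load the wanted IDs as keys of the array "keep".
# Second file: print header lines (starting with '#') and any record
# whose third column (the ID field) is one of those keys.
awk -F'\t' 'NR==FNR { keep[$1]; next } /^#/ || ($3 in keep)' \
    demo_snps.list demo.vcf > demo.subset.vcf

cat demo.subset.vcf
```

The match is an exact string comparison on the ID column, so records that carry several IDs separated by semicolons would be missed. For compressed or binary input (.vcf.gz, .bcf), the bcftools command is the safer route.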
Kevin
This worked beautifully, thanks!