Filter VCF with txt
7.2 years ago
leeandroid ▴ 130

Hi everyone.

I'm trying to filter a VCF with a text file that contains only the ID column, but I keep getting the following error message: [E::bcf_sr_regions_init] Could not parse the file list.txt, using the columns 1,2[,-1] Failed to read the targets: ^list.txt

I've been using the following command: bcftools view -T ^list.txt my.vcf > vcf_filtered

Is it possible to do such an operation, or should my text file have more fields? Keep in mind that my goal is to keep only the SNPs listed in the text file.

Thank you in advance.

snp vcf filtering
7.2 years ago
$ grep '^#' snps.vcf > snps.header.vcf
$ grep -v '^#' snps.vcf | grep -w -F -f list.txt > snps.filtered.noHeader.txt
$ cat snps.header.vcf snps.filtered.noHeader.txt > snps.filtered.withHeader.vcf

If you want to be fancypants and not waste time making intermediate files, this would be faster:

$ cat <(grep '^#' snps.vcf) <(grep -v '^#' snps.vcf | grep -w -F -f list.txt) > snps.filtered.withHeader.vcf

If you want to make it yet faster:

$ export LC_ALL=C
$ cat <(grep '^#' snps.vcf) <(grep -v '^#' snps.vcf | grep -w -F -f list.txt) > snps.filtered.withHeader.vcf

VCF files are unlikely to contain Unicode characters, so limiting the character set to ASCII with LC_ALL=C can make pattern matching with grep much faster.
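One caveat with the grep approach is that grep looks for each pattern anywhere on the line, not only in the ID column, so an ID that also appears inside another field (or inside a longer ID) can pull in extra records. A minimal sketch of a stricter variant with awk follows, assuming a tab-delimited VCF with the ID in column 3 and a list file with one ID per line. The toy files and the `workdir` variable are made up for illustration and are not part of the original question.

```shell
# Toy data so the sketch runs on its own; replace with your real files.
workdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$workdir/snps.vcf"
printf '1\t100\trs12\tA\tG\t.\t.\t.\n' >> "$workdir/snps.vcf"
printf '1\t200\trs123\tC\tT\t.\t.\t.\n' >> "$workdir/snps.vcf"
printf '1\t300\trs999\tG\tA\t.\t.\t.\n' >> "$workdir/snps.vcf"
printf 'rs12\nrs999\n' > "$workdir/list.txt"

# First file (NR==FNR): load the wanted IDs into an array.
# Second file: keep header lines, plus records whose third column (ID)
# is exactly one of the wanted IDs.
awk -F'\t' 'NR==FNR { keep[$1]; next } /^#/ || ($3 in keep)' \
    "$workdir/list.txt" "$workdir/snps.vcf" > "$workdir/snps.filtered.vcf"
```

Here rs12 and rs999 are kept while rs123 is dropped, because the comparison is on the whole ID field rather than a substring. If you would rather stay within bcftools, its filtering expressions document an `ID=@file` form (for example `bcftools view -i 'ID=@list.txt' my.vcf`), which is worth checking against your bcftools version. The `-T` option in the question expects tab-delimited CHROM and POS columns, which is consistent with the "using the columns 1,2" part of the error.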

Thank you, Alex. It worked!
