Subset a VCF based on a certain string within VEP/ClinVar annotation
2.6 years ago
Vanish007 ▴ 50

Hi all,

I am trying to subset my extremely large VCF file (around 11,000 samples) to the records whose annotation contains the string "Pathogenic". I tried the following:

bcftools view -i 'CSQ="Pathogenic"' -o PathOnlyClinVar.vcf Merged.vcf.gz

I expected this to work, but it returned only the header and no variant records. I mainly want to safely subset my VCF file to the records whose ClinVar annotation contains the string "Pathogenic".

Thanks!

bcftools vcf • 828 views
2.6 years ago

https://samtools.github.io/bcftools/bcftools.html#expressions

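From the bcftools documentation on expressions: the regex operators are "~" and its negation "!~", and expressions are case sensitive unless "/i" is added. Use "~" rather than "=" to match a substring of the CSQ field: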

bcftools view -i 'CSQ ~ "Pathogenic"' -o PathOnlyClinVar.vcf Merged.vcf.gz
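
To illustrate why the exact comparison returned nothing while the regex match works, here is a minimal Python sketch of the same idea. It does not call bcftools; the CSQ strings, gene names, and the helper `matching_keys` are hypothetical, and real VEP output has many more pipe-separated fields. It also shows how a case-insensitive search would widen the result to other ClinVar terms that contain "pathogenic".

```python
import re

# Hypothetical, shortened CSQ values keyed by a label for readability.
csq_values = {
    "path": "A|missense_variant|GENE1|Pathogenic",
    "likely": "T|missense_variant|GENE2|Likely_pathogenic",
    "conflict": "G|synonymous_variant|GENE3|Conflicting_interpretations_of_pathogenicity",
    "benign": "C|intron_variant|GENE4|Benign",
}


def matching_keys(pattern, flags=0):
    """Return the sorted labels whose CSQ value contains a regex match."""
    return sorted(k for k, v in csq_values.items() if re.search(pattern, v, flags))


# Exact comparison: the whole value must equal "Pathogenic", so nothing matches.
exact = sorted(k for k, v in csq_values.items() if v == "Pathogenic")

# Case-sensitive regex search: only the capitalised term matches.
case_sensitive = matching_keys("Pathogenic")

# Case-insensitive search: also picks up "Likely_pathogenic" and "...pathogenicity".
case_insensitive = matching_keys("pathogenic", re.IGNORECASE)

print(exact)             # []
print(case_sensitive)    # ['path']
print(case_insensitive)  # ['conflict', 'likely', 'path']
```

The practical point is that a case-sensitive "Pathogenic" pattern would likely skip "Likely_pathogenic" records, while adding "/i" in bcftools would probably also pull in "Conflicting_interpretations_of_pathogenicity", so it is worth checking which ClinVar terms you actually want before choosing the pattern.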

Thank you, that seemed to do the trick! I'll have to brush up on my regex :)
