Regular Expression help for (bcf/vcf)tools filter manipulation
7.5 years ago
morovatunc ▴ 560

Hi,

I annotated the flanking regions of SNV mutations with the vcftools fill-fs tool. For example:

20  10639675    .   C   G   .   .   Callers=broad,dkfz,muse,sanger;NumCallers=4;VAF=0.3571;repeat_masker=L1M5;t_alt_count=25;t_ref_count=45;Variant_Classification=Intron;FS=t[C/G]c

It added FS=t[C/G]c, where:

  • t is the 5' flanking base and c is the 3' flanking base.
  • C is the reference base and G is the alternate base.

I would like to subset the mutations that match either of the following patterns:

N[T/A]G or C[A/T]N

An answer using bcftools/vcftools would be ideal, but I am fine with any other solution. I know bcftools view -i has a filter option, but I couldn't figure out how to use it for this.

Thank you very much for your help,

Best regards,

Tunc.

vcftools bcftools • 2.4k views
7.5 years ago
grep -E '\[T/A]g|c\[A/T]|^#' input.vcf > output.vcf

-E enables extended regular expressions, which is what lets | work as alternation

\[ matches a literal [ instead of opening a bracket expression [...]

| means OR

^ anchors the match to the start of the line, so ^# keeps the VCF header lines
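As a quick sanity check of the command above, here is a minimal sketch that runs it on a tiny made-up VCF. The file names fs_demo.vcf and fs_demo.filtered.vcf and all four records are hypothetical, invented only for illustration. Note that grep is case-sensitive, so the lowercase g and c in the pattern rely on fill-fs writing the flanking bases in lowercase, as in the example record from the question.

```shell
# Write a small, made-up VCF with two header lines and four records.
printf '##fileformat=VCFv4.1\n' > fs_demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> fs_demo.vcf
printf '20\t100\t.\tC\tG\t.\t.\tNumCallers=4;FS=t[C/G]c\n' >> fs_demo.vcf
printf '20\t200\t.\tT\tA\t.\t.\tNumCallers=2;FS=a[T/A]g\n' >> fs_demo.vcf
printf '20\t300\t.\tA\tT\t.\t.\tNumCallers=3;FS=c[A/T]t\n' >> fs_demo.vcf
printf '20\t400\t.\tT\tA\t.\t.\tNumCallers=2;FS=a[T/A]c\n' >> fs_demo.vcf

# Keep header lines plus records matching N[T/A]G or C[A/T]N.
grep -E '\[T/A]g|c\[A/T]|^#' fs_demo.vcf > fs_demo.filtered.vcf

cat fs_demo.filtered.vcf
```

With this input, the header lines and the records at positions 200 and 300 are kept. The records at 100 (a C>G change) and 400 (T>A, but followed by c rather than g) are dropped.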

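If the INFO field contains similar constructs under keys other than FS, you may want to anchor the patterns on the key: use FS=.\[T/A]g in place of \[T/A]g, and FS=c\[A/T] in place of c\[A/T]. Here . matches the single 5' flanking base. (Note that ? would not work for this: in a regular expression it makes the preceding character optional rather than matching any character.)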
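One way to make the match stricter still is to test only the INFO column and require the FS key explicitly. The sketch below uses awk for that. It assumes a tab-delimited VCF with INFO in column 8 and exactly one flanking base on each side. The file names and the OTHER key are hypothetical, made up to show a false positive being avoided.

```shell
# Made-up VCF: the record at position 300 carries the pattern under a
# hypothetical key (OTHER), not under FS.
printf '##fileformat=VCFv4.1\n' > fs_anchor_demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> fs_anchor_demo.vcf
printf '20\t100\t.\tT\tA\t.\t.\tNumCallers=2;FS=a[T/A]g\n' >> fs_anchor_demo.vcf
printf '20\t200\t.\tA\tT\t.\t.\tNumCallers=3;FS=c[A/T]t\n' >> fs_anchor_demo.vcf
printf '20\t300\t.\tC\tT\t.\t.\tOTHER=c[A/T]g;FS=g[C/T]a\n' >> fs_anchor_demo.vcf
printf '20\t400\t.\tC\tG\t.\t.\tNumCallers=4;FS=t[C/G]c\n' >> fs_anchor_demo.vcf

# Keep header lines, plus records whose INFO column (field 8) has an FS
# entry of the form N[T/A]g or c[A/T]N. (^|;) and (;|$) make sure FS is a
# whole key=value entry rather than part of another entry.
awk -F'\t' '/^#/ || $8 ~ /(^|;)FS=(.\[T\/A\]g|c\[A\/T\].)(;|$)/' \
    fs_anchor_demo.vcf > fs_anchor_demo.filtered.vcf

cat fs_anchor_demo.filtered.vcf
```

On this input the records at positions 100 and 200 are kept. The record at 300 is dropped, although the unanchored grep pattern would have kept it because of the OTHER entry.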

Thank you very much.
