In our group, somatic mutation calling is usually done with bwa + GATK Mutect2. When a mutation is located in a short tandem repeat (STR) region, the output VCF file carries some extra information in the INFO fields RPA, RU and STR. They are described in the VCF header as follows:
##INFO=<ID=RPA,Number=.,Type=Integer,Description="Number of times tandem repeat unit is repeated, for each allele (including reference)">
##INFO=<ID=RU,Number=1,Type=String,Description="Tandem repeat unit (bases)">
##INFO=<ID=STR,Number=0,Type=Flag,Description="Variant is a short tandem repeat">
When we filter the calls with FilterMutectCalls, we also find that some mutations are marked with "str_contraction" in the FILTER column, which means Mutect2 rejects the mutation. In our data, all the mutations marked with str_contraction differ from the reference by exactly one repeat unit. In addition, these mutations are all located in a reference STR region of at least 8 bp.
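To make the observation above easier to check on other call sets, here is a minimal Python sketch of how the repeat difference and the reference tract length can be read from a single VCF record. It is only an illustration under a few assumptions: the function names are hypothetical, the example record is made up, RPA is assumed to list the reference allele first, and the reference tract length is assumed to be the reference repeat count times the length of RU.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flags such as STR map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def summarize_str_record(line):
    """Return STR details for one VCF data line, or None if it has no STR annotation."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, filt = cols[0], int(cols[1]), cols[6]
    info = parse_info(cols[7])
    if "STR" not in info or "RPA" not in info or "RU" not in info:
        return None
    # Assumption: the first RPA value belongs to the reference allele.
    rpa = [int(x) for x in info["RPA"].split(",")]
    ref_repeats, alt_repeats = rpa[0], rpa[1:]
    return {
        "chrom": chrom,
        "pos": pos,
        # FILTER may hold several ';'-separated filter names.
        "str_contraction": "str_contraction" in filt.split(";"),
        # Negative values are contractions, positive values are expansions.
        "repeat_diffs": [alt - ref_repeats for alt in alt_repeats],
        # Assumption: tract length = reference repeat count * repeat unit length.
        "ref_tract_bp": ref_repeats * len(info["RU"]),
    }


# Made-up record, for illustration only.
example = "chr1\t1000\t.\tCA\tC\t.\tstr_contraction\tDP=50;RPA=9,8;RU=A;STR"
print(summarize_str_record(example))
```

Running this over every data line of the filtered VCF and tabulating repeat_diffs and ref_tract_bp for the records where str_contraction is true would show whether the pattern holds for the whole call set.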
I would like to know whether it is reasonable to filter based on "str_contraction" alone, or whether there are other useful methods?
Thanks!