I have 10X data from which I would like to extract all of the SNPs at one specific site in a single gene, so that I can annotate the downstream data. I am having trouble getting the UMIs to appear in the final VCF output of my pipeline. I use SAMtools to subset the BAM to only the site of interest, then freebayes to generate the VCF output:
samtools view -b possorted_genome_bam.bam "3:103060270-103060270" > nras_snp.bam
freebayes -f genome.fa nras_snp.bam > nras.vcf
Does anyone know which additional flags I need to pass to these command-line tools to make the UMIs appear in the final VCF output?
Ideally, the UMIs would appear as a column in the VCF file, and freebayes would ignore duplicate reads.
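
To make the goal concrete, here is a minimal sketch of the per-UMI listing I am after. It assumes the Cell Ranger BAM stores the corrected cell barcode and corrected UMI in the `CB:Z:` and `UB:Z:` optional tags. The function name `extract_umis` is hypothetical and not part of SAMtools or freebayes. It only reads SAM text and counts reads per barcode/UMI pair, so it does not say which allele each read supports.

```shell
# Hypothetical helper: reads SAM records on stdin and prints
# "cell_barcode <TAB> UMI <TAB> read_count", one line per unique pair.
# Assumes 10x-style optional tags: CB:Z: (cell barcode) and UB:Z: (UMI).
extract_umis() {
  awk -F '\t' '
    {
      cb = "NA"; ub = "NA"
      # Optional SAM tags start at column 12.
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^CB:Z:/) cb = substr($i, 6)
        else if ($i ~ /^UB:Z:/) ub = substr($i, 6)
      }
      # Skip reads that carry no corrected UMI.
      if (ub != "NA") count[cb "\t" ub]++
    }
    END { for (k in count) print k "\t" count[k] }
  ' | sort
}

# Intended usage on the subset BAM from above (not run here):
#   samtools view nras_snp.bam | extract_umis > nras_umis.tsv
```

Collapsing reads to one line per barcode/UMI pair is the kind of duplicate handling I would want reflected in the variant calls.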