8.3 years ago
MAPK
I have a BAM file called contaminated_sample.bam that was created (for a dilution experiment) by mixing proportional amounts of reads from two original BAM files (Sample_A.bam and Sample_B.bam). I want to create a VCF file using the GATK genotyper. This requires me to change the SM tags in contaminated_sample.bam (SM:Sample_A and SM:Sample_B) to one unique sample name so that the VCF file will have only one sample (i.e. contaminated_sample). How can I do this using samtools?
Thanks. Another question: do I also need to change the tags below, or can I change only the SM tag that came from the two different BAM files?
RGID=4 \ RGLB=lib1 \ RGPL=illumina \ RGPU=unit1 \
These tags also differ between the two original BAM files. Can I change only the SM tag and not worry about the other tags when creating a VCF file?
If you run Picard's AddOrReplaceReadGroups on your final BAM, it will replace all the existing read group tags. You will then have only one read group, and the file is treated as a single sample instead of two, so you will get a single-sample VCF.
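As a rough sketch of what that call could look like: the read group values are the ones quoted in the question, and RGSM is set to the desired sample name. The output file name and the presence of a `picard` wrapper script on the PATH are assumptions (with a plain jar the call would start with `java -jar picard.jar` instead). The script only prints the command unless both the wrapper and the input file are found.

```shell
#!/bin/sh
# Sketch: collapse the two read groups of the mixed BAM into a single one
# so that downstream callers see one sample. Values are placeholders taken
# from the question; adjust them to your data.
set -eu

in_bam="contaminated_sample.bam"
out_bam="contaminated_sample.rg.bam"   # hypothetical output name
sample_name="contaminated_sample"

# Picard AddOrReplaceReadGroups arguments in the legacy KEY=VALUE syntax.
set -- AddOrReplaceReadGroups \
  "I=$in_bam" \
  "O=$out_bam" \
  RGID=4 \
  RGLB=lib1 \
  RGPL=illumina \
  RGPU=unit1 \
  "RGSM=$sample_name"

cmd="picard $*"

# Run only if a `picard` wrapper (assumed) and the input BAM exist;
# otherwise just show the command that would be run.
if command -v picard >/dev/null 2>&1 && [ -f "$in_bam" ]; then
  picard "$@"
else
  echo "$cmd"
fi
```

Afterwards, `samtools view -H contaminated_sample.rg.bam` should list a single @RG line carrying SM:contaminated_sample, which is what makes the genotyper emit one sample column.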