I met some questions when using GATK -UnifiedGenoTyper to call SNPs. The code and Error message like this:
java -jar /WPS/BP/huboqiang/software/GenomeAnalysisTK-1.6-13-g91f02df/GenomeAnalysisTK.jar -T UnifiedGenotyper -I s_6_HW.sorted.bam -R ../hg19.2.fa -o my.vcf --output_mode EMIT_ALL_SITES
##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/WPS/BP/huboqiang/data/BamResult/headerRG.sorted.bam} is malformed: Read HWUSI-EAS535_0025:6:10:13898:5547#0 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK. Please use http://www.broadinstitute.org/gsa/wiki/index.php/ReplaceReadGroups to fix this problem
However, this website has been closed so I do not know how to deal with that problem. The Reads mentioned above is:
HWUSI-EAS535_0025:6:1:15164:14102#0 99 chr1 12096 0 76M = 12243 223 TAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATAGG hhhfhhhhhhfhhghffhhgfhhhhhghhehhhhhghhhhgghhghhhhgahghhhhhhfhhghghhhhfehhQd] XT:A:R NM:i:0 SM:i:0 AM:i:0 X0:i:7 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:76
HWUSI-EAS535_0025:6:1:15164:14102#0 147 chr1 12243 0 76M = 12096 -223 GCTCATCTCCTTGGCTGTGATACGTGGCCGGCCCTCGCTCCAGCAGCTGGACCCCTACCTGCCGTCTGCTGCCATC Qa]aaW^X^\[Rb[_W]dcddb]deaadgggaf_a]da_Wffcc`fdfdf]dfdffefffcdcgfggagggegggg XT:A:R NM:i:0 SM:i:0 AM:i:0 X0:i:3 X1:i:5 XM:i:0 XO:i:0 XG:i:0 MD:Z:76
And this is a part of the head of the bam file:
@HD VN:1.0
@SQ SN:chr1 LN:249250621
@RG ID:s_6.bam2 PL:illumina PU:HWUSI-EAS535_0025 SM:s_6.bam
@PG ID:bwa PN:bwa VN:0.6.1-r104
As it mentioned "Read HWUSI-EAS535_0025:6:10:13898:5547#0 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK.", I do not know what's wrong with the read group @RG or with the BAM header.
Thanks.
You could use Picard AddOrReplaceReadGroups as follows:
I'm sorry Matt I had a hard time logging in this website these days. Thanks, it really works!