Hello,
I am using hisat2 to generate SAM files with the following shell command:
./hisat2-2.2.1-Linux_x86_64/hisat2-2.2.1/hisat2 -p 2 --dta -x reference.fasta -1 ./Data/D1.fastq.gz -2 ./Data/D2.fastq.gz -S D.sam
Then I generate the sorted BAM file with samtools sort:
./samtools-1.15.1.tar/samtools-1.15.1/samtools sort -o ./bamOutput/D.bam D.sam
Finally, when I try to generate variant calls using the command below:
./gatk/gatk HaplotypeCaller -R reference.fasta -I ./bamOutput/D.bam -O ./Output/D.vcf --minimum-mapping-quality 10 --ploidy 2 -ERC GVCF
It gives me this error:
A USER ERROR has occurred: Argument emit-ref-confidence has a bad value: Can only be used in single sample mode currently. Use the --sample-name argument to run on a single sample out of a multi-sample BAM file.
I ran samtools view -H on the BAM file, and the header looks like this:
@HD VN:1.0 SO:coordinate
@SQ SN:NC_045512.2 LN:29903
@PG ID:hisat2 PN:hisat2 VN:2.2.1 CL:"/mnt/d/Tools/hisat2-2.2.1-Linux_x86_64/hisat2-2.2.1/hisat2-align-s --wrapper basic-0 -p 2 --dta -x reference.fasta -S D.sam --read-lengths 60 -1 /tmp/23220.inpipe1 -2 /tmp/23220.inpipe2"
@PG ID:samtools PN:samtools PP:hisat2 VN: CL:./samtools-1.15.1.tar/samtools-1.15.1/samtools sort -o ./bamOutput/D.bam D.sam
@PG ID:samtools.1 PN:samtools PP:samtools VN: CL:./samtools-1.15.1.tar/samtools-1.15.1/samtools view -H ./AllData/hisat2/D.bam
I found that there is no @RG header line, and GATK needs this line to call variants. I have two questions. First, how can I add this line to my SAM and BAM files?
Second, what are the valid tag:value pairs for this header line?
I really appreciate any recommendation.
If you don't want to rerun the alignment, you can use Picard's AddOrReplaceReadGroups. However, GATK recommends adding read groups at the alignment step, as Pierre suggested.
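For reference, here is a minimal sketch of both routes, reusing the paths from the question. The read-group values (D as ID and sample name, lib1, unit1, ILLUMINA) and the output name D.rg.bam are placeholders made up for illustration, so substitute your own. It assumes the GATK4 wrapper, which bundles the Picard tools, and HISAT2's --rg-id and --rg options. On the second question: the SAM specification defines the @RG fields (ID, SM, LB, PL, PU, among others), and as far as I know GATK mainly needs ID, SM, and PL. Use one route or the other, not both.

```shell
# Placeholder read-group values (assumptions): replace with your real details.
RG_ID="D"         # read group ID
RG_SM="D"         # sample name; HaplotypeCaller identifies the sample by this tag
RG_LB="lib1"      # library (placeholder)
RG_PL="ILLUMINA"  # sequencing platform (assumed)
RG_PU="unit1"     # platform unit, e.g. flowcell and lane (placeholder)

# The tab-separated @RG header line these values correspond to
# (the field order written by each tool may differ).
RG_LINE=$(printf '@RG\tID:%s\tSM:%s\tLB:%s\tPL:%s\tPU:%s' \
  "$RG_ID" "$RG_SM" "$RG_LB" "$RG_PL" "$RG_PU")
printf '%s\n' "$RG_LINE"

# Route 1: add read groups to the existing sorted BAM (no realignment).
if [ -f ./bamOutput/D.bam ]; then
  ./gatk/gatk AddOrReplaceReadGroups \
    -I ./bamOutput/D.bam -O ./bamOutput/D.rg.bam \
    --RGID "$RG_ID" --RGSM "$RG_SM" --RGLB "$RG_LB" \
    --RGPL "$RG_PL" --RGPU "$RG_PU"
fi

# Route 2: set the read group during alignment instead.
if [ -f ./Data/D1.fastq.gz ] && [ -f ./Data/D2.fastq.gz ]; then
  ./hisat2-2.2.1-Linux_x86_64/hisat2-2.2.1/hisat2 -p 2 --dta -x reference.fasta \
    --rg-id "$RG_ID" --rg "SM:$RG_SM" --rg "LB:$RG_LB" \
    --rg "PL:$RG_PL" --rg "PU:$RG_PU" \
    -1 ./Data/D1.fastq.gz -2 ./Data/D2.fastq.gz -S D.sam
fi
```

Afterwards, samtools view -H on the resulting BAM should show an @RG line alongside the @HD, @SQ, and @PG lines.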
The output should be named ./Output/D.g.vcf if you're using GVCF mode.
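A hedged sketch of what the call might look like with that output name, reusing the paths from the question. It assumes the BAM given in BAM already carries an @RG header line and that reference.fasta has its .fai index and .dict dictionary; point BAM at whichever file that is in your setup.

```shell
BAM="./bamOutput/D.bam"    # must already contain an @RG header line
GVCF="./Output/D.g.vcf"    # .g.vcf suffix marks the file as a GVCF

if [ -f "$BAM" ]; then
  # HaplotypeCaller expects an indexed BAM.
  ./samtools-1.15.1.tar/samtools-1.15.1/samtools index "$BAM"
  # Assumes reference.fasta already has its .fai index and .dict dictionary.
  # Ploidy defaults to 2, so that flag is omitted here.
  ./gatk/gatk HaplotypeCaller -R reference.fasta -I "$BAM" -O "$GVCF" \
    --minimum-mapping-quality 10 -ERC GVCF
fi
```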