Adding @RG header line to bam and sam files in hisat2
1
0
Entering edit mode
2.3 years ago
Nemo • 0

Hello,

I am using hisat2 for generating sam files using the below command in shell:

./hisat2-2.2.1-Linux_x86_64/hisat2-2.2.1/hisat2 -p 2 --dta -x reference.fasta -1 ./Data/D1.fastq.gz -2 ./Data/D2.fastq.gz -S D.sam

Then using samtools sort command, I generate the bam file:

./samtools-1.15.1.tar/samtools-1.15.1/samtools sort -o /bamOutput/D.bam D.sam

Finally when I want to generate variants calls using the below command:

./gatk/gatk HaplotypeCaller -R reference.fasta -I ./bamOutput/D.bam -O ./Output/D.vcf --minimum-mapping-quality 10 --ploidy 2 -ERC GVCF

It gives me the error that:

A USER ERROR has occurred: Argument emit-ref-confidence has a bad value: Can only be used in single sample mode currently. Use the --sample-name argument to run on a single sample out of a multi-sample BAM file.

I tried to run samtools view command and the output is like below:

@HD     VN:1.0  SO:coordinate
@SQ     SN:NC_045512.2  LN:29903
@PG     ID:hisat2       PN:hisat2       VN:2.2.1        CL:"/mnt/d/Tools/hisat2-2.2.1-Linux_x86_64/hisat2-2.2.1/hisat2-align-s --wrapper basic-0 -p 2 --dta -x reference.fasta -S D.sam --read-lengths 60 -1 /tmp/23220.inpipe1 -2 /tmp/23220.inpipe2"
@PG     ID:samtools     PN:samtools     PP:hisat2       VN:     CL:./samtools-1.15.1.tar/samtools-1.15.1/samtools sort -o ./bamOutput/D.bam D.sam
@PG     ID:samtools.1   PN:samtools     PP:samtools     VN:     CL:./samtools-1.15.1.tar/samtools-1.15.1/samtools view -H ./AllData/hisat2/D.bam

I found that there is no @RG header line and gatk needs this line to extract variants. I have two questions, first how I can add this line to my sam and bam files?

and second, what are the valid tag:value pairs for this header line?

I really appreciate any recommendation.

bam hisat2 header alignment • 1.9k views
ADD COMMENT
1
Entering edit mode

If you don't want to rerun the alignment, you can use picard AddOrReplaceReadGroups. However, GATK recommends adding read groups at the alignment step, as Pierre suggested.

ADD REPLY
0
Entering edit mode
 -O ./Output/D.vcf  -ERC GVCF

output should be name ./Output/D.g.vcf if you're using a GVCF mode.

ADD REPLY
2
Entering edit mode
2.3 years ago

specify the read group in histat: http://daehwankimlab.github.io/hisat2/manual/

--rg-id <text>
Set the read group ID to <text>. This causes the SAM @RG header line to be printed, with <text> as the value associated with the ID: tag. It also causes the RG:Z: extra field to be attached to each SAM output record, with value set to <text>.

--rg <text>
Add <text> (usually of the form TAG:VAL, e.g. SM:Pool1) as a field on the @RG header line. Note: in order for the @RG line to appear, --rg-id must also be specified. This is because the ID tag is required by the SAM Spec. Specify --rg multiple times to set multiple fields. See the SAM Spec for details about what fields are legal.
ADD COMMENT
0
Entering edit mode

Thanks @Pierre for your reply. I tried to run below command as suggested by histat2 manual:

./hisat2-2.2.1-Linux_x86_64/hisat2-2.2.1/hisat2 --no-spliced-alignment --no-unal --rg-id ID:D10_S24_L001 --rg SM:D10_S24_L001 --rg PL:ILLUMINA --rg PM:HISEQ -p 2 -x reference.fasta -1 ./Data/D1.fastq.gz -2 ./Data/D2.fastq.gz -S D10.sam

It does not throw any error but still does not generate sam file. before using --rg-id and --rg tags it would be able to generate sam files but after editing like this it does not give any error. The logs in both cases are something similar to below:

33583125 reads; of these: 33583125 (100.00%) were paired; of these: 33565755 (99.95%) aligned concordantly 0 times 17370 (0.05%) aligned concordantly exactly 1 time

0 (0.00%) aligned concordantly >1 times
----
33565755 pairs aligned concordantly 0 times; of these:
  4324 (0.01%) aligned discordantly 1 time
----
33561431 pairs aligned 0 times concordantly or discordantly; of these:
  67122862 mates make up the pairs; of these:
    67107582 (99.98%) aligned 0 times
    15280 (0.02%) aligned exactly 1 time
    0 (0.00%) aligned >1 times

0.09% overall alignment rate

I am not sure what is wrong in my command as it does not produce any error nor any sam output file.

ADD REPLY
0
Entering edit mode

still does not generate sam file.

why "still does not generate sam file" ? You had a BAM file in your question has HaplotypeCaller was able to read it and you were able to extract the header with samtools.

ADD REPLY
0
Entering edit mode

I have created a new post in Not producing sam file in hisat2 to not make anybody confused. I would appreciate if you can guide me there.

ADD REPLY

Login before adding your answer.

Traffic: 1782 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6