Entering edit mode
6.5 years ago
marongiu.luigi
▴
730
Dear all,
I have aligned the same input fastq files with both BWA and HISAT2 with the commands, respectively:
bwa mem -t 10 <ref> <file_1.fq.gz> <file_2.fq.gz> -o <bwa_aln.sam>
hisat2 -p 10 -q -x <ref> -1 <file_1.fq.gz> -2 <file_2.fq.gz> -S <hst_aln.sam>
then converted to BAM with:
samtools view -Sb <xxx_aln.sam> > <xxx_aln.bam>
I then compared the header and counted the number of reads in each BAM file with:
$ samtools view -H <bwa_aln.bam>
@SQ SN:21 LN:46709983
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem -t 10 ./ref/GRCh38-21.fa 501N-1_1.fq.gz 501N-1_2.fq.gz -o aln/501N_bwa.sam
$ samtools view -H <hst_aln.bam>
@HD VN:1.0 SO:unsorted
@SQ SN:21 LN:46709983
@PG ID:hisat2 PN:hisat2 VN:2.1.0 CL:"/home/gigiux/src/hisat2/hisat2-align-s --wrapper basic-0 -p 10 -q -x ./ref/GRCh38-21.fa -S aln/501N_hst.sam -1 /tmp/9196.inpipe1 -2 /tmp/9196.inpipe2"
and
$ samtools view -c 501N_bwa.bam
379141213
$ samtools view -c 501N_hst.bam
384279994
This is unexpected: shouldn't both files have the same number of reads? Any reasons why BWA shows 5 138 781 reads less than HISAT2?
PS: also BWA's header misses the HD flag...
Thank you
In addition to what Wouter suggested, it's good to understand the HISAT alignment summary. Check my earlier post here
thank you, this was also interesting