Attempting to generate a bam.bai file but the output is not readable
3.0 years ago
K.patel5 ▴ 150

Hi,

I am new to exome sequencing and have been following tutorials on the subject. I am stuck at the samtools index stage because the output file is not in a human-readable format, and I believe I am making a misstep somewhere. Below are my code and the head of the resulting .bam.bai file.

For simplicity we will assume all files are in the same folder.

#concatenate lanes
cat L001_R1_001.fastq.gz L002_R1_001.fastq.gz L003_R1_001.fastq.gz L004_R1_001.fastq.gz > subject_R1.fastq.gz  
cat L001_R2_001.fastq.gz L002_R2_001.fastq.gz L003_R2_001.fastq.gz L004_R2_001.fastq.gz > subject_R2.fastq.gz  

#index genome
bwa index GCA_000001405.15_GRCh38_no_alt_analysis_set.fna

#create bam file. Sam file creation is piped to save space.
bwa mem -t 8 GCA_000001405.15_GRCh38_no_alt_analysis_set.fna subject_R1.fastq.gz   subject_R2.fastq.gz -M \
-R "@RG\tID:FlowCell.subject1\tSM:subject1\tPL:illumina\tLB:mito.subject1"   | \
samtools sort -O bam -o subject1_bwa_output.bam -

#create bam.bai file
samtools index -b subject1_bwa_output.bam

#check the bam.bai file
samtools flagstat subject1_bwa_output.bam.bai > subject1_stat_bwa_output.txt 
samtools idxstats subject1_bwa_output.bam.bai > subject1_idxstat_bwa_output.txt

During the checking phase I am getting the following errors.

[E::hts_hopen] Failed to open file subject1_bwa_output.bam.bai
[E::hts_open_format] Failed to open file "subject1_bwa_output.bam.bai" : Exec format error
samtools flagstat: Cannot open input file "subject1_bwa_output.bam.bai": Exec format error
[E::hts_hopen] Failed to open file subject1_bwa_output.bam.bai
[E::hts_open_format] Failed to open file "subject1_bwa_output.bam.bai" : Exec format error
samtools idxstats: failed to open "subject1_bwa_output.bam.bai": Exec format error

Here is a screen-shot of the output from the bam.bai file. I feel like this is not correct.


3.0 years ago
zorbax ▴ 650

You should run flagstat on the BAM file itself (which you have already indexed), not on the .bai index: samtools flagstat subject1_bwa_output.bam. The same applies to idxstats.
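A minimal sketch of the corrected check step, assuming the file names from the question. The bam_for helper is hypothetical (it is not part of samtools) and only strips a trailing .bai so that the stats commands always receive the BAM; the guard simply skips the samtools calls when samtools or the BAM is not present.

```shell
#!/bin/sh
# Hypothetical helper: map an index path back to the BAM it belongs to.
bam_for() {
    case "$1" in
        *.bam.bai) printf '%s\n' "${1%.bai}" ;;
        *)         printf '%s\n' "$1" ;;
    esac
}

bam=$(bam_for subject1_bwa_output.bam.bai)

# Run the summaries on the BAM, never on the .bai.
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
    samtools flagstat "$bam" > subject1_stat_bwa_output.txt
    samtools idxstats "$bam" > subject1_idxstat_bwa_output.txt
fi
```

idxstats still relies on the index, but it finds subject1_bwa_output.bam.bai on its own when it sits next to the BAM.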


Thanks, that worked.


FYI, the .bai file is an index; you never need to interact with it directly. If a tool needs it, it will look for it automatically in the same folder as the main BAM file. It does not contain human-readable content that could be of interest.
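To illustrate, here is a rough sketch of how the index gets used indirectly, assuming the file names from the question. The region name chrM is an assumption about the contig naming in the reference; substitute any contig listed in the BAM header.

```shell
#!/bin/sh
bam=subject1_bwa_output.bam
bai="$bam.bai"   # where samtools index -b writes the index by default

if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
    # Build the index only if it is missing.
    [ -f "$bai" ] || samtools index -b "$bam"
    # A region query needs the index; samtools locates the .bai next to the BAM.
    # chrM is an assumed contig name.
    samtools view -c "$bam" chrM
fi
```

Note that only the BAM is ever named on the command line; the .bai is never passed as an input.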


This makes sense. I had assumed the .bam.bai file was a summarized copy of the .bam file. Thank you!
