Hi friends,
I am attempting to sort the BAM files that I obtained from my bowtie SAM files. It seems I am not indexing them appropriately, judging by this error I receive after creating my BAM file:
random alignment retrieval only works for indexed BAM or CRAM files.
I understand I am supposed to index the files before sorting them.
#creating the appropriate files
samtools view -Sb sample.sam.pair > sample.pair
samtools view -bt ~/bigdata/refgenome/genome.fa.fai - - | samtools sort sample.pair -o sample.pair.bam
samtools view -Sb sample.sam.single > sample.single
samtools view -bt ~/bigdata/refgenome/genome.fa.fai - - | samtools sort sample.single -o sample.single.bam
#merge
samtools merge sample.all.bam sample.pair.bam sample.single.bam -@ 2
rm sample.pair sample.single
#index the final bam
samtools index sample.all.bam
Any help would be appreciated.
With the latest samtools, that command should be: samtools sort -o sorted.bam initial.bam
Oh they changed the syntax to be explicit!? Finally :D
Would this take into account my .fai file?
You are still over-thinking this. The FASTA and BAM indexes are two separate and independent things: you don't need one to have the other.
Indexing allows for efficient data access and retrieval. The fasta index (.fai) is used to access and retrieve subsets of the fasta sequence, and the bam index (.bai) to access and retrieve subsets of the bam file.
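To make the separation concrete, here is a minimal sketch of the two indexing steps as small shell helpers. The function names (index_reference, sam_to_indexed_bam) and the file names are made up for illustration, and it assumes a reasonably recent samtools (1.3 or later) on your PATH. Note that the BAM side never touches the .fai file.

```shell
# Hypothetical helper: index a reference FASTA.
# Needs only the FASTA itself; writes "<fasta>.fai" next to it.
index_reference() {
    samtools faidx "$1"
}

# Hypothetical helper: SAM -> coordinate-sorted BAM -> BAM index.
# No .fai is involved at any point.
sam_to_indexed_bam() {
    local sam="$1" out="$2"
    # Convert SAM to BAM on stdout, then sort by coordinate into "$out".
    samtools view -b "$sam" | samtools sort -o "$out" - || return 1
    # Index the *sorted* BAM; by default this writes "<out>.bai".
    samtools index "$out"
}
```

With these defined, something like sam_to_indexed_bam sample.sam.pair sample.pair.bam would replace the view/sort lines in the question. The order matters: sort first, then index, because samtools index expects a coordinate-sorted BAM.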
Oh my goodness.... Thank you both for explaining this to me. I really appreciate it! I only keep talking about my .fai file because my PI left me some code that I could base it off of and it has it on there but I couldn't understand how it was implemented. Thank you.
You're very welcome - if you run into any more complications please don't hesitate to open another question :)