I am trying to index sorted BAM files using samtools index. I created the sorted BAM files by downsampling from higher-depth sorted BAM files using samtools depth and samtools view. samtools index works on the original high-depth sorted BAM files, but it gives an error for the downsampled sorted BAM file:
samtools index: example is in a format that cannot be usefully indexed
Are you sure that you've output a BAM file from samtools view? You can confirm with samtools quickcheck [bam]; see here. Additionally, sometimes people forget to include the -h flag in view to include the header - just a thought.
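A rough way to sanity-check this from the shell, in case it helps: BAM files are BGZF-compressed, so they start with the bytes 1f 8b 08 04, whereas SAM text usually starts with an @ header line. The sketch below only looks at those first four bytes. The function name looks_like_bgzf is made up for illustration and is not part of samtools, and passing this check does not prove the file is a valid BAM; samtools quickcheck remains the proper test.

```bash
# Hypothetical helper (not a samtools command): succeeds if the file
# begins with the BGZF magic bytes 1f 8b 08 04, fails otherwise.
# This checks the compression container only, not BAM validity.
looks_like_bgzf() {
    # Read the first 4 bytes and print them as one hex string.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b0804" ]
}

# Example use (file name is a placeholder):
#   looks_like_bgzf downsampled.bam && samtools quickcheck -v downsampled.bam
```

If the function fails on a file with a .bam extension, the file is most likely SAM text (or something else) that was simply named .bam.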
Great! Is there a possibility of running out of space in a temporary directory, or are you executing this on a network filesystem? See here for one possible avenue of investigation. Sorry it's getting a tad technical, but indexing is usually so dialed-in that it's strange if it doesn't work, barring BAM structure-related issues.
Thank you, I actually checked this post before posting my question. I am using a cluster and I believe space is not an issue for this job. I checked all the versions and everything is updated.
What about using a shared filesystem for, e.g., your home directory? Any NFSs mounted that could be indicative of a similar issue to the Github one above?
It would be good to see the commands you are using to do the downsampling, along with an example of the file they generate.
I used these commands to downsample:
BAM is a compressed binary file; SAM is the same information in human-readable text form. When you run samtools view with only -h, the output is written as SAM text (BAM output needs -b), so you are effectively decompressing your BAM. You need to compress it again, like this:
It worked. Thank you!
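For anyone landing here later: the exact commands from this thread are not shown, so the following is only a sketch of what the fix could look like, not the original poster's pipeline. It assumes the downsampling is done with the -s option of samtools view (integer part is the seed, fractional part is the fraction of reads to keep) and that the input is already coordinate-sorted; filtering with view keeps the existing read order. The function name downsample_and_index and its arguments are made up for illustration.

```bash
# Hypothetical wrapper: downsample a sorted BAM, keep the output as BAM,
# verify it, then index it. Requires samtools on PATH.
#   $1 = input sorted BAM
#   $2 = output BAM
#   $3 = SEED.FRACTION for -s, e.g. 42.1 keeps about 10% with seed 42
downsample_and_index() {
    in_bam=$1
    out_bam=$2
    seed_frac=$3
    # -b writes BAM (compressed) instead of the default SAM text.
    samtools view -b -s "$seed_frac" -o "$out_bam" "$in_bam" &&
    # Fail early if the output is truncated or not a recognisable BAM.
    samtools quickcheck "$out_bam" &&
    samtools index "$out_bam"
}

# Example use (file names are placeholders):
#   downsample_and_index highdepth.sorted.bam downsampled.sorted.bam 42.1
```

The key part is -b: without it, the output is SAM text regardless of the file extension, and samtools index will refuse it.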
Given the information provided here, should this discussion be moved to an answer and accepted?