Hi all! I'm having a problem and I hope you can help.
My situation: I'm trying to run a PCA to see how different several BAM files are from one another. I'm using the following pipeline:
- Getting the accession files. I am using the R library "SRAdb", which gives me 4 files in .sra format.
- Converting each .sra file to .bam format with SRA-tools, using the following command:
sam-dump -r --min-mapq 25 $file | samtools view -bS > $file.bam
- Sort
samtools sort $file -o $file_sorted
- Index
samtools index $file_sorted $file_sorted.bai
- Compute a matrix to generate the PCA plot
multiBamSummary bins -b $files.bam -o my/out/path --smartLabels -bs 10000 -p 2
At this point I'm getting the following error:
The file < myfile > does not have BAM or CRAM format.
I haven't been able to trace the error, as none of the earlier steps reported any problem. Any suggestions? (Ideally I would like to skip the alignment step, because I want to keep the files as close to the original as possible.)
- sra-tools --version 2.9.1_1
- samtools --version 1.9
- deeptools --version 3.3.0
Thanks in advance!
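A side note on the sort and index steps, in case the commands were run exactly as posted: the shell reads $file_sorted as a variable named file_sorted, not as $file followed by _sorted, so it would normally expand to an empty string. A small sketch of the difference; the value "sample" and the .bam suffix on the output name are placeholders I am assuming, not taken from the post:

```shell
# Make sure unset variables expand to an empty string (the shell default).
set +u

file=sample   # hypothetical placeholder for one input name

# Unbraced: the shell looks up a variable called "file_sorted",
# which is unset here, so the result is empty.
unbraced="$file_sorted"

# Braced: expands $file and then appends the literal suffix.
braced="${file}_sorted.bam"

echo "unbraced: [$unbraced]"
echo "braced:   [$braced]"
```

With the braced form, the sort step could be written as samtools sort "$file.bam" -o "${file}_sorted.bam". This is probably not the cause of the multiBamSummary error, but it may be worth ruling out.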
Can you post example accession numbers so we can see what data you are looking at?
The accession number I am looking at is SRP060510, which consists of 4 samples: SRR2089860, SRR2089861, SRR2089862, SRR2089863
Take a look at the BAM files you've generated - probably there's something wrong with the format. Are these aligned files you are downloading from SRA?
You can also try
samtools quickcheck
on the BAM files you've generated.
I have checked. I get the following message: "SRR2089860.bam had no targets in header" (the same for all 4 files).
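For reference, "had no targets in header" means the BAM header lists no reference sequences (no @SQ lines), which is what you would expect if the reads in the SRA run were never aligned. A minimal sketch for confirming this, assuming samtools is on the PATH; count_targets is a hypothetical helper name, not part of samtools:

```shell
# count_targets (hypothetical helper): count the @SQ (reference sequence)
# lines in SAM header text read from stdin.
count_targets() {
  # grep -c prints 0 and exits non-zero when nothing matches,
  # so "|| true" keeps the function from signalling failure.
  grep -c '^@SQ' || true
}

# Usage, assuming samtools is installed and the BAM files are in the
# current directory:
#   for bam in SRR208986*.bam; do
#     printf '%s\t' "$bam"
#     samtools view -H "$bam" | count_targets
#   done
```

A count of 0 means the file carries no reference coordinates, so there is likely nothing for multiBamSummary to bin; in that case the reads would need to be aligned to a reference first.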
Any errors? Are you sure the SAM file you are dumping is even aligned? Seconding predeus, check with samtools quickcheck.
I have performed other operations:
vdb-validate
-> everything seems to be fine
fastq-dump
-> resulting in the following error: "Error: reads file does not look like a FASTQ file"