Hello,
I've been working with a public dataset from the SRA. I've gone through the conventional RNA-seq workflow of aligning the reads and sorting the SAM files using samtools. I'm now trying to run CuffDiff on the sorted SAM files, supplying a RefSeq GTF file downloaded from UCSC (HG_18). However, my attempts have been unsuccessful because I keep getting the following type of error:
[12:33:02] Inspecting maps and determining fragment length distributions.
Error: this SAM file doesn't appear to be correctly sorted!
current hit is at chrX:2710105, last one was at chr22:49568431
You may be able to fix this by running:
$ LC_ALL="C" sort -k 3,3 -k 4,4n input.sam > fixed.sam
Upon inspection, my SAM files appear to be sorted in the correct order:
chr1
chr2
chr3
...
chrX
chrY
chrM
(However, I tried deleting the chrM-aligned reads, hence the above error between chr22 and chrX.)
Can anyone provide advice on overcoming this problem?
Thanks!
This is indeed the case.
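One thing that may be worth checking: the chromosome order listed in the question (chr1, chr2, chr3, ... chrX, chrY, chrM) is karyotypic, whereas the command suggested in the error message, LC_ALL="C" sort -k 3,3 -k 4,4n, sorts reference names in plain byte order, where chr10 comes before chr2. Below is a minimal Python sketch that reports the first alignment breaking that byte order. The function name first_out_of_order and the demo records are made up for illustration, and it is only an assumption that this matches the check Cufflinks performs internally.

```python
def first_out_of_order(sam_lines):
    """Return (previous_key, current_key) for the first alignment that breaks
    (reference name, position) order, or None if the input is in order.

    Reference names are compared as plain strings, which for ASCII names
    matches the byte order used by `LC_ALL="C" sort -k 3,3 -k 4,4n`.
    """
    previous = None
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        key = (fields[2], int(fields[3]))  # column 3 = RNAME, column 4 = POS
        if previous is not None and key < previous:
            return previous, key
        previous = key
    return None


# Hypothetical records in karyotypic order: chr2 followed by chr10.
demo = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchr2\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr10\t50\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
]
print(first_out_of_order(demo))
```

On the demo records this flags the step from chr2 to chr10, which looks fine to the eye but is out of order as far as a byte-wise sort is concerned. To run it on a real file, pass an open file handle instead of the demo list.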