Hi, I am a beginner at RNA-seq analysis! I have paired-end, reverse-stranded data. I used STAR to align my data and have SAM files that are sorted by coordinate.
Now I want to get the counts using featureCounts. I have seen in some places that featureCounts may not be able to use SAM files that are sorted by coordinate and wants them sorted by read name instead. Other documentation I have found says this is not an issue.
Can I get some clarity? Can I use the STAR coordinate-sorted files as input, or do I need to use samtools to re-sort them?
featureCounts can use coordinate-sorted SAM files. It will re-sort the files on the fly. From the manual:
I was using featureCounts on coordinate-sorted BAM files, and read sorting was incurring a large (>12 hrs) time cost. Apparently, featureCounts will only use one thread for read sorting even if more threads are requested. I pre-sorted by read name with samtools, and the featureCounts runtime was VASTLY quicker.
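For anyone who wants to try the pre-sorting route described above, here is a rough sketch of the two commands, written as a small Python helper that only prints them so you can check them before pasting into a shell. The file names (sample.Aligned.sortedByCoord.out.bam, genes.gtf, counts.txt) and the thread count are placeholders I made up, not anything from this thread. I am also assuming `-s 2` is right for your library (reverse-stranded, as in the question) and that you are on a recent Subread release, where `--countReadPairs` is needed alongside `-p` to count fragments; on older releases that flag may not be recognised, so set `count_read_pairs=False` there.

```python
import shlex


def name_sort_cmd(in_bam, out_bam, threads=4):
    """Build a samtools command that re-sorts alignments by read name."""
    return [
        "samtools", "sort",
        "-n",                # sort by read name instead of coordinate
        "-@", str(threads),  # additional sorting threads
        "-o", out_bam,
        in_bam,
    ]


def feature_counts_cmd(bam, gtf, out_txt, threads=4, count_read_pairs=True):
    """Build a featureCounts command for paired-end, reverse-stranded data."""
    cmd = ["featureCounts", "-p"]  # -p: input is paired-end
    if count_read_pairs:
        # Assumption: recent Subread, where fragments are only counted
        # when this flag is given together with -p.
        cmd.append("--countReadPairs")
    cmd += [
        "-s", "2",           # 2 = reversely stranded
        "-T", str(threads),  # threads used for counting
        "-a", gtf,           # annotation file (GTF)
        "-o", out_txt,       # output count table
        bam,
    ]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with your own.
    coord_bam = "sample.Aligned.sortedByCoord.out.bam"
    name_bam = "sample.namesorted.bam"
    for cmd in (
        name_sort_cmd(coord_bam, name_bam, threads=8),
        feature_counts_cmd(name_bam, "genes.gtf", "counts.txt", threads=8),
    ):
        print(shlex.join(cmd))
```

If the on-the-fly re-sorting is fast enough for your data, you can skip the first command and pass the coordinate-sorted file straight to the second one.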