Question

running salmon on bam files

4.3 years ago

inah • 0

Hi, I am new to Salmon and am trying to run it on BAM files generated by STAR. I know that I cannot run Salmon on BAM files obtained by aligning to the genome, but to my knowledge aligning to the genome is somewhat better than aligning to the transcriptome. Therefore, I aligned to the genome and had STAR output the alignments in transcriptome coordinates by setting --quantMode TranscriptomeSAM. I assume that I have to randomize the BAM files before running Salmon on them, which I tried to do using samtools collate. I have used collate in the past without any problems on BAM files generated by alignment to the genome. But now, when I run collate on the BAM files generated with --quantMode TranscriptomeSAM, I get error messages like this:

[E::bam_read1] CIGAR and query sequence lengths differ for A00257:310:HYL2GDSXX:1:1312:3115:21449
Error reading input file

So first, can I run Salmon on bam files generated by aligning to the genome and setting --quantMode TranscriptomeSAM in STAR? Or do I have to generate the bam files by aligning to the transcriptome? If the former is true, then any idea about the issue with collate? Thanks, Ina
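
For context on the error message quoted above: the SAM format requires that the lengths of the CIGAR operations that consume read bases (M, I, S, = and X) add up to the length of the stored sequence, and the message reports a record where they do not. The thread does not establish why that happens for these files, so the following is only a minimal sketch of the check itself, written in plain Python on made-up records. The function names and example records are hypothetical and not part of samtools or Salmon.

```python
import re

# CIGAR operations that consume bases of the read (query), per the SAM spec.
QUERY_CONSUMING = set("MIS=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar: str) -> int:
    """Sum the lengths of the CIGAR operations that consume read bases."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def is_consistent(cigar: str, seq: str) -> bool:
    """Return True when the read length implied by the CIGAR equals len(seq).

    A '*' in either field means it is absent, so there is nothing to compare.
    """
    if cigar == "*" or seq == "*":
        return True
    return cigar_query_length(cigar) == len(seq)


# Made-up records (name, CIGAR, sequence) purely for illustration.
records = [
    ("read_ok", "2S6M", "ACGTACGT"),
    ("read_bad", "10M", "ACGT"),
]
for name, cigar, seq in records:
    status = "consistent" if is_consistent(cigar, seq) else "CIGAR and sequence lengths differ"
    print(name, status)
```

A record failing this check is malformed regardless of which tool reads it, so one thing worth ruling out would be a truncated or otherwise damaged BAM file.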

salmon STAR collate • 1.8k views

updated 4.3 years ago by GenoMax 148k • written 4.3 years ago by inah • 0

You can run Salmon directly on the output files from STAR --quantMode TranscriptomeSAM. They are not sorted by position, so there is no need to run samtools collate on them first.
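
As a rough sketch of that workflow, the two command lines could be assembled like this. It assumes a STAR index built with a gene annotation and a transcript FASTA that matches that annotation. All paths, the thread count, the `-l A` library-type setting and the helper names are placeholders chosen for illustration, not values from this thread. The script only builds and prints the commands; it does not run them.

```python
import shlex


def star_transcriptome_cmd(genome_dir, reads, prefix, threads=8):
    """Build a STAR command that also writes alignments in transcript coordinates."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--quantMode", "TranscriptomeSAM",
        "--outSAMtype", "BAM", "Unsorted",
        "--outFileNamePrefix", prefix,
    ]


def transcriptome_bam(prefix):
    """Name of the transcript-coordinate BAM that STAR writes for a given prefix."""
    return prefix + "Aligned.toTranscriptome.out.bam"


def salmon_alignment_cmd(transcripts_fa, bam, out_dir, threads=8):
    """Build a Salmon alignment-mode command (-a takes the BAM, -t the transcripts)."""
    return [
        "salmon", "quant",
        "-t", transcripts_fa,
        "-l", "A",  # let Salmon infer the library type (assumption for this sketch)
        "-a", bam,
        "-o", out_dir,
        "-p", str(threads),
    ]


# Placeholder inputs; substitute your own index, reads and transcript FASTA.
prefix = "sample1_"
star = star_transcriptome_cmd("star_index", ["sample1_R1.fastq", "sample1_R2.fastq"], prefix)
salmon = salmon_alignment_cmd("transcripts.fa", transcriptome_bam(prefix), "sample1_salmon")

print(shlex.join(star))
print(shlex.join(salmon))
```

The transcriptome BAM goes straight from STAR into Salmon here, with no sorting or collating step in between.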

4.3 years ago by h.mon 35k