Question

Salmon and kallisto pseudoaligners are failing to process my data


7.6 years ago

dr.prasad.bvls ▴ 10

Hello, I have Illumina data (12 paired-end samples) and am trying to use kallisto and Salmon for alignment. Both are failing to process my data.

Can someone advise where the problem is and what is causing it? Thank you, Prasad.

The errors are pasted below:

*****************************************KALLISTO error on 2nd Sample*********
drsarat@saratngs:~/Downloads/PD_Analysis_4Jun2017/UsingKallisto$ ../../kallisto_linux-v0.43.1/kallisto quant -i ./KallistoDmel.idx -o 2TD -b 100 <(bunzip2 ../NGSData/2-Td_R1.fastq_filtered_trimmed.bz2) <(bunzip2 ../NGSData/2-Td_R2.fastq_filtered_trimmed.bz2)

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 30,485
[index] number of k-mers: 33,122,546
[index] number of equivalence classes: 57,764
[quant] running in paired-end mode
[quant] will process pair 1: /dev/fd/63
                             /dev/fd/62
[quant] finding pseudoalignments for the reads ... done
[quant] processed 0 reads, 0 reads pseudoaligned
[~warn] no reads pseudoaligned.
[quant] estimated average fragment length: 0
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 52 rounds
terminate called after throwing an instance of 'std::domain_error'
  what():  nsamp must be -1 or >=1
Aborted (core dumped)
*********************************************************************

************SALMON error on 1st Sample which is successful with Kallisto**
Logs will be written to quants/salmon_1CTR_quant/logs
[2017-06-04 06:59:37.477] [jointLog] [info] parsing read library format
[2017-06-04 06:59:37.477] [jointLog] [info] There is 1 library.
[2017-06-04 06:59:37.549] [stderrLog] [info] Loading Suffix Array
[2017-06-04 06:59:37.541] [jointLog] [info] Loading Quasi index
[2017-06-04 06:59:37.549] [jointLog] [info] Loading 32-bit quasi index
[2017-06-04 06:59:59.952] [stderrLog] [info] Loading Transcript Info
[2017-06-04 07:00:05.466] [stderrLog] [info] Loading Rank-Select Bit Array
[2017-06-04 07:00:06.385] [stderrLog] [info] There were 30485 set bits in the bit array
[2017-06-04 07:00:06.407] [stderrLog] [info] Computing transcript lengths
[2017-06-04 07:00:06.408] [stderrLog] [info] Waiting to finish loading hash
[2017-06-04 07:00:08.624] [stderrLog] [info] Done loading index

[2017-06-04 07:00:08.624] [jointLog] [info] done
[2017-06-04 07:00:08.624] [jointLog] [info] Index contained 30485 targets

[2017-06-04 07:03:41.142] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2017-06-04 07:03:41.142] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2017-06-04 07:03:41.146] [jointLog] [warning] Only 0 fragments were mapped, but the number of burn-in fragments was set to 5000000.
The effective lengths have been computed using the observed mappings.

[2017-06-04 07:03:41.146] [jointLog] [warning] Something seems to be wrong with the calculation of the mapping rate.  The recorded ratio is likely wrong.  Please file this as a bug report.

[2017-06-04 07:03:41.146] [jointLog] [info] Mapping rate = 0%

[2017-06-04 07:03:41.146] [jointLog] [info] finished quantifyLibrary()
[2017-06-04 07:03:41.158] [jointLog] [info] Starting optimizer
[2017-06-04 07:03:41.188] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate
[2017-06-04 07:03:41.188] [jointLog] [info] iteration = 0 | max rel diff. = 0.315097
[2017-06-04 07:03:41.191] [jointLog] [info] iteration = 50 | max rel diff. = -1.79769e+308
[2017-06-04 07:03:41.191] [jointLog] [error] Total alpha weight was too small! Make sure you ran salmon correclty.
[2017-06-04 07:03:41.191] [jointLog] [error] The optimization algorithm failed. This is likely the result of bad input (or a bug). If you cannot track down the cause, please report this issue on GitHub.

RNA-Seq kallisto salmon aligners • 2.7k views

updated 7.6 years ago by Rob 6.9k • written 7.6 years ago by dr.prasad.bvls ▴ 10

score 2 · Answer 1 · 2017-06-03

The issue is that you have samples where no reads are mapping. When no reads map, there is no meaningful relative abundance of transcripts to estimate. I've recently (in the develop branch of salmon) changed the default behavior so that it warns the user and writes a quant.sf file with all TPMs set to 0. However, regardless of the output or behavior, you can't get any useful information out of a sample where nothing maps to the transcriptome. Are you sure your reference transcriptome is correct and matches your samples?
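
Since the kallisto log above reports "processed 0 reads", it may also be worth confirming that the stream handed to the tool contains any reads at all before looking at the reference. The following is only a minimal sketch: `count_reads` is a hypothetical helper (not part of kallisto or salmon) that counts FASTQ records on standard input, assuming the usual four lines per record. Note that `bzip2 -dc` writes the decompressed data to standard output, whereas plain `bunzip2 file` decompresses the file in place and writes nothing to standard output.

```shell
# Hypothetical helper: count FASTQ records read from standard input.
# Assumes the standard four-lines-per-record FASTQ layout.
count_reads() {
    awk 'END { print int(NR / 4) }'
}

# Example usage (commented out; the path is the one from the question):
#   bzip2 -dc ../NGSData/2-Td_R1.fastq_filtered_trimmed.bz2 | count_reads
```

If this prints 0 for a sample, the problem is upstream of the quantifier (an empty or unreadable input stream) rather than a mismatch between the reads and the transcriptome.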