You're running that in a loop, right? The --pseudobam command-line parameter writes the pseudoalignments to the terminal (standard output) in SAM format, not to a file. You won't see that output if you run the command in a loop, or even in a shell script, as those are run as separate processes.
Here's a test that I ran on my own computer:
./kallisto quant -i test/transcripts -o . --pseudobam \
test/reads_1.fastq.gz test/reads_2.fastq.gz | head -23
[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 15
[index] number of k-mers: 18,902
[index] number of equivalence classes: 22
[quant] running in paired-end mode
[quant] will process pair 1: test/reads_1.fastq.gz
test/reads_2.fastq.gz
[quant] finding pseudoalignments for the reads ...@HD VN:1.0
@SQ SN:NM_001168316 LN:2283
@SQ SN:NM_174914 LN:2385
@SQ SN:NR_031764 LN:1853
@SQ SN:NM_004503 LN:1681
@SQ SN:NM_006897 LN:1541
@SQ SN:NM_014212 LN:2037
@SQ SN:NM_014620 LN:2300
@SQ SN:NM_017409 LN:1959
@SQ SN:NM_017410 LN:2396
@SQ SN:NM_018953 LN:1612
@SQ SN:NM_022658 LN:2288
@SQ SN:NM_153633 LN:1666
@SQ SN:NM_153693 LN:2072
@SQ SN:NM_173860 LN:849
@SQ SN:NR_003084 LN:1640
@PG ID:kallisto PN:kallisto VN:0.43.1
1:NM_014620:16:182 99 NM_014620 17 255 50M = 149 182 GTTCCGAGCGCTCCGCAGAACAGTCCTCCCTGTAAGAGCCTAACCATTGC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII NH:i:3
1:NM_014620:16:182 355 NM_153693 17 255 50M = 149 182 GTTCCGAGCGCTCCGCAGAACAGTCCTCCCTGTAAGAGCCTAACCATTGC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII NH:i:3
1:NM_014620:16:182 355 NR_003084 17 255 50M = 149 182 GTTCCGAGCGCTCCGCAGAACAGTCCTCCCTGTAAGAGCCTAACCATTGC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII NH:i:3
1:NM_014620:16:182 147 NM_014620 149 255 50M = 17 -182 TAATTTTTTTTCCTCCCAGGTGGAGTTGCCGAAGCTGGGGGCAGCTGGGG IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII NH:i:3
1:NM_014620:16:182 403 NM_153693 149 255 50M = 17 -182 TAATTTTTTTTCCTCCCAGGTGGAGTTGCCGAAGCTGGGGGCAGCTGGGG IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII NH:i:3
1:NM_014620:16:182 403 NR_003084 149 255 50M = 17 -182 TAATTTTTTTTCCTCCCAGGTGGAGTTGCCGAAGCTGGGGGCAGCTGGGG IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII NH:i:3
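Note how the last log line and the first SAM header line run together above: the log messages and the SAM records travel on different streams. A minimal sketch of that behaviour, using a hypothetical stand-in function (fake_quant) rather than kallisto itself, and assuming the logs go to standard error:

```shell
# Stand-in for a tool that logs to stderr and writes records to stdout,
# which is what the run above suggests kallisto 0.43.1 does.
# fake_quant is hypothetical and only mimics the two streams.
fake_quant() {
  echo "[quant] finding pseudoalignments for the reads ..." >&2
  printf '@HD\tVN:1.0\n'
}

# Only stdout travels down the pipe; the log line still goes to the terminal.
captured=$(fake_quant | head -n 1)
echo "${captured}"
```

This is why piping the command into head or samtools passes on only the SAM lines.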
It still also produces the standard output files, as you've noticed:
ls -l
total 17084
-rw-rw-r-- 1 kblighe kblighe 35848 Apr 29 13:27 abundance.h5
-rw-rw-r-- 1 kblighe kblighe 591 Apr 29 13:27 abundance.tsv
-rwxr-xr-x 1 kblighe kblighe 17435774 Mar 20 2017 kallisto
-rw-r--r-- 1 kblighe kblighe 1366 Aug 1 2017 license.txt
-rw-r--r-- 1 kblighe kblighe 2253 Mar 20 2017 README.md
-rw-rw-r-- 1 kblighe kblighe 270 Apr 29 13:27 run_info.json
drwxr-xr-x 2 kblighe kblighe 4096 Apr 29 13:25 test
------------------------
So, in your loop, you will have to pipe the output into samtools to save it as SAM/BAM:
./kallisto quant -i test/transcripts -o . --pseudobam -b 100 \
test/reads_1.fastq.gz test/reads_2.fastq.gz | samtools view -bS - > align.bam
ls -l
total 17796
-rw-rw-r-- 1 kblighe kblighe 35848 Apr 29 13:50 abundance.h5
-rw-rw-r-- 1 kblighe kblighe 591 Apr 29 13:50 abundance.tsv
-rw-rw-r-- 1 kblighe kblighe 728763 Apr 29 13:50 align.bam
-rwxr-xr-x 1 kblighe kblighe 17435774 Mar 20 2017 kallisto
-rw-r--r-- 1 kblighe kblighe 1366 Aug 1 2017 license.txt
-rw-r--r-- 1 kblighe kblighe 2253 Mar 20 2017 README.md
-rw-rw-r-- 1 kblighe kblighe 270 Apr 29 13:50 run_info.json
drwxr-xr-x 2 kblighe kblighe 4096 Apr 29 13:25 test
For you:
kallisto quant -i "${kallistoIdx}" -o "${outputFileLoc}"/"${sample}" --pseudobam \
-b 100 "${Read1}" "${Read2}" | samtools view -bS - > "${sample}".bam
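Wrapped in a loop, the same idea might look like the sketch below. The function name quant_all, the reads directory layout, and the _1.fastq.gz / _2.fastq.gz naming are assumptions for illustration, not taken from your script, and it assumes a kallisto version that prints SAM to standard output (0.43.1 does). Unlike the command above, it writes each BAM into the sample's own output folder.

```shell
# Sketch: run kallisto on every read pair in a directory and keep the
# pseudoalignments as one BAM per sample.
# Arguments: kallisto index, output directory, directory holding the reads.
quant_all() {
  idx=$1; outdir=$2; readsdir=$3
  for r1 in "${readsdir}"/*_1.fastq.gz; do
    [ -e "${r1}" ] || continue                 # skip if the glob matched nothing
    sample=$(basename "${r1}" _1.fastq.gz)
    r2="${readsdir}/${sample}_2.fastq.gz"
    # The redirect target must exist before the pipeline starts.
    mkdir -p "${outdir}/${sample}"
    # The SAM stream on stdout goes through the pipe and is converted to BAM.
    kallisto quant -i "${idx}" -o "${outdir}/${sample}" --pseudobam -b 100 \
      "${r1}" "${r2}" | samtools view -bS - > "${outdir}/${sample}/${sample}.bam"
  done
}
```

You would call it as, for example, quant_all "${kallistoIdx}" "${outputFileLoc}" followed by your reads folder.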
Kevin
Which version of kallisto are you using? I am using kallisto 0.44.0; I tried to reproduce the issue you mentioned and could not. I used the same command as in the OP (with the input and output directories changed and threads added). The output included pseudoalignments.bam, as mentioned in the manual. I am on Linux Mint 18.3 (with an Ubuntu Xenial base).
Input:
output:
I reproduced the problem with my version, 0.43.1.
Perhaps the group updated it in 0.44.
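If a script has to cope with both behaviours, it could branch on the version. The sketch below is only that: the function name pseudobam_mode is made up, and the 0.44.0 cutoff is an assumption based on the two versions tested in this thread (0.43.1 prints SAM to standard output, 0.44.0 writes pseudoalignments.bam), not on the release notes.

```shell
# Sketch: decide where to look for pseudoalignments from the kallisto version.
# Assumed cutoff: anything at or above 0.44.0 writes pseudoalignments.bam
# into the output directory; anything older prints SAM to stdout.
pseudobam_mode() {
  version=$1
  # sort -V (GNU coreutils) orders version strings; the oldest comes first.
  oldest=$(printf '%s\n%s\n' "${version}" "0.44.0" | sort -V | head -n 1)
  if [ "${oldest}" = "0.44.0" ]; then
    echo "file"     # expect <output dir>/pseudoalignments.bam
  else
    echo "stdout"   # pipe stdout into samtools instead
  fi
}
```

For instance, pseudobam_mode 0.43.1 reports stdout, so the samtools pipe is needed; pseudobam_mode 0.44.0 reports file.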
The following bash loop had no problem with its output (pseudoalignments.bam was present in the designated folder):
and parallel:
Input files: hcc1395_normal_rep1_r1.fastq.gz, hcc1395_normal_rep1_r1.fastq.gz.
output:
The only problem with the script is that it tries to create the same directory twice because of mkdir ${i%_rep*}, and mkdir complains that the directory already exists on the second attempt. Other than that, there is no issue with kallisto. Standalone, bash loop and parallel (all the scripts) produced pseudoalignments.bam.
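One way to silence that complaint is mkdir -p, which does nothing if the directory is already there. A minimal sketch, where the function name make_group_dir is made up and the file names in the usage note are examples only:

```shell
# Sketch: derive the per-group directory name from a read file name and
# create it without failing when it already exists.
make_group_dir() {
  i=$1
  dir=${i%_rep*}        # strip everything from "_rep" onward
  mkdir -p "${dir}"     # -p: no error if the directory already exists
  echo "${dir}"
}
```

Calling it for hcc1395_normal_rep1_r1.fastq.gz and then for another replicate of the same group gives hcc1395_normal both times, with no error on the second call.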
@OP: echo the input and output files and directories in your script to check them.
Yes, there is no issue with Kallisto. Just different behaviour across different versions. It does not help that the developers contradict themselves on their own website, evidently not updating all of their documentation in accordance with newer version changes.
[source: https://pachterlab.github.io/kallisto/pseudobam.html]
...okay, let's go to the manual, as directed by the above statement:
------------------------
[source: https://pachterlab.github.io/kallisto/manual.html]
...ta-da!
OP's issue is that kallisto outputs only three files instead of four (pseudoalignments.bam is missing), and OP asks whether bootstrapping somehow affects the BAM output. I am trying to convey (to OP) that with the current version of kallisto and OP's code, pseudoalignments.bam is generated by both the standalone and loop scripts with bootstrapping. I have asked OP which kallisto version was used to run the script.
Yes, and solved below.