2.4 years ago
bioinformatics
Hi,
I'm trying to map reads to a reference using kallisto for RNA-seq analysis, working in Terminal on a Mac. The index command below keeps running for hours and never finishes. I'm not sure where I've gone wrong.
kallisto index -i Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.index Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.fa
[build] loading fasta file Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.fa
[build] k-mer length: 31
[build] warning: clipped off poly-A tail (longer than 10)
from 1554 target sequences
[build] warning: replaced 100005 non-ACGUT characters in the input sequence
with pseudorandom nucleotides
[build] counting k-mers ... done.
[build] building target de Bruijn graph ...
The commands I used are listed below:
bash ~/Miniconda3-latest-MacOSX-x86_64.sh -b -p $HOME/miniconda
source $HOME/miniconda/bin/activate
conda init zsh
conda info
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set offline false
conda create --name rnaseq
conda activate rnaseq
conda install -c bioconda kallisto
kallisto
conda install -c bioconda fastqc
conda install -c bioconda multiqc
conda activate rnaseq
kallisto index -i Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.index Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.fa
Thank you
That doesn't sound normal to me. It took me four minutes or less to build an index from this file.
I ran this command on WSL2 with 32 GB of RAM and 8 cores.
P.S. I installed kallisto with conda.
OK, thanks for your help. It is still not loading, and I have tried on two different Macs.
I have also tried the command you have provided.
The extension on your fasta file looks strange:
Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.fa
Try removing the ".fa" on the end; the file should then be
Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz
Based on the extension, kallisto may be trying to process what it "thinks" is an uncompressed fasta, which could be causing errors. If that doesn't work, check that
Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz
isn't corrupted, check that it's truly a fasta file, check that it's truly gzipped, etc.
The command has worked on my Mac previously, so the file may not be corrupt. However, I tried to build the index again and it didn't work.
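The three checks suggested above (is the file intact, is it really gzipped, is it really a fasta) could be sketched roughly as follows. `check_fasta_gz` is a hypothetical helper name, not part of kallisto, and it only looks at the first byte of the decompressed content, so treat it as a quick sanity check rather than a full validation.

```shell
# Hypothetical helper (not part of kallisto): quick sanity check of a gzipped fasta.
check_fasta_gz() {
  f="$1"
  # 1. The file must exist and be non-empty.
  [ -s "$f" ] || { echo "missing or empty: $f"; return 1; }
  # 2. gzip -t decompresses to nowhere and verifies the stream,
  #    so it fails for plain-text files and for truncated downloads.
  gzip -t "$f" 2>/dev/null || { echo "not gzipped or corrupted: $f"; return 1; }
  # 3. A fasta file starts with a ">" header line.
  first=$(gzip -dc "$f" 2>/dev/null | head -c 1)
  [ "$first" = ">" ] || { echo "not FASTA: $f"; return 1; }
  echo "ok: $f"
}
```

For this thread it would be called as `check_fasta_gz Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz`.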
I tried removing the ".fa" on the end and it still doesn't run.
It loaded! Thanks all for your help.
In case someone has the same problem in the future, what was the solution?
I'm not exactly sure. I used a different Mac and the index built in a couple of minutes. The disk on the previous laptop might have been full.
I also used the following code to build the index:
kallisto index -i Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz.index Homo_sapiens.GRCh38.cdna.all.HBmain.fa.gz
I removed the extra ".fa" from the end of the input file name.
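If a full disk is the suspect, as guessed above, free space in the working directory can be checked before building the index. This is only a sketch: `free_kb` is a hypothetical helper name, and it assumes the filesystem name reported by `df` contains no spaces.

```shell
# Hypothetical helper: print free space, in 1024-byte blocks, on the
# filesystem that holds the given directory (default: current directory).
free_kb() {
  # -P forces one POSIX-format line per filesystem; -k reports in KiB.
  # Column 4 of that format is the available space.
  df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}
```

Running `free_kb .` in the directory where the index will be written gives a number to compare against the expected index size; `df -h .` shows the same information in human-readable units.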