Salmon index issue
8 months ago
Andrea ▴ 10

I'm trying to build the reference index for Salmon, but the process stopped with an error.

The command is:

salmon index -t CRCH38_and_decoys.fa.gz -d decoys.txt -i GRCh38_salmon_index --gencode

The last part of the output is:

[2024-02-19 12:14:55.227] [puff::index::jointLog] [warning] Entry with header [ENST00000634174.1|ENSG00000282732.1|OTTHUMG00000191398.1|OTTHUMT00000487783.1|ENST00000634174|ENSG00000282732|28|unprocessed_pseudogene|], had length less than equal to the k-mer length of 31 (perhaps after poly-A clipping)

[2024-02-19 12:15:56.790] [puff::index::jointLog] [warning] Removed 882 transcripts that were sequence duplicates of indexed transcripts.

[2024-02-19 12:15:56.792] [puff::index::jointLog] [warning] If you wish to retain duplicate transcripts, please use the --keepDuplicates flag

[2024-02-19 12:15:56.919] [puff::index::jointLog] [info] Replaced 151122967 non-ATCG nucleotides

[2024-02-19 12:15:56.919] [puff::index::jointLog] [info] Clipped poly-A tails from 2034 transcripts

Killed

Could this be a memory (RAM) problem?

Thanks

Salmon RNAseq • 638 views

How much memory is available?
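
On Linux, one quick way to answer this is to read the `MemAvailable` field from `/proc/meminfo` (running `free -h` shows the same information). The snippet below is only a sketch: it assumes a Linux machine, and `available_gib` is a helper name made up for illustration.

```shell
# Hypothetical helper: print MemAvailable in whole GiB (Linux only).
# $1: path to a meminfo-style file (defaults to /proc/meminfo)
available_gib() {
    # The MemAvailable value is reported in kB, so divide twice by 1024.
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

echo "Available memory: $(available_gib) GiB"
```

The number reported here is what is free for new processes right now, which is usually noticeably less than the installed RAM.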


I have a laptop with 16 GB of RAM.


I just checked my logs: building a full genome-decoyed GRCh38 index with Ensembl 101 annotations took 15 GB of memory. It is quite possible that you simply do not have enough, since other applications on your machine also need memory.
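
A bare `Killed` at the end of the output typically means the process received SIGKILL, and on Linux the kernel's out-of-memory killer is a common source of that. One way to check is to search the kernel log, roughly as sketched below. This assumes Linux; `oom_entries` is a hypothetical helper name, the search patterns are a guess at typical kernel wording, and reading the kernel log may require root on some systems.

```shell
# Hypothetical helper: filter kernel log text (read from stdin)
# for lines that look like out-of-memory killer entries.
oom_entries() {
    grep -i -E 'out of memory|oom-kill|killed process'
}

# Search the kernel log; print a fallback message if nothing matches
# or if the log cannot be read.
dmesg 2>/dev/null | oom_entries || echo "No OOM entries found (or kernel log not readable)."
```

If a matching line names `salmon`, the index build was most likely stopped for lack of memory rather than because of a problem with the input files.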


You can download prebuilt indices with refgenie, which avoids building the index locally:

https://refgenie.databio.org/en/latest/
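
A rough sketch of what that could look like on the command line. The genome name `hg38` and the asset name `salmon_sa_index` are assumptions here, so list what the server actually offers with `refgenie listr` first; `pull_salmon_index` is a helper name made up for illustration. Note also that a prebuilt index may not use the same annotation release as your own transcript file.

```shell
# Hypothetical helper: fetch a prebuilt Salmon index with refgenie
# and print the local path of the downloaded asset.
# $1: genome name (e.g. hg38)
# $2: asset name (assumed default: salmon_sa_index)
pull_salmon_index() {
    genome="$1"
    asset="${2:-salmon_sa_index}"
    # refgenie reads the location of its config file from $REFGENIE.
    export REFGENIE="${REFGENIE:-$PWD/genome_config.yaml}"
    # Create the config file on first use.
    [ -f "$REFGENIE" ] || refgenie init -c "$REFGENIE" || return 1
    # Download the asset, then print where it was stored.
    refgenie pull "$genome/$asset" || return 1
    refgenie seek "$genome/$asset"
}
```

With refgenie installed (for example via `pip install refgenie`), calling `pull_salmon_index hg38` should print a path that can be passed to `salmon quant -i`.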
