Question

Humann3 (slowly) creating many bowtie2 index files in the temp dir

14 months ago

emi-smiley ▴ 10

Hi there,

I have been running HUMAnN 3 and everything was working great until about a week ago (who knows what the naughty coding fairies must have done).

HUMAnN 3 still runs, but it now takes a very long time and generates multiple Bowtie 2 index files in each sample's temp dir. This is how I have been running the HUMAnN portion of my batch jobs:

humann --protein-database /projects/emye7956/software/anaconda/envs/humann_env/uniref \
--nucleotide-database /projects/emye7956/software/anaconda/envs/humann_env/chocophlan/ \
--input "$fpathc" \
--output "$output_dir" -v && echo "ALL DONE WITH ${foutput} AT LAST :D"

The MetaPhlAn databases I have are in /projects/emye7956/software/anaconda/envs/humann_env/lib/python3.7/site-packages/metaphlan/metaphlan_databases and look like this:

mpa_latest                              mpa_vOct22_CHOCOPhlAnSGB_202212.pkl
mpa_vOct22_CHOCOPhlAnSGB_202212.1.bt2l  mpa_vOct22_CHOCOPhlAnSGB_202212.rev.1.bt2l
mpa_vOct22_CHOCOPhlAnSGB_202212.2.bt2l  mpa_vOct22_CHOCOPhlAnSGB_202212.rev.2.bt2l
mpa_vOct22_CHOCOPhlAnSGB_202212.3.bt2l  mpa_vOct22_CHOCOPhlAnSGB_202212_VINFO.csv
mpa_vOct22_CHOCOPhlAnSGB_202212.4.bt2l  README.txt
mpa_vOct22_CHOCOPhlAnSGB_202212.fna

Here is the output temp dir for a sample that ran to completion but took half a day (note the multiple Bowtie 2 index files, which take a long time to build):

MG773_humann_temp:
MG773_bowtie2_aligned.sam
MG773_bowtie2_aligned.tsv
MG773_bowtie2_index.1.bt2
MG773_bowtie2_index.2.bt2
MG773_bowtie2_index.3.bt2
MG773_bowtie2_index.4.bt2
MG773_bowtie2_index.rev.1.bt2
MG773_bowtie2_index.rev.2.bt2
MG773_custom_chocophlan_database.ffn
MG773_cleancombined.log
MG773_metaphlan_bowtie2.txt
MG773_metaphlan_bugs_list.tsv
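
In case it helps anyone reproduce the check, here is a minimal sketch of how the index files in a temp dir like the one above can be picked out. The function name `list_bt2_index` is made up for illustration and is not part of HUMAnN or Bowtie 2. It assumes the index files sit directly in the temp dir and end in `.bt2` or `.bt2l`, as in the listings in this post.

```shell
# list_bt2_index DIR
# Hypothetical helper (not a HUMAnN command): print the Bowtie 2 index
# files found directly inside DIR, one path per line, sorted by name.
# Matches both small (*.bt2) and large (*.bt2l) index files.
list_bt2_index() {
    find "$1" -maxdepth 1 -type f \( -name '*.bt2' -o -name '*.bt2l' \) | sort
}
```

Something like `list_bt2_index MG773_humann_temp` should print only the six `MG773_bowtie2_index.*.bt2` files from the listing above. Running `ls -ltr` on those paths then shows their modification times, which gives a rough idea of how long the index build took.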

And my config file looks like this:

[database_folders]
nucleotide = data/chocophlan_DEMO
protein = data/uniref_DEMO
utility_mapping = data/misc

[run_modes]
resume = True
verbose = False
bypass_prescreen = False
bypass_nucleotide_index = False
bypass_nucleotide_search = False
bypass_translated_search = False
threads = 40

[alignment_settings]
evalue_threshold = 1.0
prescreen_threshold = 0.01
translated_subject_coverage_threshold = 50.0
translated_query_coverage_threshold = 90.0
nucleotide_subject_coverage_threshold = 50.0
nucleotide_query_coverage_threshold = 90.0

[output_format]
output_max_decimals = 10
remove_stratified_output = False
remove_column_description_output = False

Any help would be very much appreciated! Thanks so much in advance :D

