I am reproducing a colleague's genome assembly work using a5_pipeline.pl, downloaded from https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/ngopt/ngopt_a5pipeline_linux-x64_20130326.tar.gz
The software version is ngopt_a5pipeline_linux-x64_20130326.
The run fails with the following error:
Processing /home/aksrao/FUSARIUM/Compare_Assemblers/Trimming/Trimmomatic/Trimmomatic_PCadapters_EthFoc11_1P
Error: read J00113:201:HCK5TBBXX:7:1101:32096:1191 has out of range quality values.
Expected phred64.
Quality string: ""''''""'++"+++"+'++++++++++++'++++'+++++''++"++++++"+++++++''++++'++++++'++++'+'++++""+''"'''''+'""'+'+
Check your data and re-run preprocess with the correct quality scaling flag.
[a5] Error preprocessing reads with SGA
This error comes from the pipeline expecting phred64 quality scores while my input reads are phred33. It seems to have been solved, judging by the contents of this link: https://code.google.com/archive/p/ngopt/issues/5
Does anyone know whether this solution works, or whether there is a better way to get a5_pipeline.pl to run? Not using this software is NOT an option, since I am working on reproducibility. Thanks!
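Before trying any fix, it may help to confirm which offset the quality strings actually use. The following is a minimal heuristic sketch in Python, not part of a5_pipeline.pl; the function name `guess_phred_offset` is made up for illustration, and the upper-bound check assumes Illumina-style scores that top out around Q41.

```python
def guess_phred_offset(quality_strings):
    """Guess the quality offset (33 or 64) from FASTQ quality strings.

    Returns 33, 64, or None when the observed range is ambiguous.
    Hypothetical helper for illustration; not part of the A5 pipeline.
    """
    codes = [ord(ch) for q in quality_strings for ch in q]
    if not codes:
        return None
    lowest = min(codes)
    highest = max(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) cannot occur in +64 encodings.
        return 33
    if highest > 74:
        # Assumption: Phred+33 Illumina data does not go above 'J' (Q41).
        return 64
    # Everything fell between ASCII 59 and 74: cannot tell from this sample.
    return None
```

For the read in the error message, the quality characters `"`, `'` and `+` have ASCII codes 34, 39 and 43, all below 59, so this check would report an offset of 33.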
What is your intention? Do you intend to run the same dataset with the same software version and same parameters? Did your friend tell you the exact command-lines used? How come your friend could run the same dataset with the same software?
If you want hard-core reproducibility, you also have to consider the operating system, the Perl version, and possibly some Perl modules.
I have access to the command lines my colleague used, but I am not aware of this run error or of any workarounds for the phred encoding constraints of a5_pipeline.pl. Your point that reproducibility depends on various things is accepted, but here I am looking exclusively for a technical solution to my phred encoding problem with a5_pipeline.pl. The rest comes later :) Thanks!
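One possible workaround, if changing the pipeline itself is undesirable, is to re-encode the reads from phred33 to phred64 before passing them to a5_pipeline.pl. The Python sketch below only illustrates the idea; the function names are hypothetical, it assumes plain four-line FASTQ records, and whether re-encoded input is acceptable for a reproducibility study is a separate question. The scores themselves are unchanged; only the ASCII offset moves by 31.

```python
SHIFT = 64 - 33  # distance between the two ASCII offsets


def phred33_to_phred64(quality):
    """Re-encode one Phred+33 quality string as Phred+64."""
    shifted = [ord(ch) + SHIFT for ch in quality]
    if max(shifted, default=0) > 126:
        # Would leave the printable ASCII range; input is probably not Phred+33.
        raise ValueError("quality string does not look like Phred+33")
    return "".join(chr(code) for code in shifted)


def convert_fastq(lines):
    """Yield FASTQ lines with every fourth line (the qualities) re-encoded.

    Assumes strict four-line records: header, sequence, '+', quality.
    """
    for index, line in enumerate(lines):
        text = line.rstrip("\n")
        if index % 4 == 3:
            text = phred33_to_phred64(text)
        yield text + "\n"
```

It could be applied to a file by passing an open input handle to `convert_fastq` and writing the yielded lines to a new FASTQ file, leaving the original reads untouched.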