Trinotate generates an empty report, only headers and no data inside
4 months ago
jway • 0

Hello,

I've been running the most recent Dockerized version of Trinotate. I completed steps 2, 3, and 4 as seen in https://github.com/Trinotate/Trinotate/wiki/Software-installation-and-data-required, and while I was running the sequence analyses and database searches, I got confirmations that the results were loaded into the SQLite database (see in trace below).

When I tried to generate the report, however, I got a .tsv file containing only the headers. I looked through the SQLite file to confirm that my annotations were loaded, and they are showing up. No error comes up when I generate the report; the output is simply empty apart from the header line. Is there anything that I'm doing wrong? Any guidance would be greatly appreciated.

Part of the sequence analysis/database search trace

* [Tue Jul  2 18:04:19 2024] Running CMD: diamond blastx -d /data/NEWTRINITYDATADIR/uniprot_sprot -q /data/trinity_out_dir_sortednew.Trinity.fasta -p 32 -k 1 -e 1e-5 -o uniprot_sprot.diamond.blastx.outfmt6 --outfmt 6
diamond v2.0.15.153 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 32
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory:
#Target sequences to report alignments for: 1
Opening the database...  [0.048s]
Database: /data/NEWTRINITYDATADIR/uniprot_sprot.dmnd (type: Diamond database, sequences: 571609, letters: 206878625)
Block size = 2000000000
Opening the input file...  [0.058s]
Opening the output file...  [0s]
Loading query sequences...  [3.501s]
Masking queries...  [0.687s]
Algorithm: Double-indexed
Building query histograms...  [0.779s]
Allocating buffers...  [0s]
Loading reference sequences...  [0.182s]
Masking reference...  [0.241s]
Initializing temporary storage...  [0s]
Building reference histograms...  [0.295s]
Allocating buffers...  [0s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 1/4.
Building reference seed array...  [0.167s]
Building query seed array...  [0.381s]
Computing hash join...  [0.133s]
Masking low complexity seeds...  [0.037s]
Searching alignments...  [0.531s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 2/4.
Building reference seed array...  [0.139s]
Building query seed array...  [0.257s]
Computing hash join...  [0.118s]
Masking low complexity seeds...  [0.038s]
Searching alignments...  [0.438s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 3/4.
Building reference seed array...  [0.144s]
Building query seed array...  [0.279s]
Computing hash join...  [0.117s]
Masking low complexity seeds...  [0.038s]
Searching alignments...  [0.36s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 4/4.
Building reference seed array...  [0.127s]
Building query seed array...  [0.304s]
Computing hash join...  [0.116s]
Masking low complexity seeds...  [0.037s]
Searching alignments...  [0.323s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 1/4.
Building reference seed array...  [0.121s]
Building query seed array...  [0.298s]
Computing hash join...  [0.113s]
Masking low complexity seeds...  [0.038s]
Searching alignments...  [0.334s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 2/4.
Building reference seed array...  [0.141s]
Building query seed array...  [0.332s]
Computing hash join...  [0.113s]
Masking low complexity seeds...  [0.038s]
Searching alignments...  [0.35s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 3/4.
Building reference seed array...  [0.145s]
Building query seed array...  [0.348s]
Computing hash join...  [0.114s]
Masking low complexity seeds...  [0.038s]
Searching alignments...  [0.286s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 4/4.
Building reference seed array...  [0.129s]
Building query seed array...  [0.304s]
Computing hash join...  [0.114s]
Masking low complexity seeds...  [0.038s]
Searching alignments...  [0.283s]
Deallocating buffers...  [0s]
Clearing query masking...  [0.052s]
Computing alignments...  [12.05s]
Deallocating reference...  [0s]
Loading reference sequences...  [0s]
Deallocating buffers...  [0.001s]
Deallocating queries...  [0.001s]
Loading query sequences...  [0s]
Closing the input file...  [0s]
Closing the output file...  [0.012s]
Cleaning up...  [0s]
Total time = 25.781s
Reported 198395 pairwise alignments, 198395 HSPs.
198395 queries aligned.
* [Tue Jul  2 18:04:45 2024] Running CMD: /usr/local/src/Trinotate/Trinotate --db /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --LOAD_swissprot_blastx uniprot_sprot.diamond.blastx.outfmt6
-LOADING as per --LOAD_swissprot_blastx
CMD: /usr/local/src/Trinotate/util/trinotateSeqLoader/Trinotate_BLAST_loader.pl --sqlite /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --outfmt6 uniprot_sprot.diamond.blastx.outfmt6 --prog blastx --dbtype Swissprot
CMD: echo "pragma journal_mode=memory;
pragma synchronous=0;
pragma cache_size=4000000;
.mode tabs
.import tmp.blast_bulk_load.3745 BlastDbase" | sqlite3 /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite
memory


BlastDbase loading complete..

* [Tue Jul  2 18:04:47 2024] Running CMD: hmmsearch --cpu 32 --noali --domtblout TrinotatePFAM.out /data/NEWTRINITYDATADIR/Pfam-A.hmm /data/transdecoder_data/trinity_out_dir_sortednew.Trinity.fasta.transdecoder.pep > pfam.log
* [Tue Jul  2 21:42:28 2024] Running CMD: /usr/local/src/Trinotate/Trinotate --db /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --LOAD_pfam  TrinotatePFAM.out
-LOADING as per --LOAD_pfam
CMD: /usr/local/src/Trinotate/util/trinotateSeqLoader/Trinotate_PFAM_loader.pl --sqlite /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --pfam TrinotatePFAM.out
CMD: echo "pragma journal_mode=memory;
pragma synchronous=0;
pragma cache_size=4000000;
.mode tabs
.import tmp.pfam_bulk_load.707083 HMMERDbase" | sqlite3 /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite
memory


Loading complete..

* [Tue Jul  2 21:42:42 2024] Running CMD: cmscan -Z 323 --cut_ga --rfam --nohmmonly --tblout infernal.out --fmt 2 --cpu 32 --clanin /data/NEWTRINITYDATADIR/Rfam.clanin /data/NEWTRINITYDATADIR/Rfam.cm /data/trinity_out_dir_sortednew.Trinity.fasta > infernal.log
* [Wed Jul  3 17:00:36 2024] Running CMD: /usr/local/src/Trinotate/Trinotate --db /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --LOAD_infernal infernal.out
-LOADING as per --LOAD_infernal
CMD: /usr/local/src/Trinotate/util/trinotateSeqLoader/Trinotate_Infernal_loader.pl --sqlite /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --infernal infernal.out
CMD: echo "pragma journal_mode=memory;
pragma synchronous=0;
pragma cache_size=4000000;
.mode tabs
.import tmp.infernal_bulk_load.3198127 Infernal" | sqlite3 /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite
memory


Loading complete..

Command used to generate the report (no errors appeared)

root@4c6df2aa90d4:/data# $TRINOTATE_HOME/Trinotate --db /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite --report > myTrinotateattempt3.tsv
-REPORT being generated.
CMD: /usr/local/src/Trinotate/util/Trinotate_report_writer.pl --sqlite /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite -E 1e-5 --pfam_cutoff DNC

I opened some of the tables in the SQLite file to check that the annotations were loaded

root@4c6df2aa90d4:/data# sqlite3 /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> .tables
BlastDbase           ORF                  UniprotIndex
Diff_expression      PFAMreference        eggNOGIndex
EggnogMapper         RNAMMERdata          go
ExprClusterAnalyses  Replicates           go_slim
ExprClusters         Samples              go_slim_mapping
Expression           SignalP              pfam2go
HMMERDbase           TaxonomyIndex        tmhmm
Infernal             Transcript
sqlite> SELECT * FROM EggnogMapper LIMIT 10;
TRINITY_DN0_c0_g1_i1.p2|760192.Halhy_5172|2.41e-32|132.0|COG3210@1|root,COG3210@2|Bacteria,4PMJA@976|Bacteroidetes|976|Bacteroidetes|U|domain, Protein|-|-|-|-|-|-|-|-|-|-|-|-|CHU_C
TRINITY_DN0_c0_g1_i10.p2|335543.Sfum_2553|5.35e-07|57.0|COG4966@1|root,COG4966@2|Bacteria,1Q8WE@1224|Proteobacteria,439J3@68525|delta/epsilon subdivisions,2X4VK@28221|Deltaproteobacteria,2MSHQ@213462|Syntrophobacterales|28221|Deltaproteobacteria|NU|Pfam:N_methyl_2|-|-|-|-|-|-|-|-|-|-|-|-|N_methyl
TRINITY_DN0_c0_g1_i10.p1|243090.RB11769|4.25e-32|134.0|COG2911@1|root,COG2911@2|Bacteria|2|Bacteria|S|protein secretion|-|-|-|ko:K20276|ko02024,map02024|-|-|-|ko00000,ko00001|-|-|-|CHU_C,Calx-beta,F5_F8_type_C,He_PIG,SLH
TRINITY_DN0_c0_g1_i10.p6|1232437.KL662002_gene4685|7.65e-09|58.5|COG4967@1|root,COG4967@2|Bacteria,1NI49@1224|Proteobacteria,42X6W@68525|delta/epsilon subdivisions,2WSY4@28221|Deltaproteobacteria,2MMCT@213118|Desulfobacterales|28221|Deltaproteobacteria|NU|Prokaryotic N-terminal methylation motif|-|-|-|ko:K02458,ko:K02671|ko03070,ko05111,map03070,map05111|M00331|-|-|ko00000,ko00001,ko00002,ko02035,ko02044|3.A.15|-|-|N_methyl
TRINITY_DN0_c0_g1_i10.p4|1232437.KL662002_gene4687|1.31e-12|70.5|COG4970@1|root,COG4970@2|Bacteria,1N85U@1224|Proteobacteria,42WBU@68525|delta/epsilon subdivisions,2WSG1@28221|Deltaproteobacteria,2MMBS@213118|Desulfobacterales|28221|Deltaproteobacteria|NU|Prokaryotic N-terminal methylation motif|-|-|-|ko:K08084|-|-|-|-|ko00000,ko02044|3.A.15.2|-|-|GspH,N_methyl
TRINITY_DN0_c0_g1_i11.p6|234267.Acid_5017|4.69e-77|237.0|COG3383@1|root,COG3383@2|Bacteria,3Y2T6@57723|Acidobacteria|2|Bacteria|C|PFAM NADH ubiquinone oxidoreductase, subunit G, iron-sulphur binding|hoxU|-|1.17.1.10,1.6.5.3|ko:K05299,ko:K05588|ko00190,ko00680,ko00720,ko01100,ko01120,ko01200,map00190,map00680,map00720,map01100,map01120,map01200|M00377|R00134,R11945|RC00061,RC02796|ko00000,ko00001,ko00002,ko01000|-|-|iJN678.hoxU|Fer2_4,Fer4,Fer4_10,Fer4_6,Fer4_7,Fer4_9,Molybdop_Fe4S4,NADH-G_4Fe-4S_3
TRINITY_DN0_c0_g1_i11.p4|926569.ANT_12520|2.77e-185|531.0|COG1894@1|root,COG1894@2|Bacteria,2G5K9@200795|Chloroflexi|200795|Chloroflexi|C|PFAM Respiratory-chain NADH dehydrogenase domain, 51 kDa subunit|hoxF|-|1.6.5.3|ko:K00335,ko:K05587|ko00190,ko01100,map00190,map01100|M00144|R11945|RC00061|ko00000,ko00001,ko00002,ko01000|3.D.1|-|-|2Fe-2S_thioredx,Complex1_51K,NADH_4Fe-4S,SLBB
TRINITY_DN0_c0_g1_i11.p3|926569.ANT_12550|8.23e-203|582.0|COG3259@1|root,COG3259@2|Bacteria,2G68W@200795|Chloroflexi|200795|Chloroflexi|C|PFAM nickel-dependent hydrogenase, large subunit|-|-|1.12.1.2,1.8.98.5|ko:K00436,ko:K14126|ko00680,map00680|-|R00019,R00700,R11943|RC00011|ko00000,ko00001,ko01000|-|-|-|NiFeSe_Hases
TRINITY_DN0_c0_g1_i11.p1|439235.Dalk_4404|1.61e-193|644.0|COG3419@1|root,COG3506@1|root,COG3419@2|Bacteria,COG3506@2|Bacteria,1NUAV@1224|Proteobacteria,42NJ7@68525|delta/epsilon subdivisions,2WKJV@28221|Deltaproteobacteria,2MHMF@213118|Desulfobacterales|28221|Deltaproteobacteria|NU|Tfp pilus assembly protein tip-associated adhesin|pilY1|-|-|ko:K02674|-|-|-|-|ko00000,ko02035,ko02044|-|-|-|Neisseria_PilC,PA14
TRINITY_DN0_c0_g1_i2.p4|1232437.KL662002_gene4687|6.28e-15|76.6|COG4970@1|root,COG4970@2|Bacteria,1N85U@1224|Proteobacteria,42WBU@68525|delta/epsilon subdivisions,2WSG1@28221|Deltaproteobacteria,2MMBS@213118|Desulfobacterales|28221|Deltaproteobacteria|NU|Prokaryotic N-terminal methylation motif|-|-|-|ko:K08084|-|-|-|-|ko00000,ko02044|3.A.15.2|-|-|GspH,N_methyl
sqlite> SELECT * FROM HMMERDbase LIMIT 10;
TRINITY_DN239_c0_g1_i8.p3|PF10417.14|1-cysPrx_C|NULL|165.0|204.0|1.0|41.0|2.1e-16|4.8e-16|63.5|62.3
TRINITY_DN196518_c0_g1_i1.p1|PF10417.14|1-cysPrx_C|NULL|173.0|211.0|1.0|40.0|1.0e-15|1.7e-15|61.2|60.5
TRINITY_DN12898_c0_g1_i6.p1|PF10417.14|1-cysPrx_C|NULL|55.0|91.0|1.0|38.0|1.1e-15|2.2e-15|61.1|60.2
TRINITY_DN75604_c0_g1_i1.p1|PF10417.14|1-cysPrx_C|NULL|170.0|209.0|1.0|41.0|1.7e-15|3.4e-15|60.5|59.6
TRINITY_DN12898_c0_g1_i2.p1|PF10417.14|1-cysPrx_C|NULL|79.0|115.0|1.0|38.0|1.8e-15|3.2e-15|60.5|59.7
TRINITY_DN12898_c0_g1_i3.p1|PF10417.14|1-cysPrx_C|NULL|217.0|253.0|1.0|38.0|5.7e-15|1.0e-14|58.9|58.1
TRINITY_DN3190_c0_g1_i3.p2|PF10417.14|1-cysPrx_C|NULL|169.0|201.0|1.0|34.0|2.7e-14|5.3e-14|56.7|55.8
TRINITY_DN371771_c0_g1_i1.p1|PF10417.14|1-cysPrx_C|NULL|73.0|104.0|1.0|33.0|5.3e-14|8.1e-14|55.8|55.2
TRINITY_DN3190_c0_g1_i1.p3|PF10417.14|1-cysPrx_C|NULL|169.0|201.0|1.0|34.0|1.0e-13|2.0e-13|54.9|53.9
TRINITY_DN3190_c0_g1_i6.p3|PF10417.14|1-cysPrx_C|NULL|169.0|201.0|1.0|34.0|1.0e-13|2.0e-13|54.9|53.9
sqlite> SELECT * FROM PFAMreference LIMIT 10;
PF10417.14|1-cysPrx_C|C-terminal domain of 1-Cys peroxiredoxin|21.1|21.1|21.1|21.1|21.0|21.0
PF21734.2|10_blade|10-bladed beta propeller domain|27.0|27.0|28.0|28.3|25.4|22.6
PF21578.2|117-like_vir|Virus, 117-like|27.0|27.0|28.6|27.1|26.3|26.0
PF12574.13|120_Rick_ant|120 KDa Rickettsia surface antigen|25.0|25.0|39.6|64.6|23.6|21.2
PF09847.14|12TM_1|Membrane protein of 12 TMs|33.2|33.2|33.3|33.3|31.7|33.0
PF00244.25|14-3-3|14-3-3 protein|33.2|33.2|33.2|33.2|33.1|33.1
PF16998.10|17kDa_Anti_2|17 kDa outer membrane surface antigen|26.6|26.6|26.6|26.6|26.5|26.5
PF00389.35|2-Hacid_dh|D-isomer specific 2-hydroxyacid dehydrogenase, catalytic domain|24.6|24.6|24.6|24.6|24.5|24.5
PF02826.24|2-Hacid_dh_C|D-isomer specific 2-hydroxyacid dehydrogenase, NAD binding domain|25.1|25.1|25.1|25.1|25.0|25.0
PF00198.28|2-oxoacid_dh|2-oxoacid dehydrogenases acyltransferase (catalytic domain)|23.0|23.0|23.0|23.0|22.9|22.9
sqlite> .exit
trinotate metatranscriptome docker • 413 views
4 months ago

Jway,

Try re-running the init command first, before exporting the results:

Trinotate --db <sqlite.db> --init --gene_trans_map <file> --transcript_fasta <file> --transdecoder_pep <file>

If the checkpoint file __trinotate_run_checkpts/*__init.ok was already written, you may need to remove it before re-running the init command.

Then generate the report:

Trinotate --db <sqlite.db> --report --incl_pep > Trinotate.xls
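One way to check whether a missing init step is the likely cause is to count rows per table. The sketch below assumes that the Transcript and ORF tables (both listed in the .tables output in the question) are the ones filled by --init, and that an empty Transcript table would explain a header-only report; that is an assumption here, not something confirmed in this thread. The function name count_trinotate_rows is made up for illustration, and the sqlite3 command-line tool is assumed to be installed.

```shell
# Print "table<TAB>row count" for a handful of Trinotate tables.
# Table names are taken from the .tables listing in the question;
# which of them --init fills (Transcript, ORF) is an assumption.
count_trinotate_rows() {
    db="$1"
    for table in Transcript ORF BlastDbase HMMERDbase Infernal EggnogMapper; do
        n=$(sqlite3 "$db" "SELECT COUNT(*) FROM $table;")
        printf '%s\t%s\n' "$table" "$n"
    done
}

# Usage (path from the question):
# count_trinotate_rows /data/NEWTRINITYDATADIR/TrinotateBoilerplate.sqlite
```

If Transcript and ORF report 0 while BlastDbase and HMMERDbase do not, the search results were loaded but the transcript and protein records were not, which would be consistent with the advice to re-run --init.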

Since gene_trans_map is a mandatory input, you can create one as follows, for example:

grep "^>" assembly.fasta | awk '{sub(/^>/,"",$1); sub(/_i[0-9]+$/,"",$1); print $1}' > genes

grep "^>" assembly.fasta | awk '{sub(/^>/,"",$1); print $1}' > trans

paste -d"\t" genes trans > gen_tr_map

Each line of the gen_tr_map file will contain the gene string followed by the transcript string, separated by a tab.

Example:

transcript string: NODE_1_length_21868_cov_160.403166_g0_i0

gene string: NODE_1_length_21868_cov_160.403166_g0
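The same map can also be built in a single awk pass, without the intermediate genes and trans files. This is only a sketch: it assumes that the transcript ID is the first whitespace-delimited token of each FASTA header and that isoform IDs end in _i followed by one or more digits (as in the TRINITY_DN0_c0_g1_i10 IDs shown in the question). The function name make_gene_trans_map is hypothetical.

```shell
# Build a two-column gene<TAB>transcript map from FASTA headers.
# Assumes the ID is the first token of the header and ends in _i<number>.
make_gene_trans_map() {
    awk '/^>/ {
        trans = substr($1, 2)          # drop the leading > character
        gene = trans
        sub(/_i[0-9]+$/, "", gene)     # strip the isoform suffix, including multi-digit ones
        print gene "\t" trans
    }' "$1"
}

# Usage:
# make_gene_trans_map assembly.fasta > gen_tr_map
```

Using `[0-9]+` rather than a single `[0-9]` matters for assemblies with ten or more isoforms per gene, and taking only the first token keeps any trailing header text (such as a length field) out of the map.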

A similar issue was posted here:

https://github.com/Trinotate/Trinotate.github.io/issues/70

Regards
