Hello,
I am trying to run SingleM on data obtained with sratoolkit (2.10.7).
I downloaded the run with:
prefetch SRR2103020
and split it into FASTQ files with:
fastq-dump --outdir ./fastq --split-e ./SRR2103020/SRR2103020.sra
When I check the file with head SRR2103020_1.fastq, I get:
@SRR2103020.1 D7RS0RN1:177:C12Y0ACXX:3:1101:1404:2079 length=93
AATGTGGACAGCGCCGTCTTCAAACAGGCGCTGTCCAGCTAGCAGCTCAACGCTCCGCGCCGCCGTCTTCGCCGTCTTCAGGCAGGGGGAGAA
+SRR2103020.1 D7RS0RN1:177:C12Y0ACXX:3:1101:1404:2079 length=93
@BCFFFDDHHHHHJJJGIJIBHHIJJCH?HBG<GIG@HEGIGFIFG@>CGHHHFFFDD@BB::BD@B??CDBDD@BD@DA@CCDDBD######
@SRR2103020.2 D7RS0RN1:177:C12Y0ACXX:3:1101:1440:2113 length=93
GGTATAAGTTCTATGTGTAATGAACCACAGAGTTATCAAAAAACTCAAGATCTGTCTCTTATACACATCTGACGCTGCCGACGAGCGATCTAG
+SRR2103020.2 D7RS0RN1:177:C12Y0ACXX:3:1101:1440:2113 length=93
@C?DDFFFHHHHHIIGIIIIJ<IHHIJJJIHIFFEIIIIJJIJIJJIIJIGHIJJJJIJJJGJJIJDHIJJJJJHHHFFFDDBD#########
@SRR2103020.3 D7RS0RN1:177:C12Y0ACXX:3:1101:1386:2229 length=93
GAATGAATCAAGGATGCTAAGTCTCCATCTACAAAATTATTTGTTTGAACAGATAAGTTTAACCGACTTTAAAGTCTATTCAGTTATCTACAC
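For reference, here is a minimal sketch of how the four-line record structure of such a file could be checked independently of any repair tool. The function name check_fastq is hypothetical and is not part of sratoolkit, fastqwiper or SingleM. It assumes unwrapped FASTQ (exactly four lines per record) and reads the file by position, so a quality line that happens to begin with '@' is not mistaken for a header.

```python
import io


def check_fastq(handle):
    """Read a FASTQ stream four lines at a time; return (record_count, problems)."""
    problems = []
    count = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        count += 1
        if not header.startswith("@"):
            problems.append((count, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((count, "separator does not start with '+'"))
        if not seq:
            problems.append((count, "empty sequence"))
        if len(seq) != len(qual):
            problems.append((count, "sequence and quality lengths differ"))
    return count, problems


if __name__ == "__main__":
    # Toy record (not real data): the quality string starts with '@' on purpose.
    toy = io.StringIO("@read1\nACGT\n+read1\n@CFF\n")
    print(check_fastq(toy))  # prints (1, [])
```

To run it on a real file, open the file in text mode and pass the handle, for example check_fastq(open("SRR2103020_1.fastq")). A file cut off by head will report problems for its last, incomplete record.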
I then ran a FASTQ repair tool (fastqwiper):
fastqwiper --fastq_in SRR2103020_1.fastq --fastq_out SRR_test_wiped_1.fastq
and the file appears to be repaired:
@SRR2103020.20 D7RS0RN1:177:C12Y0ACXX:3:1101:3313:2100 length=93
GCTCAATTCCCACACTTGAACACTTTCAATACATTCATCCCAAATAGGTTTTGCTATCGGATTATTACCTGAAGCCAATTCTTTCAAACCTTC
+
CCCFFFFFGHHHHJJJJIIJIIIJJJGIJFIJEIJJJJJJJIJJGIIIBFGHHJEIIJJJGIJIIJIJJJJIHGHHEFBEFFFEDDECCDC<>
@SRR2103020.47 D7RS0RN1:177:C12Y0ACXX:3:1101:5867:2138 length=93
CCTTAATTCAAACTCAGTTCTACGGACAACAACTTCATGCCTGAAATCCACAAAATGAGTTAAAACATCTTTCAGGGGCATAATCTTTGGAAC
+
CCCFFFFFHHHHHJJIJHHJIHJJJJJJJJIIJJJJJIIJIIJGIJIJIIFHIDGFHIJHIIJJJJJJJJJJHHHHEFFDCEEEDCDDCDDCA
@SRR2103020.52 D7RS0RN1:177:C12Y0ACXX:3:1101:6059:2078 length=93
ATGCTCCTCCAACCATTACATCTGTTGAATTTGCAACTTGTACAATTAAACTTCCAGTTTTCGTAATTGAATTGAAAATTTCAAAAGTTGCAC
However, when I run SingleM with:
singlem pipe --forward ./SRR_test_wiped_1.fastq --otu_table singlem/sampe01_F_otu --threads 20
the job fails with this error message:
/lib/python3.6/site-packages/singlem/data/S1.6.ribosomal_protein_L14b_L23e_rplN.gpkg.spkg/S1.6.ribosomal_protein_L14b_L23e_rplN/graftmAiWqW8_search.hmm -) | hmmsearch --domE 1e-05 --cpu 1 -o /dev/null --noali --domtblout /dev/shm/tmp43rja84p/graftm_protein_search/SRR_test_wiped_1_b/graftmIkMbHN_search_SRR_test_wiped_1_b.hmmout.txt /ibex/scratch/alamourt/conda_singlem_env/lib/python3.6/site-packages/singlem/data/S1.6.ribosomal_protein_L14b_L23e_rplN.gpkg.spkg/S1.6.ribosomal_protein_L14b_L23e_rplN/graftmIkMbHN_search.hmm - returned non-zero exit status 1.\nSTDERR was: b\'\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\\nError: Sequence file - is empty or misformatted\\n\\n\'STDOUT was: b\'\'\n'STDOUT was: b''
Can you please assist? Is there anything wrong with my workflow, or is there a better tool to repair FASTQ files? Please keep in mind that I ran the same SingleM package on another metagenome and it works properly: that job has not finished yet, but it has been running for 45 minutes, whereas with this dataset the job fails after one minute. So I am assuming it is an SRA-related issue. Thank you.