I have paired-end FASTQ files, and when I ran Trim Galore it failed with an error saying that the lengths of bases and qualities do not match. I searched for a solution and found a recommendation to use BBTools reformat.sh to discard reads whose base and quality lengths differ:
reformat.sh in=pair_1.fq.gz in2=pair_2.fq.gz out=fixed_1.fq.gz out2=fixed_2.fq.gz tossbrokenreads=t
The error:
Set INTERLEAVED to false
Input is being processed as paired
pigz: abort: read error on pair_1.fq.gz (Input/output error)
java.lang.AssertionError:
There appear to be different numbers of reads in the paired input files.
The pairing may have been corrupted by an upstream process. It may be fixable by running repair.sh.
at stream.ConcurrentGenericReadInputStream.pair(ConcurrentGenericReadInputStream.java:497)
at stream.ConcurrentGenericReadInputStream.readLists(ConcurrentGenericReadInputStream.java:362)
at stream.ConcurrentGenericReadInputStream.run0(ConcurrentGenericReadInputStream.java:206)
at stream.ConcurrentGenericReadInputStream.run(ConcurrentGenericReadInputStream.java:182)
at java.lang.Thread.run(Thread.java:745)
So I tried to repair the files with BBTools repair.sh:
repair.sh in1=pair_1.fq.gz in2=pair_2.fq.gz out1=fixed_1.fq.gz out2=fixed_2.fq.gz outs=singletons.fq repair
Set INTERLEAVED to false
Started output stream.
pigz: abort: read error on pair_1.fq.gz (Input/output error)
java.lang.Exception:
Mismatch between length of bases and qualities for read 107893745 (id=ST-E00126:1085:HF3YVCCX2:1:2106:16620:58339 1:N:0:TAAGCTCC+AGATCTCG).
# qualities=24, # bases=150
AAFFFFJJJJJJJJJJJJJJJJJJ
GTGTAGGACATCCATTTTATCAAGTTTCTGCTACAAGAAATGAAAAAATGAGACACTTGATTACTACAGGCAGACCAACCAAAGTCTTTGTTCCACCTTTTAAAACTAAATCGCATTTTCACAGAGTTGAACAGTGTGTTAGGAATATTA
This can be bypassed with the flag 'tossbrokenreads' or 'nullifybrokenquality'
at shared.KillSwitch.kill(KillSwitch.java:96)
at stream.Read.validateQualityLength(Read.java:214)
at stream.Read.validate(Read.java:104)
at stream.Read.<init>(Read.java:76)
at stream.Read.<init>(Read.java:50)
at stream.FASTQ.quadToRead_slow(FASTQ.java:809)
at stream.FASTQ.toReadList(FASTQ.java:646)
at stream.FastqReadInputStream.fillBuffer(FastqReadInputStream.java:107)
at stream.FastqReadInputStream.nextList(FastqReadInputStream.java:93)
at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:680)
at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:656)
But this error points back to tossbrokenreads, so I am going around in circles.
Has anyone experienced this? Please advise. Thanks.
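To see how many records are affected before choosing between 'tossbrokenreads' and 'nullifybrokenquality', a small scan of the FASTQ records can help. The following is only a minimal sketch, not part of BBTools: the function names `find_length_mismatches` and `scan_fastq_gz` are made up for illustration, and it assumes strict four-line FASTQ records with no wrapped sequence lines. If the gzip stream itself is damaged, `gzip.open` will raise an exception partway through instead of finishing the scan.

```python
import gzip
from itertools import islice


def find_length_mismatches(lines, limit=10):
    """Scan 4-line FASTQ records and report base/quality length mismatches.

    Returns (records_seen, mismatches), where mismatches is a list of
    (zero_based_record_index, read_id, n_bases, n_quals), capped at `limit`.
    A trailing incomplete record (fewer than 4 lines) is ignored.
    """
    it = iter(lines)
    records = 0
    mismatches = []
    while True:
        rec = list(islice(it, 4))
        if len(rec) < 4:
            break
        header, seq, _plus, qual = (s.rstrip("\r\n") for s in rec)
        if len(seq) != len(qual) and len(mismatches) < limit:
            # header[1:] drops the leading '@' of the FASTQ header line
            mismatches.append((records, header[1:], len(seq), len(qual)))
        records += 1
    return records, mismatches


def scan_fastq_gz(path, limit=10):
    """Hypothetical convenience wrapper: scan a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return find_length_mismatches(handle, limit)
```

For example, `scan_fastq_gz("pair_1.fq.gz")` would return the number of complete records read and the first few mismatching reads, which shows whether the problem is a single broken read or something more widespread.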
I second this.
repair.sh should be able to take care of the problem. Can you try? In "repaired" broken reads, Q scores will be replaced with ?.
Thanks for the reply.
I think it is a download error, but it is unlikely that I can re-download the data. I can repair the read 2 FASTQ file successfully when it is treated as single-end, so I think the error comes from the read 1 FASTQ file.
Set INTERLEAVED to false
Started output stream.
pigz: abort: read error on pair_1.fq.gz (Input/output error)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3520)
at shared.KillSwitch.copyOfRange(KillSwitch.java:377)
at fileIO.ByteFile1.nextLine(ByteFile1.java:174)
at fileIO.ByteFile2$BF1Thread.run(ByteFile2.java:274)
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at java.lang.String.substring(String.java:1969)
at java.lang.String.subSequence(String.java:2003)
at java.util.regex.Pattern.split(Pattern.java:1216)
at java.lang.String.split(String.java:2380)
at java.lang.String.split(String.java:2422)
at jgi.SplitPairsAndSingles.repair(SplitPairsAndSingles.java:693)
at jgi.SplitPairsAndSingles.process3_repair(SplitPairsAndSingles.java:518)
at jgi.SplitPairsAndSingles.process2(SplitPairsAndSingles.java:284)
at jgi.SplitPairsAndSingles.process(SplitPairsAndSingles.java:220)
at jgi.SplitPairsAndSingles.main(SplitPairsAndSingles.java:37)
This program ran out of memory. Try increasing the -Xmx flag and using tool-specific memory-related parameters.
I already tried -Xmx32G but still got the error.
Is there anything I can do, and which step should I try first? Thanks.
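Since every run above prints `pigz: abort: read error on pair_1.fq.gz (Input/output error)` before the Java error, a reasonable first step may be to check whether the compressed file can be read to the end at all, independently of BBTools. The sketch below is one possible way to do that with Python's standard `gzip` module; the function name `check_gzip` is hypothetical, and a failure here would suggest (but not prove) a truncated download or a storage problem rather than an issue with the FASTQ content itself.

```python
import gzip
import zlib


def check_gzip(path, chunk_size=1 << 20):
    """Decompress `path` to the end and report whether it is fully readable.

    Returns (ok, bytes_decompressed, error_message). The data is read in
    chunks and discarded, so memory use stays small even for large files.
    """
    total = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (EOFError, OSError, zlib.error) as exc:
        # EOFError: stream ended early; OSError: bad gzip data or a
        # low-level read failure; zlib.error: corrupt compressed data.
        return False, total, f"{type(exc).__name__}: {exc}"
    return True, total, ""
```

Calling `check_gzip("pair_1.fq.gz")` and `check_gzip("pair_2.fq.gz")` and comparing the results would show whether only the read 1 file is damaged, and roughly how much of it can still be decompressed.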