Paired-end FASTQ files: removing reads with mismatching lengths of bases and qualities
3.9 years ago
Peter Chung ▴ 210

I have a pair of paired-end FASTQ files. When I ran Trim Galore, it failed with an error saying that the lengths of bases and qualities do not match. I searched for a solution and found a recommendation to use BBTools reformat.sh to discard reads that have mismatching lengths of bases and qualities:

reformat.sh in=pair_1.fq.gz in2=pair_2.fq.gz out=fixed_1.fq.gz out2=fixed_2.fq.gz tossbrokenreads=t

This produced the following error:

Set INTERLEAVED to false
Input is being processed as paired
pigz: abort: read error on pair_1.fq.gz (Input/output error)
java.lang.AssertionError: 
There appear to be different numbers of reads in the paired input files.
The pairing may have been corrupted by an upstream process.  It may be fixable by running repair.sh.
    at stream.ConcurrentGenericReadInputStream.pair(ConcurrentGenericReadInputStream.java:497)
    at stream.ConcurrentGenericReadInputStream.readLists(ConcurrentGenericReadInputStream.java:362)
    at stream.ConcurrentGenericReadInputStream.run0(ConcurrentGenericReadInputStream.java:206)
    at stream.ConcurrentGenericReadInputStream.run(ConcurrentGenericReadInputStream.java:182)
    at java.lang.Thread.run(Thread.java:745)

So I tried to repair the files with BBTools repair.sh:

repair.sh in1=pair_1.fq.gz in2=pair_2.fq.gz out1=fixed_1.fq.gz out2=fixed_2.fq.gz outs=singletons.fq repair

Set INTERLEAVED to false
Started output stream.
pigz: abort: read error on pair_1.fq.gz (Input/output error)
java.lang.Exception: 
Mismatch between length of bases and qualities for read 107893745 (id=ST-E00126:1085:HF3YVCCX2:1:2106:16620:58339 1:N:0:TAAGCTCC+AGATCTCG).
# qualities=24, # bases=150

AAFFFFJJJJJJJJJJJJJJJJJJ
GTGTAGGACATCCATTTTATCAAGTTTCTGCTACAAGAAATGAAAAAATGAGACACTTGATTACTACAGGCAGACCAACCAAAGTCTTTGTTCCACCTTTTAAAACTAAATCGCATTTTCACAGAGTTGAACAGTGTGTTAGGAATATTA

This can be bypassed with the flag 'tossbrokenreads' or 'nullifybrokenquality'
    at shared.KillSwitch.kill(KillSwitch.java:96)
    at stream.Read.validateQualityLength(Read.java:214)
    at stream.Read.validate(Read.java:104)
    at stream.Read.<init>(Read.java:76)
    at stream.Read.<init>(Read.java:50)
    at stream.FASTQ.quadToRead_slow(FASTQ.java:809)
    at stream.FASTQ.toReadList(FASTQ.java:646)
    at stream.FastqReadInputStream.fillBuffer(FastqReadInputStream.java:107)
    at stream.FastqReadInputStream.nextList(FastqReadInputStream.java:93)
    at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:680)
    at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:656)

But this error points me back to tossbrokenreads, so I seem to be going in circles.

Has anyone experienced this? Please advise. Thanks.

bbtools fastq reformat repair • 2.6k views
3.9 years ago

You have multiple different errors. Note how the first error says:

  • There appear to be different numbers of reads in the paired input files.

whereas the second error is:

  • Mismatch between length of bases and qualities for read 107893745

Your data seems to have multiple, overlapping problems.

In addition, when you ran repair.sh you did not toss the broken reads.
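
Both logs also contain the line "pigz: abort: read error on pair_1.fq.gz (Input/output error)", so it may be worth checking whether each compressed file can be read to the end before running any repair step. A minimal sketch, assuming gzip is installed; check_fastq_gz and count_records are hypothetical helper names, not part of BBTools:

```shell
# Hypothetical helper: report whether a gzipped file decompresses cleanly.
# gzip -t reads the whole stream and exits non-zero on truncation or corruption.
check_fastq_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK"
    else
        echo "CORRUPT"
    fi
}

# Hypothetical helper: count the complete 4-line records that can be decompressed.
# For a damaged file this only counts what is readable before the error.
count_records() {
    n_lines=$(gzip -dc "$1" 2>/dev/null | wc -l) || true
    echo $(( n_lines / 4 ))
}
```

Running check_fastq_gz on pair_1.fq.gz and pair_2.fq.gz, then comparing count_records for the two files, would show whether one file is cut short and by roughly how many reads.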


I second this. repair.sh should be able to take care of the problem. Can you try the following?

repair.sh in1=pair_1.fq.gz in2=pair_2.fq.gz out1=fixed_1.fq.gz out2=fixed_2.fq.gz outs=singletons.fq repair tossbrokenreads=t

In the "repaired" broken reads, the Q scores will be replaced with ?.

@NB511934:132:HPTKHPGX2:1:11101:1446:1079
CGAGCNCGTAAGGATTTTTCAGTG
+
?????!??????????????????
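
To confirm that an output file no longer contains broken records, the sequence and quality lengths can be compared directly. A minimal sketch, assuming strict 4-line FASTQ records (no wrapped sequence lines); count_length_mismatches is a hypothetical helper name:

```shell
# Hypothetical helper: count records whose sequence and quality lengths differ.
# Assumes line 2 of each record is the sequence and line 4 is the quality string.
# A trailing record that lacks its quality line is not counted.
count_length_mismatches() {
    gzip -dc "$1" 2>/dev/null | awk '
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 0 && length($0) != seqlen { bad++ }
        END { print bad + 0 }'
}
```

For example, count_length_mismatches fixed_1.fq.gz should print 0 if every remaining record is consistent.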

Thanks for the reply.

I think this is a download error, but it is unlikely that I can re-download the data. I can repair the read 2 FASTQ file successfully when it is treated as single-end, so I think the error comes from the read 1 FASTQ file.

Set INTERLEAVED to false
Started output stream.
pigz: abort: read error on pair_1.fq.gz (Input/output error)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3520)
    at shared.KillSwitch.copyOfRange(KillSwitch.java:377)
    at fileIO.ByteFile1.nextLine(ByteFile1.java:174)
    at fileIO.ByteFile2$BF1Thread.run(ByteFile2.java:274)
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.String.substring(String.java:1969)
    at java.lang.String.subSequence(String.java:2003)
    at java.util.regex.Pattern.split(Pattern.java:1216)
    at java.lang.String.split(String.java:2380)
    at java.lang.String.split(String.java:2422)
    at jgi.SplitPairsAndSingles.repair(SplitPairsAndSingles.java:693)
    at jgi.SplitPairsAndSingles.process3_repair(SplitPairsAndSingles.java:518)
    at jgi.SplitPairsAndSingles.process2(SplitPairsAndSingles.java:284)
    at jgi.SplitPairsAndSingles.process(SplitPairsAndSingles.java:220)
    at jgi.SplitPairsAndSingles.main(SplitPairsAndSingles.java:37)

This program ran out of memory. Try increasing the -Xmx flag and using tool-specific memory-related parameters.

I already tried -Xmx32G, but I still get the error.

Is there anything I can do, and which step should I do first? Thanks.
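
If the read 1 file really cannot be downloaded again, one possible first step is to salvage whatever still decompresses and keep only well-formed records. This is a sketch under stated assumptions, not a BBTools feature: salvage_fastq is a hypothetical helper, it assumes strict 4-line records, and it keeps a record only if the header starts with @, the separator starts with +, and the sequence and quality lengths agree.

```shell
# Hypothetical helper: write the readable, well-formed records of a gzipped FASTQ
# to stdout. Errors from the damaged part of the stream are ignored, so the output
# simply stops where decompression fails.
salvage_fastq() {
    gzip -dc "$1" 2>/dev/null | awk '
        { buf[NR % 4] = $0 }      # slots: 1 header, 2 sequence, 3 separator, 0 quality
        NR % 4 == 0 &&
        buf[1] ~ /^@/ && buf[3] ~ /^[+]/ &&
        length(buf[2]) == length(buf[0]) {
            print buf[1]; print buf[2]; print buf[3]; print buf[0]
        }'
}
```

Usage might look like: salvage_fastq pair_1.fq.gz | gzip -c > salvaged_1.fq.gz. The salvaged read 1 file will then have fewer reads than read 2, so the two files would still need to be re-paired afterwards (for example with repair.sh) before trimming.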
