Question

invalid deflate data (invalid code lengths set)


19 months ago

leonmcswain ▴ 10

I am trying to trim paired-end reads with Trim Galore. I have confirmed that the R1 and R2 files match, based on the "Total reads processed" value in the text reports that Trim Galore writes. One sample trimmed correctly, but for some of the others the "Total written" and "Quality-trimmed" values do not match between the two reads. I get the following error when Trim Galore tries to validate the files:

pigz: skipping: /Volumes/Backup_Plus/RNAseq/Trimmed/Sample02.R2_trimmed.fq.gz: corrupted -- invalid deflate data (invalid code lengths set)
Read 2 output is truncated at sequence count: 19058781, please check your paired-end input files! Terminating...
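
The error above says the compressed R2 output itself is unreadable. One way to check a `.fq.gz` file independently of Trim Galore is to decompress it end to end and count records. The following is a minimal sketch using only the Python standard library; the function name `count_fastq_records` is hypothetical, and it assumes plain four-line FASTQ records.

```python
import gzip
import sys
import zlib


def count_fastq_records(path):
    """Decompress the whole file; return (ok, records_read, message).

    Assumes four lines per FASTQ record. A corrupt or truncated gzip
    stream surfaces as OSError (including gzip.BadGzipFile), EOFError,
    or zlib.error while reading.
    """
    lines = 0
    try:
        with gzip.open(path, "rb") as handle:
            for _ in handle:
                lines += 1
    except (OSError, EOFError, zlib.error) as exc:
        # Report how many complete records were read before the failure.
        return False, lines // 4, f"{type(exc).__name__}: {exc}"
    return True, lines // 4, "ok"


if __name__ == "__main__":
    # Usage: python check_fastq_gz.py file1.fq.gz file2.fq.gz ...
    for path in sys.argv[1:]:
        ok, records, message = count_fastq_records(path)
        status = "OK" if ok else "CORRUPT"
        print(f"{path}\t{status}\t{records}\t{message}")
```

Running this on both the input files and the trimmed outputs would show whether the corruption is already present in the inputs or only appears in the files written during trimming.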

Here is the txt report for read 1 and 2:

Read 1

SUMMARISING RUN PARAMETERS
==========================
Input filename: Sample02.R1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.10
Cutadapt version: 4.4
Python version: 3.11.4
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 25126). Second best hit was smallRNA (count: 9)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Output file will be GZIP compressed


This is cutadapt 4.4 with Python 3.11.4
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Sample02.R1.fastq.gz
Processing single-end reads on 8 cores ...
Finished in 397.910 s (4.251 µs/read; 14.11 M reads/minute).

=== Summary ===

Total reads processed:              93,605,182
Reads with adapters:                32,888,101 (35.1%)
Reads written (passing filters):    93,605,182 (100.0%)

Total basepairs processed: 9,454,123,382 bp
Quality-trimmed:               6,074,955 bp (0.1%)
Total written (filtered):  9,344,167,021 bp (98.8%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 32888101 times

Minimum overlap: 1
No. of allowed errors:
1-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 26.5%
  C: 34.0%
  G: 22.3%
  T: 17.1%
  none/other: 0.0%

Read 2

SUMMARISING RUN PARAMETERS
==========================
Input filename: Sample02.R2.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.10
Cutadapt version: 4.4
Python version: 3.11.4
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 25126). Second best hit was smallRNA (count: 9)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Output file will be GZIP compressed


This is cutadapt 4.4 with Python 3.11.4
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC Sample02.R2.fastq.gz
Processing single-end reads on 8 cores ...
Finished in 410.082 s (4.381 µs/read; 13.70 M reads/minute).

=== Summary ===

Total reads processed:              93,605,182
Reads with adapters:                34,396,551 (36.7%)
Reads written (passing filters):    93,605,182 (100.0%)

Total basepairs processed: 9,454,123,382 bp
Quality-trimmed:              16,184,037 bp (0.2%)
Total written (filtered):  9,331,568,873 bp (98.7%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 34396551 times

Minimum overlap: 1
No. of allowed errors:
1-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.5%
  C: 30.6%
  G: 25.7%
  T: 14.2%
  none/other: 0.0%

Is this an issue with the quality of the sequencing? Can I override this or is it something else?

Thank you,
Leon

Paired-End Trimming Fastq TrimGalore • 1.3k views

updated 19 months ago by Ram 45k • written 19 months ago by leonmcswain ▴ 10


I just checked the files that trimmed successfully: their "Quality-trimmed" and "Total written" values don't match between reads either, yet they still passed validation. So I'm not sure the mismatch after trimming has anything to do with the error messages.
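
In the reports above, each file goes through its own cutadapt run ("Processing single-end reads"), so the base-pair totals can differ between R1 and R2 while the read counts still agree. A small sketch that pulls both numbers out of the report text; the function name `summary_counts` is hypothetical, and the two strings are excerpts copied from the reports in the question.

```python
import re


def summary_counts(report_text):
    """Extract read and base-pair totals from a cutadapt summary block."""
    reads = re.search(r"Total reads processed:\s+([\d,]+)", report_text)
    written = re.search(r"Total written \(filtered\):\s+([\d,]+) bp", report_text)
    if reads is None or written is None:
        raise ValueError("summary lines not found in report text")
    return {
        "reads": int(reads.group(1).replace(",", "")),
        "bp_written": int(written.group(1).replace(",", "")),
    }


# Excerpts from the Read 1 and Read 2 reports posted above.
r1_report = """Total reads processed:              93,605,182
Total written (filtered):  9,344,167,021 bp (98.8%)"""
r2_report = """Total reads processed:              93,605,182
Total written (filtered):  9,331,568,873 bp (98.7%)"""

r1 = summary_counts(r1_report)
r2 = summary_counts(r2_report)
print("read counts match:", r1["reads"] == r2["reads"])
print("bp written match: ", r1["bp_written"] == r2["bp_written"])
```

For this sample the read counts are identical and only the base-pair totals differ, which is consistent with the mismatch being unrelated to the validation error.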

19 months ago by leonmcswain ▴ 10

score 0 · Answer 1 · 2024-01-20


19 months ago

leonmcswain ▴ 10

Final update: when I moved the files from the external hard drive to my desktop and processed them from there, the error was resolved. I'm not sure why accessing the files from the external drive would cause this issue, and I would appreciate any feedback.
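
This does not explain why the external drive caused the problem, but a checksum comparison is one way to confirm that a copy on the desktop is byte-identical to the file on the external drive. A minimal sketch, assuming the standard-library `hashlib`; the helper names `sha256_of` and `same_content` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical bytes (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `same_content` on the original and the copied FASTQ file would flag a silent read or write error during the transfer.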
