Question

Per base sequence content failed miserably

14 days ago

Kai Xin • 0

Hi all,

I am working on 16S rRNA sequencing data from anaerobes enriched from a wastewater treatment plant.

I was trying to do some genome binning, but it was unsuccessful, so I went back and ran Trimmomatic and QC. Although the sequence quality improved, my per base sequence content is always a fail, and the per tile sequence quality is worse than in the untrimmed sequence file.

The command I ran was:

trimmomatic PE -threads 4 -phred33 \
  Raw160823_1.fastq.gz Raw160823_2.fastq.gz \
  Raw160823_1_trimmed.fastq.gz Raw160823_1_trimmed_failed.fastq.gz \
  Raw160823_2_trimmed.fastq.gz Raw160823_2_trimmed_failed.fastq.gz \
  HEADCROP:10 TRAILING:10 SLIDINGWINDOW:4:30 MINLEN:80

I have tried changing the value of the parameters but the results were about the same. What do you all think? Do I have to make sure all the checks are passed before I proceed to assembly, or do I just care about the per base sequence quality?

Thank you, Kai

[Figures: FastQC plots for raw sequence (read 2), read 2 trimmed, per tile sequence quality, and per base sequence content after trimming]

fastqc sequence trimmomatic NGS assembly • 618 views

ADD COMMENT • link 13 days ago by Kai Xin • 0

What kind of data is this? Plain genomic/amplicons? How you proceed will largely depend on the context. Can you also post the original fastqc plots for the data (pre-any manipulations).

ADD REPLY • link 14 days ago by GenoMax 141k

Hi, I have added the FastQC plots (untrimmed and trimmed). The sample is from anaerobes enriched from a wastewater treatment plant, and this file is 16S rRNA sequencing data.

ADD REPLY • link 13 days ago by Kai Xin • 0

If this is 16S sequencing, just follow a standard workflow such as QIIME 2: https://docs.qiime2.org/2024.2/tutorials/overview/
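As background on the per base sequence content result: amplicon reads such as 16S tend to share primer and conserved-region sequence at the same positions, so a skewed base composition is expected and does not by itself indicate a problem. The sketch below is only an illustration of that idea in Python, not FastQC's actual implementation. The function names, the toy reads, and the primer-like prefix are hypothetical, and the 20% cutoff reflects my understanding of the failure threshold described in the FastQC documentation (difference between A and T, or between G and C, at any position).

```python
from collections import Counter


def per_base_content(reads):
    """Return one {base: fraction} dict per read position (A, C, G, T only)."""
    counts = []
    for seq in reads:
        for i, base in enumerate(seq.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c[b] for b in "ACGT")
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return fractions


def flag_positions(fractions, threshold=0.20):
    """Return 0-based positions where |A-T| or |G-C| exceeds the threshold.

    The 0.20 default is an assumption based on the FastQC documentation.
    """
    return [
        i
        for i, f in enumerate(fractions)
        if abs(f["A"] - f["T"]) > threshold or abs(f["G"] - f["C"]) > threshold
    ]


# Hypothetical amplicon-like reads: an identical 8-base prefix,
# followed by two positions where the bases are evenly mixed.
reads = ["GTGCCAGCAT", "GTGCCAGCTA", "GTGCCAGCGC", "GTGCCAGCCG"]
fractions = per_base_content(reads)
flagged = flag_positions(fractions)
print(flagged)
```

In this toy input every position inside the shared prefix is flagged, while the two mixed positions at the end are not. Real 16S reads behave similarly over primer and conserved regions, which is why this particular check is usually not a reason to hold back amplicon data from a standard workflow.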