Hi all,
I am working on 16S rRNA sequences of anaerobes enriched from a wastewater treatment plant.
I was trying to do some genome binning, but it was unsuccessful, so I went back and ran Trimmomatic and QC. Although the sequence quality improved, the per base sequence content check always fails, and the per tile sequence quality is worse than in the untrimmed file.
The command I ran was:
trimmomatic PE -threads 4 -phred33 \
  Raw160823_1.fastq.gz Raw160823_2.fastq.gz \
  Raw160823_1_trimmed.fastq.gz Raw160823_1_trimmed_failed.fastq.gz \
  Raw160823_2_trimmed.fastq.gz Raw160823_2_trimmed_failed.fastq.gz \
  HEADCROP:10 TRAILING:10 SLIDINGWINDOW:4:30 MINLEN:80
I have tried changing the parameter values, but the results were about the same. What do you all think? Do all the checks have to pass before I proceed to assembly, or should I only care about the per base sequence quality?
Thank you, Kai
What kind of data is this? Plain genomic or amplicons? How you proceed will largely depend on the context. Can you also post the original FastQC plots for the data (before any manipulation)?
Hi, I have added the FastQC plots (untrimmed and trimmed). The sample is from anaerobes enriched from a wastewater treatment plant, and this file is 16S rRNA sequencing.
If this is 16S sequencing, just follow one of the standard workflows, such as QIIME 2: https://docs.qiime2.org/2024.2/tutorials/overview/
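As a rough sketch of how the first step of that workflow might look for these paired-end files: the snippet below writes a QIIME 2 manifest and, if `qiime` is on the PATH and the reads are present, imports them and summarizes their quality. The sample ID `Raw160823` and the output names `manifest.tsv`, `demux.qza`, and `demux.qzv` are assumptions for illustration, and it assumes the untrimmed, Phred33-encoded reads as input. Check the exact steps against the tutorial.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample ID and output names; the read files are the ones named in the original post.
sample_id="Raw160823"
fwd="$PWD/Raw160823_1.fastq.gz"
rev="$PWD/Raw160823_2.fastq.gz"

# Paired-end manifest: tab-separated, one row per sample, absolute file paths.
printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > manifest.tsv
printf '%s\t%s\t%s\n' "$sample_id" "$fwd" "$rev" >> manifest.tsv

# Import and summarize only when QIIME 2 is installed and both read files exist.
if command -v qiime >/dev/null 2>&1 && [ -f "$fwd" ] && [ -f "$rev" ]; then
  qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path manifest.tsv \
    --input-format PairedEndFastqManifestPhred33V2 \
    --output-path demux.qza
  qiime demux summarize \
    --i-data demux.qza \
    --o-visualization demux.qzv
else
  echo "qiime or input reads not found; wrote manifest.tsv only" >&2
fi
```

The resulting `demux.qzv` shows per-position quality, which is what you would use to pick truncation lengths for the denoising step.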
I will look into that. Thank you very much.