Trimmomatic not able to remove low sequence bases
9.7 years ago
David_emir ▴ 500

Hello Friends,

I am working with paired-end RNA-Seq data, and my goal is to identify differentially expressed genes. While preprocessing the data I wanted to remove the . I used a Trimmomatic command to filter out paired-end (PE) reads whose mean base quality is below 20 (AVGQUAL:20).

First I ran a command to remove the adapter sequences, and it did remove them:

java -jar trimmomatic-0.33.jar PE -phred64 SRR1600265_1.fastq SRR1600265_2.fastq R1_pairedout.fastq R1_unpairedout.fastq R2_pairedout.fastq R2_unpairedout.fastq ILLUMINACLIP:/home/NSCLC/Trimmomatic-0.33/adapters/TruSeq3-PE-2.fa:2:30:10:1:true

TrimmomaticPE: Started with arguments: -phred64 SRR1600265_1.fastq SRR1600265_2.fastq R1_pairedout.fastq R1_unpairedout.fastq R2_pairedout.fastq R2_unpairedout.fastq ILLUMINACLIP:/home/NSCLC/Trimmomatic-0.33/adapters/TruSeq3-PE-2.fa:2:30:10:1:true

Multiple cores found: Using 16 threads
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
Using Long Clipping Sequence: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA'
Using Long Clipping Sequence: 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
Using Long Clipping Sequence: 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
Using Long Clipping Sequence: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences

Input Read Pairs: 43254901 Both Surviving: 42693867 (98.70%) Forward Only Surviving: 1817 (0.00%) Reverse Only Surviving: 467152 (1.08%) Dropped: 92065 (0.21%)
TrimmomaticPE: Completed successfully

NOW the real challenge:

Next I tried a Trimmomatic paired-end (PE) command that trims bases from the 3' end when the base quality is below 20 (TRAILING:20) and filters out reads that are shorter than 50 bases after trimming (MINLEN:50).

# java -jar trimmomatic-0.33.jar PE -phred64 R1_pairedout.fastq R2_pairedout.fastq R1_split_pairedout.fastq R1_split_unpairedout.fastq R2_split_pairedout.fastq R2_split_unpairedout.fastq LEADING:5 TRAILING:5 AVGQUAL:20

TrimmomaticPE: Started with arguments: -phred64 R1_pairedout.fastq R2_pairedout.fastq R1_split_pairedout.fastq R1_split_unpairedout.fastq R2_split_pairedout.fastq R2_split_unpairedout.fastq LEADING:5 TRAILING:5 AVGQUAL:20

Multiple cores found: Using 16 threads
Input Read Pairs: 42693867 Both Surviving: 0 (0.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 42693867 (100.00%)

I don't understand why Trimmomatic is not able to trim my reads. Please let me know your opinion. I might sound stupid, but I am still learning. Thanks for your kind help.

P.S.: I started with a paired-end SRA file --> split it into forward and reverse reads --> ran FastQC on each file --> checked the sequence quality of each file --> trimmed/filtered low-quality bases with Trimmomatic (one file at a time). Let me know if this workflow is correct.

trimmomatic seq RNA • 7.2k views

"While preprocessing the data i wanted to remove the ."

the adapters ?

Are you sure -phred64 is correct? Illumina no longer uses this encoding, and I am not sure the SRA tools ever did.

9.7 years ago
rtliu ★ 2.2k

I checked the SRR1600265 FASTQ file with FastQC: it uses Sanger / Illumina 1.9 encoding, so you should use the -phred33 parameter.
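
To see why the offset matters, here is a minimal Python sketch (not Trimmomatic code) that decodes one quality string with both offsets. The quality string and the helper names `phred_scores` and `mean_quality` are made up for illustration; the string is not taken from SRR1600265.

```python
def phred_scores(quality: str, offset: int) -> list[int]:
    """Convert a FASTQ quality string to integer scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality: str, offset: int) -> float:
    """Mean score of one read under the given ASCII offset."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores)


# Hypothetical quality line of the kind seen in Phred+33 data (assumption, not real data).
example = "CCCFFFFFHHHHHJJJ"

mean33 = mean_quality(example, 33)  # decoded as Phred+33
mean64 = mean_quality(example, 64)  # same characters decoded as Phred+64

print("mean with offset 33:", mean33)
print("mean with offset 64:", mean64)
print("passes a mean-quality cutoff of 20 with offset 64?", mean64 >= 20)
```

Decoded with offset 64, every score comes out 31 lower than with offset 33, so the mean of this example falls well below 20. That is consistent with AVGQUAL:20 dropping every read pair when -phred64 is applied to Phred+33 data.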

How about letting the tool auto-detect the Phred encoding? I am not sure whether that is a correct approach.

Auto-detection of the Phred encoding usually works, but it is not 100% guaranteed.

Ah. Thanks for sharing this info.
Could you share any instance where auto phred detection in Trimmomatic would fail?
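
One way to picture a failure case is a generic range-based guess, sketched below in Python. This is not Trimmomatic's actual detection logic; the function name `guess_phred_offset` and the two cutoffs (ASCII 59 and 74) are assumptions chosen for illustration. The idea is that characters below ';' cannot come from a +64 encoding, characters above 'J' are unusual for Illumina Phred+33 data, and anything in between fits both.

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset from the characters seen in FASTQ quality lines.

    Returns 33, 64, or None when the observed characters fit both encodings.
    The cutoffs are illustrative assumptions, not taken from any tool.
    """
    codes = [ord(ch) for line in quality_lines for ch in line]
    if not codes:
        return None  # nothing to inspect
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) do not occur in +64 encodings.
        return 33
    if highest > 74:
        # Characters above 'J' (score 41 in Phred+33) suggest a +64 encoding.
        return 64
    return None  # every character fits both encodings


# Hypothetical quality lines (assumptions, not real data).
print(guess_phred_offset(["#1=DDFFFHHHHHJJJ"]))      # contains low characters
print(guess_phred_offset(["hhhhhgggffeedcb`"]))      # contains high characters
print(guess_phred_offset(["HHHHIIJJ", "JJJJJJJJ"]))  # only characters in the overlap
```

Under these assumptions, a sample in which every inspected read is uniformly high quality (only characters from '@' to 'J') cannot be assigned to either encoding from the characters alone, which is the kind of input where a guess could go wrong.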
