Quality control on 454 data
9.3 years ago

Dear Friends,

I am new to the NGS field. After studying a lot of the literature on quality control, I used Trimmomatic, the FASTX-Toolkit and QTrim on my 454 RNA-seq data. I am attaching the FastQC output. However, I am not happy with the per base sequence quality graph, the per base GC content, or the per base sequence content. Any help will be appreciated.

Thanks
Deepak

RNA-Seq • 4.7k views

What is your quality control pipeline?


I used Trimmomatic with all default options plus HEADCROP:15. I then used that output as input for the FASTX-Toolkit, running fastx_trimmer to trim 8 bases from the end of each read. Next, I ran the FASTX quality trimming step with -q 20 and -p 30 to remove low-quality bases. Finally, I used QTrim.


You didn't actually attach anything for anyone to look at...


Hi Swan, this is my FastQC output report. Please help me correct these errors.


Where is the report? Why don't you just give a Dropbox link?


Hi, this is the link to the file: https://www.dropbox.com/s/h6q97psk3gadf3h/Outputfile_fastqc.html?dl=0

My pipeline commands are:

java \
  -jar /opt/software/Trimmomatic-0.32/trimmomatic-0.32.jar SE \
  SRR1646514.fastq \
  SRR1646514_filter1.fastq \
  ILLUMINACLIP:/opt/software/Trimmomatic-0.32/adapters/TruSeq2-SE.fa:2:30:10 \
  LEADING:3 \
  TRAILING:3 \
  SLIDINGWINDOW:4:15 \
  MINLEN:36 \
  HEADCROP:15

~/software/fastx_toolkit_0.0.13/fastx_trimmer \
  -Q 33 \
  -m 50 \
  -t 9 \
  -i SRR1646514_trimmomatics_filter.fastq \
  -o SRR1646514_trimmomatics_filter_t_9_m_50.fastq

QTrim_v1_1/QTrim_v1_1 \
  -m 30 \
  -mode 2 \
  -l 50 \
  -out_format 2 \
  -seq_id_stat \
  -plot pdf \
  -fastq ~/Desktop/SRR1646514_trimmomatics_filter.fastq
9.3 years ago

Most trimming software is designed for Illumina data. I suggest you download BBDuk, and use this command:

bbduk.sh in=reads.fq out=trimmed.fq ref=adapters.fa k=23 ktrim=r mink=11 edist=1 qtrim=rl trimq=15

That should provide substantially better output, as it uses optimal dual-ended quality-trimming and allows indels in the adapter sequence, which is important in 454 data.
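To see what any trimming step actually did, you could compare read counts and lengths before and after. Here is a rough sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); `fastq_stats` is just a helper name made up for illustration, not part of BBTools.

```shell
# Hypothetical helper (not part of BBTools): summarize read count and
# read lengths for a FASTQ file with exactly four lines per record.
fastq_stats() {
  awk 'NR % 4 == 2 {
         # Line 2 of each record holds the sequence.
         n = length($0)
         total += n
         reads++
         if (min == "" || n < min) min = n
         if (n > max) max = n
       }
       END {
         if (reads == 0) { print "reads=0"; exit }
         printf "reads=%d min=%d max=%d mean=%.1f\n", reads, min, max, total / reads
       }' "$1"
}
```

Running `fastq_stats reads.fq` and then `fastq_stats trimmed.fq` gives two lines that are easy to compare. A large drop in read count or mean length after trimming would suggest the settings are too aggressive for your data.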

9.3 years ago
Maxime B ▴ 10

You can try Prinseq if Trimmomatic doesn't suit you: http://edwards.sdsu.edu/cgi-bin/prinseq/prinseq.cgi

For your data it would be something like:

perl prinseq-lite.pl -verbose -fastq SRR1646514.fastq -stats_all -graph_data test.gd -custom_params "AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG 1;AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT 1; AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG 1" -min_len 36 -YourOtherTrimmingOptions

perl prinseq-graphs.pl -i test.gd -html_all -o test
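If you only want a quick look at per-position quality without generating the full report, something like the following might do. It is only a sketch: it assumes Phred+33 quality encoding and four-line FASTQ records, and `per_base_quality` is a made-up helper name, not a Prinseq command.

```shell
# Hypothetical helper (not part of Prinseq): mean Phred score per read
# position, assuming Phred+33 encoding and four lines per FASTQ record.
per_base_quality() {
  awk 'BEGIN {
         # Lookup table from printable ASCII character to its code point.
         for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
       }
       NR % 4 == 0 {
         # Line 4 of each record holds the quality string.
         n = length($0)
         if (n > max) max = n
         for (i = 1; i <= n; i++) {
           sum[i] += ord[substr($0, i, 1)] - 33
           cnt[i]++
         }
       }
       END {
         # One tab-separated line per position: position, mean quality.
         for (i = 1; i <= max; i++) printf "%d\t%.1f\n", i, sum[i] / cnt[i]
       }' "$1"
}
```

Calling `per_base_quality SRR1646514.fastq` prints one line per position. Positions near the read ends are averaged over fewer reads, so the tail of the output will be noisier.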
