Question

Details About FastQC Program Results (Modules)

1

11.3 years ago

Y Tb ▴ 240

Dear All,

I'm using FastQC to check the quality of my RNA-seq data and, as you know, the results consist of 11 modules (Basic Statistics, ..., Kmer Content). I need a detailed tutorial or document that explains these modules. I have looked at the one on the program website, but it is not detailed enough.

fastqc • 3.8k views

ADD COMMENT • link updated 11.3 years ago by Irsan ★ 7.8k • written 11.3 years ago by Y Tb ▴ 240

1

Perhaps it'd be simpler to explain what part of the output you don't understand.

ADD REPLY • link 11.3 years ago by Devon Ryan 105k

0

Sequence Length Distribution, Sequence Duplication Levels, Overrepresented sequences, and Kmer Content.

ADD REPLY • link 11.3 years ago by Y Tb ▴ 240

1

Sequence length distribution is just the distribution of your read lengths. If the reads have been trimmed then there will be more than one length and FastQC might issue a warning, which you can ignore. You can probably ignore the duplication level too; it's pretty meaningless in RNA-seq. The overrepresented sequences can let you know if you have a bunch of remnant adapter contamination, but they will mostly be fragments of rRNAs, which you don't care about. The Kmer content is generally not that useful in RNA-seq, since you'd expect anything in an rRNA (even if you deplete things) to show enrichment.

As Michael just mentioned, don't be too concerned with warnings. The main use is to see if you've screwed up adapter/quality trimming or if you had a bubble pass through the flow-cell at some point, causing a transient decrease in quality.
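To make the length distribution and overrepresented-sequence checks described above concrete, here is a minimal Python sketch. It is not FastQC's implementation: the function names and the tiny `DEMO_FASTQ` input are made up for illustration, and it assumes plain 4-line FASTQ records. The default `min_fraction` of 0.001 mirrors the 0.1% cutoff mentioned in the FastQC documentation for the overrepresented sequences module.

```python
import io
from collections import Counter
from typing import Iterable, Iterator, List, TextIO, Tuple

# Hypothetical three-read FASTQ used only for this illustration.
DEMO_FASTQ = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nACGTAC\n+\nIIIIII\n"


def read_sequences(handle: TextIO) -> Iterator[str]:
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def length_distribution(seqs: Iterable[str]) -> Counter:
    """Count how many reads have each length."""
    return Counter(len(s) for s in seqs)


def overrepresented(seqs: Iterable[str],
                    min_fraction: float = 0.001) -> List[Tuple[str, float]]:
    """Return (sequence, fraction) pairs whose share of all reads exceeds min_fraction."""
    counts = Counter(seqs)
    total = sum(counts.values())
    return [(s, n / total) for s, n in counts.most_common()
            if n / total > min_fraction]


if __name__ == "__main__":
    seqs = list(read_sequences(io.StringIO(DEMO_FASTQ)))
    print(length_distribution(seqs))
    print(overrepresented(seqs, min_fraction=0.5))
```

More than one key in the length distribution is exactly what you would expect after trimming. In RNA-seq, the top overrepresented sequences are worth comparing against your adapter sequences before you dismiss them as rRNA.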

ADD REPLY • link 11.3 years ago by Devon Ryan 105k

0

Thanks a lot dpryan79.

ADD REPLY • link 11.3 years ago by Y Tb ▴ 240

1

FastQC might just scare you by showing a warning for almost every QC test in RNA-seq. The base and k-mer content by position in particular put me off by showing a strong bias at one end. I found an explanation on SEQanswers that this is possibly a bias in the reverse transcription or PCR process which cannot be avoided (yet) and doesn't go away with trimming (unless you chop off the ends, which will just conceal the problem). So I tend to ignore those warnings unless the alignment rates decline drastically.
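If you want to see the positional bias for yourself, the per-position base composition is easy to tabulate. This is a rough sketch, not how FastQC computes its plot; `per_position_base_fractions` is a name invented here and it expects a list of read sequences you have already extracted.

```python
from collections import Counter
from typing import Dict, Iterable, List


def per_position_base_fractions(seqs: Iterable[str]) -> List[Dict[str, float]]:
    """For each read position, return the fraction of reads carrying each base."""
    counts: List[Counter] = []
    for seq in seqs:
        for i, base in enumerate(seq):
            if i == len(counts):
                # First time a read reaches this position: start a new counter.
                counts.append(Counter())
            counts[i][base] += 1
    return [
        {base: n / sum(c.values()) for base, n in c.items()}
        for c in counts
    ]


if __name__ == "__main__":
    # Toy reads, made up for illustration only.
    for pos, fractions in enumerate(per_position_base_fractions(["AC", "AG", "AT", "CC"])):
        print(pos, fractions)
```

With unbiased reads the fractions should look roughly the same at every position; a bias at one end shows up as the first few positions deviating from the rest.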

ADD REPLY • link 11.3 years ago by Michael 56k

1

Thanks Michael. I think that Per base sequence quality is the important one, since it shows the quality of the sequence.
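For reference, the numbers behind a per base quality plot can be approximated in a few lines. This sketch only computes the mean Phred score per position, whereas FastQC draws a box-and-whisker plot per position. It assumes the Phred+33 quality encoding, and `mean_quality_per_position` is a hypothetical helper, not part of any tool.

```python
from typing import Iterable, List


def mean_quality_per_position(quality_strings: Iterable[str],
                              offset: int = 33) -> List[float]:
    """Mean Phred score at each read position (Phred+33 encoding assumed)."""
    totals: List[int] = []
    counts: List[int] = []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                # Reads can differ in length, so grow the lists on demand.
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '5' encodes Phred 20 under Phred+33.
    print(mean_quality_per_position(["II", "5I", "I"]))
```

A sudden dip in the middle of the read positions, rather than the usual gradual decline towards the 3' end, is the kind of transient quality drop mentioned earlier in the thread.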

ADD REPLY • link 11.3 years ago by Y Tb ▴ 240

score 0 · Answer 1 · 2014-04-10

0

11.3 years ago

Irsan ★ 7.8k

Each QC module is clearly explained in the video on the FastQC website. FastQC is designed for DNA-seq, so be careful when interpreting its results for RNA-seq data.

ADD COMMENT • link 11.3 years ago by Irsan ★ 7.8k

0

Thanks Irsan. Is there any other program that works well with RNA-seq?

ADD REPLY • link 11.3 years ago by Y Tb ▴ 240

0

For example, RSeQC (https://code.google.com/p/rseqc/). You can also just map the data and look at the percentage of reads that map uniquely. With STAR it should be around 85 per cent.
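STAR writes the unique mapping rate to its `Log.final.out` summary, so the check can be scripted. The sketch below assumes the summary contains a line of the form `Uniquely mapped reads % | 85.12%`; the helper name `uniquely_mapped_percent` and the `SAMPLE_LOG` text (including its numbers) are made up for illustration.

```python
import re

# Illustrative excerpt in the style of a STAR Log.final.out file; values are invented.
SAMPLE_LOG = (
    "                          Number of input reads |\t1000000\n"
    "                   Uniquely mapped reads number |\t851200\n"
    "                        Uniquely mapped reads % |\t85.12%\n"
)


def uniquely_mapped_percent(log_text: str) -> float:
    """Extract the 'Uniquely mapped reads %' value from STAR summary text."""
    match = re.search(r"Uniquely mapped reads %\s*\|\s*([\d.]+)%", log_text)
    if match is None:
        raise ValueError("no 'Uniquely mapped reads %' line found")
    return float(match.group(1))


if __name__ == "__main__":
    pct = uniquely_mapped_percent(SAMPLE_LOG)
    # The 85 per cent figure is the rule of thumb from this thread, not a hard limit.
    print(f"{pct:.2f}% uniquely mapped", "(ok)" if pct >= 85 else "(lower than expected)")
```

In practice you would read the text from the `Log.final.out` file of your own run and pass it to the function instead of `SAMPLE_LOG`.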

ADD REPLY • link 11.3 years ago by Irsan ★ 7.8k