Details About FastQC Program Results (Modules)
10.6 years ago
Y Tb ▴ 230

Dear All,

I'm using FastQC to check the quality of my RNA-seq data, and as you know the results consist of 11 modules (Basic Statistics, ..., Kmer Content). I need a detailed tutorial or document that explains these modules. I have looked at the one on the program website, but it is not detailed enough.

Perhaps it'd be simpler to explain what part of the output you don't understand.

Sequence Length Distribution, Sequence Duplication Levels, Overrepresented Sequences, and Kmer Content.

Sequence length distribution is just the distribution of your read lengths. If the reads have been trimmed then there will be more than one length and FastQC might issue a warning, which you can ignore. You can probably ignore the duplication level as well; it's pretty meaningless in RNA-seq, since highly expressed transcripts naturally produce many identical reads. The overrepresented sequences can tell you whether you have a bunch of remnant adapter contamination, but they will mostly be fragments of rRNAs, which you don't care about. The Kmer content is generally not that useful in RNA-seq, since you'd expect anything in an rRNA (even if you deplete them) to show enrichment.

As Michael just mentioned, don't be too concerned with warnings. The main use is to see if you've screwed up adapter/quality trimming or if you had a bubble pass through the flow-cell at some point, causing a transient decrease in quality.
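To make the length-distribution and overrepresented-sequence ideas concrete, here is a minimal Python sketch. It is not FastQC's actual implementation: the helper names (`read_fastq`, `length_distribution`, `overrepresented`), the toy reads, and the `min_fraction` cutoff used here are all assumptions for illustration.

```python
from collections import Counter
from io import StringIO


def read_fastq(handle):
    """Yield (sequence, quality) pairs from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        yield seq, qual


def length_distribution(seqs):
    """Count how many reads have each length."""
    return Counter(len(s) for s in seqs)


def overrepresented(seqs, min_fraction):
    """List (sequence, fraction) for sequences above min_fraction of all reads."""
    counts = Counter(seqs)
    total = len(seqs)
    return [(s, n / total) for s, n in counts.most_common() if n / total > min_fraction]


# Toy FASTQ data, invented for this example (one read looks trimmed).
toy_fastq = StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
    "@r3\nACGTA\n+\nIIIII\n"
    "@r4\nTTTTGGGG\n+\nIIIIIIII\n"
)
seqs = [seq for seq, _ in read_fastq(toy_fastq)]

lengths = length_distribution(seqs)
frequent = overrepresented(seqs, min_fraction=0.3)
print(dict(lengths))
print(frequent)
```

With trimmed reads you get more than one length, which is all the length warning is reporting. In RNA-seq the frequent sequences from real data would usually be worth checking against adapter and rRNA sequences before worrying about them.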

Thanks a lot dpryan79.

FastQC might just scare you by showing a warning for almost every QC test in RNA-seq. The per-position base content and Kmer content in particular put me off by showing strong bias at one end. I found an explanation on SEQanswers that this is possibly a bias introduced during reverse transcription or PCR, which cannot be avoided (yet) and doesn't go away with trimming (unless you chop off the ends, which just conceals the problem). So I tend to ignore those warnings unless the alignment rates decline drastically.
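If you want to see that positional bias for yourself, a rough sketch like the following tabulates base fractions per read position. The function name `base_composition` and the toy reads are made up for illustration, and this is only an approximation of what the per-base content plot summarises.

```python
from collections import Counter


def base_composition(seqs):
    """Return one dict per read position mapping each base to its fraction."""
    max_len = max(len(s) for s in seqs)
    profile = []
    for pos in range(max_len):
        # Only reads long enough to cover this position contribute.
        counts = Counter(s[pos] for s in seqs if len(s) > pos)
        total = sum(counts.values())
        profile.append({base: counts[base] / total for base in "ACGT"})
    return profile


# Toy reads, invented so that the first positions are strongly biased.
reads = ["GTACGA", "GTCAGT", "GTTGCA", "GAACTC"]
profile = base_composition(reads)

for pos, fractions in enumerate(profile, start=1):
    print(pos, fractions)
```

In an unbiased library the four fractions stay roughly flat along the read; a skew confined to the first few positions is the pattern described above.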

Thanks Michael. I think Per base sequence quality is the important one, since it shows the quality of the sequence.
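For reference, per-base quality can be approximated from the FASTQ quality strings directly. This is a simplified sketch that assumes Phred+33 encoding and reports a mean per position, whereas FastQC draws a box plot per position; the function name and the toy quality strings are hypothetical.

```python
def mean_quality_per_position(quals, offset=33):
    """Mean Phred score at each read position, assuming Phred+offset encoding."""
    max_len = max(len(q) for q in quals)
    means = []
    for pos in range(max_len):
        # Decode the ASCII character at this position for every read that reaches it.
        scores = [ord(q[pos]) - offset for q in quals if len(q) > pos]
        means.append(sum(scores) / len(scores))
    return means


# Toy quality strings, invented for illustration: 'I' = Q40, '5' = Q20, '#' = Q2.
quals = ["IIII#", "IIII5", "III5"]
means = mean_quality_per_position(quals)

for pos, q in enumerate(means, start=1):
    print(pos, round(q, 2))
```

A drop towards the 3' end is common; a dip in the middle of the read that then recovers is more suggestive of a transient run problem.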

10.6 years ago
Irsan ★ 7.8k

Each QC module is clearly explained in the video on the FastQC website. FastQC is designed for DNA-seq, so be careful when interpreting its results for RNA-seq data.

Thanks Irsan. Is there any other program that works well with RNA-seq?

For example, RSeQC (https://code.google.com/p/rseqc/). You can also just map the data and look at the percentage of reads that map uniquely. With STAR it should be around 85 percent.
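If you map with STAR, the unique mapping rate is reported in its `Log.final.out` summary. Below is a small sketch for pulling that number out. It assumes the summary uses `label | value` lines with a "Uniquely mapped reads %" entry; the function name `unique_mapping_rate` is hypothetical, and the numbers in the example text are invented, not real results.

```python
def unique_mapping_rate(log_text):
    """Return the 'Uniquely mapped reads %' value as a float, or None if absent."""
    for line in log_text.splitlines():
        key, sep, value = line.partition("|")
        if sep and key.strip() == "Uniquely mapped reads %":
            return float(value.strip().rstrip("%"))
    return None


# Illustrative excerpt in the assumed 'label | value' layout (made-up numbers).
example_log = """\
                          Number of input reads |\t1000000
                   Uniquely mapped reads number |\t853200
                        Uniquely mapped reads % |\t85.32%
"""

rate = unique_mapping_rate(example_log)
print(rate)
```

In practice you would read the real file with `open(path).read()` and compare the rate across samples; a sample that falls well below the others is the one to look at more closely.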
