FastQC report: should we not use the fastq data for Tophat+Cuffdiff if any of its modules is marked as "red cross"?
8.4 years ago
tunl ▴ 90

Some of our FastQC reports have a module marked as “red cross” (interpreted as “very unusual”), for example, “Kmer Content”.

We need to run Tophat+Cuffdiff on those fastq data, so I am wondering whether we should exclude a fastq file from Tophat+Cuffdiff if any of its modules is marked as “red cross”.

As for “Kmer Content”, I am wondering how important this module is?

I read the online document on “Kmer Content” (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/11%20Kmer%20Content.html ), but did not seem to find its significance level.

Is the order of the modules listed the order of their significance (are the top ones more significant than the bottom ones)?

Also, I wonder what tools may be used to fix the problems reported by FastQC?

Any ideas and advice would be greatly appreciated!

Thank you very much!

RNA-Seq FastQC Tophat Cuffdiff • 2.4k views
8.4 years ago
GenoMax 147k

The red X's come from some internal threshold decisions that Simon Andrews (the author of FastQC) had to make when designing the software. These thresholds are configurable (there is a file you can edit, fastqc-0.11.3/FastQC/Configuration/limits.txt). Having an X show up does not automatically disqualify a dataset.
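As a rough illustration of editing those thresholds, here is a minimal Python sketch. It assumes that limits.txt uses whitespace-separated rows of the form "module level value" (for example `kmer error 5`) with `#` comment lines; check your own copy of the file before relying on that layout. The function name `set_limit` and the sample text are hypothetical, not part of FastQC.

```python
def set_limit(limits_text, module, level, value):
    """Return limits_text with the value for (module, level) replaced.

    Assumes rows look like "module level value"; comments start with '#'.
    """
    out = []
    changed = False
    for line in limits_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # Only touch the non-comment row that matches module and level.
        if (len(fields) == 3 and not is_comment
                and fields[0] == module and fields[1] == level):
            out.append(f"{module}\t{level}\t{value}")
            changed = True
        else:
            out.append(line)
    if not changed:
        raise KeyError(f"no '{module} {level}' row found")
    return "\n".join(out) + "\n"


# Hypothetical excerpt, written in the assumed layout.
sample = "# Kmer module\nkmer\twarn\t2\nkmer\terror\t5\n"
print(set_limit(sample, "kmer", "error", 10))
```

It is probably safer to write the result to a copy rather than overwrite the shipped file. Recent FastQC versions list a `--limits` option for pointing at a non-default limits file; confirm with `fastqc --help` for your version.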

For example, in an experiment where you expect enrichment of some sequence (which may lead to high duplication etc.), you would want to see a red X. So use the FastQC results as a guide for deciding how to handle your datasets from that point on, and not as a hard pass/fail decision.
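If you have many samples, one way to use the reports as a guide is to list which modules were flagged per sample and then review them by eye. The sketch below assumes the tab-separated `summary.txt` that FastQC writes into its output folder, with lines of the form status, module name, file name. The function name `group_by_status` and the example lines are made up for illustration.

```python
from collections import defaultdict


def group_by_status(summary_text):
    """Group FastQC module names by status (PASS / WARN / FAIL)."""
    groups = defaultdict(list)
    for line in summary_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        status, module = parts[0], parts[1]
        groups[status].append(module)
    return dict(groups)


# Made-up example in the assumed summary.txt layout.
example = (
    "PASS\tBasic Statistics\tsample.fastq\n"
    "WARN\tPer base sequence content\tsample.fastq\n"
    "FAIL\tKmer Content\tsample.fastq\n"
)
flags = group_by_status(example)
print("Review these before deciding:", flags.get("FAIL", []))
```

This only collects the flags for review. Whether a flagged module actually matters still depends on the experiment.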

Dr. Simon Andrews has several informative blog posts (including FastQC observations) at this new site.
