I have Illumina 2000 paired-end sequencing data. I did a quality check with FastQC and then removed the adapter sequences (Illumina paired-end adapters) with cutadapt. From the results, I found that only a few reads contain adapters. I then checked with Trim Galore, which shows that only 0.1% of the reads contain adapter sequences.
I am wondering why only 0.1% of the sequences contain the adapter sequences.
cutadapt
=== Summary ===
Total read pairs processed: 30,981,418
Read 1 with adapter: 3,821 (0.0%)
Read 2 with adapter: 2,104 (0.0%)
Pairs that were too short: 434,082 (1.4%)
Pairs written (passing filters): 30,547,336 (98.6%)
Total basepairs processed: 15,490,709,000 bp
Read 1: 7,745,354,500 bp
Read 2: 7,745,354,500 bp
Quality-trimmed: 466,256,549 bp (3.0%)
Read 1: 85,966,692 bp
Read 2: 380,289,857 bp
Total written (filtered): 14,923,182,261 bp (96.3%)
Read 1: 7,561,684,003 bp
Read 2: 7,361,498,258 bp
The result summary from Trim Galore:
Trim galore
=== Summary ===
Total reads processed: 30,981,418
Reads with adapters: 38,498 (0.1%)
Reads written (passing filters): 30,981,418 (100.0%)
Total basepairs processed: 7,745,354,500 bp
Quality-trimmed: 85,966,692 bp (1.1%)
Total written (filtered): 7,659,104,092 bp (98.9%)
Did I use the wrong adapter sequences, or had the adapters already been removed after sequencing?
Not every read needs to contain an adapter. In fact, you only see adapter sequence in reads whose inserts are shorter than the number of sequencing cycles carried out.
Your data may be fine as is.
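To make the point about insert size concrete, here is a minimal sketch of the arithmetic. It assumes 250-cycle reads, which is what the cutadapt summary above implies if all reads are the same length (7,745,354,500 bp / 30,981,418 reads = 250 bp). The function name and the insert sizes are made up for illustration.

```python
def adapter_bases_in_read(insert_size: int, read_length: int) -> int:
    """Number of adapter bases expected at the 3' end of a read.

    The read covers the insert first; only cycles that run past the end
    of the insert read into the adapter.
    """
    if insert_size < 0 or read_length < 0:
        raise ValueError("sizes must be non-negative")
    return max(0, read_length - insert_size)


# Assumption: 250 cycles per read, inferred from the summary above.
READ_LENGTH = 250

# Hypothetical insert sizes, chosen only to illustrate both cases.
for insert_size in (150, 240, 250, 400):
    n = adapter_bases_in_read(insert_size, READ_LENGTH)
    print(f"insert {insert_size} bp: {n} adapter bases in the read")
```

If most fragments in the library are longer than the read length, very few reads will run into the adapter, which would be consistent with the low percentages reported above.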
Do you have adapters in the overrepresented sequences of the FastQC report?
I found some overrepresented sequences, but they do not contain the paired-end adapter sequence.
file:///home/tajammul/PhD_data/Radula_moss/clipped/a2_plus_b2_ATTCCT_L001_R2_001.trimmed_fastqc.html#M9
file:///home/tajammul/PhD_data/Radula_moss/clipped/a2_plus_b2_ATTCCT_L001_R1_001.trimmed_fastqc.html#M9
Those kinds of links are not going to work since they point to a file on your local desktop.
Your best bet is to take a screenshot of what you want to show and then upload it to one of the free image hosting sites (you can find them once you press Ctrl+G in the Biostars message edit window).
So unless you used home-made adapters, your data should be clean. FastQC automatically detects 'classic' adapters in the overrepresented sequences.
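If you want a rough cross-check that does not depend on either tool, you could count the reads that contain the start of the adapter yourself. The sketch below assumes standard four-line FASTQ records and uses the first 13 bases of the Illumina universal adapter (`AGATCGGAAGAGC`), which is the sequence Trim Galore uses for Illumina libraries by default. The function names are made up for illustration. It only finds exact, full-length matches, so it will report fewer hits than cutadapt, which also allows mismatches and partial matches at the 3' end.

```python
import gzip

# First 13 bases of the Illumina universal adapter; replace this with
# your own sequence if you used home-made adapters.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def fastq_sequences(lines):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def fraction_with_adapter(sequences, adapter=ADAPTER_PREFIX):
    """Fraction of sequences containing an exact copy of `adapter`."""
    total = hits = 0
    for seq in sequences:
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


def fraction_in_file(path, adapter=ADAPTER_PREFIX):
    """Run the exact-match check on a plain or gzipped FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return fraction_with_adapter(fastq_sequences(handle), adapter)
```

Calling `fraction_in_file` on the untrimmed R1 and R2 files should give a number in the same range as the tool summaries if the adapter sequence you supplied was the right one.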
http://tinypic.com/view.php?pic=2ngucrc&s=9#.V-KX3tHQPCI