2.2 years ago
fsaci
This is my first time using cutadapt. After trimming the adapter (AGATCGGAAGAGC) from a single-end 50 bp sequencing reads file, I get an empty FASTQ file.
My command is:
cutadapt -a AGATCGGAAGAGC -o results/sRNA-seq/cutadapt/SRR8857862_cutadapt.fq.gz fastq/SRR8857862.fastq.gz
The SRR8857862.fastq.gz file looks like this (I added the spaces manually to make it easier to read):
$ zcat fastq/SRR8857862.fastq.gz | head | grep 'AGATCGGAAGAGC'
NACGACTCTCGGCAACGGATATCTCGG AGATCGGAAGAGC ACACGTCTGA
CATCCGTCTGGCTCAGTCTCCGTC AGATCGGAAGAGC ACACGTCTGAACT
NAATGACGGTATCTGAGG AGATCGGAAGAGC ACACGTCTGAACTCCAGTC
And the log information looks like this (I honestly don't understand this log file):
$ cat logs/sRNA-seq/cutadapt/SRR8857862.log
This is cutadapt 4.1 with Python 3.7.12
Command line parameters: -a AGATCGGAAGAGC -o /results/sRNA-seq/cutadapt/SRR8857862_cutadapt.fq.gz /fastq/SRR8857862.fastq.gz
Processing single-end reads on 1 core ...
Finished in 200.24 s (9 µs/read; 6.39 M reads/minute).
=== Summary ===
Total reads processed: 21,331,325
Reads with adapters: 21,308,998 (99.9%)
Reads written (passing filters): 21,331,325 (100.0%)
Total basepairs processed: 1,066,566,250 bp
Total written (filtered): 515,957,093 bp (48.4%)
=== Adapter 1 ===
Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 21308998 times
Minimum overlap: 3
No. of allowed errors:
1-9 bp: 0; 10-13 bp: 1
Bases preceding removed adapters:
A: 18.4%
C: 32.3%
G: 19.3%
T: 29.8%
none/other: 0.1%
Overview of removed sequences
length count expect max.err error counts
3 5 333302.0 0 5
4 1 83325.5 0 1
5 3 20831.4 0 3
7 3 1302.0 0 3
8 285645 325.5 0 285645
9 324846 81.4 0 324846
10 324158 20.3 1 315149 9009
11 317739 5.1 1 308296 9443
12 339889 1.3 1 329897 9992
13 376530 0.3 1 363714 12816
14 616486 0.3 1 596496 19990
15 574232 0.3 1 555528 18704
16 447063 0.3 1 432057 15006
17 500524 0.3 1 483221 17303
18 560586 0.3 1 540318 20268
19 543647 0.3 1 524023 19624
20 567565 0.3 1 545604 21961
21 516139 0.3 1 496017 20122
22 485929 0.3 1 466629 19300
23 590731 0.3 1 566924 23807
24 622773 0.3 1 596992 25781
25 623838 0.3 1 597729 26109
26 1625645 0.3 1 1558914 66731
27 789641 0.3 1 756910 32731
28 1150502 0.3 1 1103410 47092
29 1113779 0.3 1 1068052 45727
30 964253 0.3 1 924660 39593
31 1557766 0.3 1 1495846 61920
32 855010 0.3 1 820559 34451
33 584719 0.3 1 561147 23572
34 1042349 0.3 1 1001130 41219
35 274674 0.3 1 263732 10942
36 162642 0.3 1 156046 6596
37 2205326 0.3 1 2118110 87216
38 124374 0.3 1 118927 5447
39 57080 0.3 1 54734 2346
40 45116 0.3 1 43224 1892
41 30882 0.3 1 29671 1211
42 24274 0.3 1 23263 1011
43 21146 0.3 1 20289 857
44 17075 0.3 1 16332 743
45 5836 0.3 1 5595 241
46 2916 0.3 1 2792 124
47 1738 0.3 1 1656 82
48 1416 0.3 1 1363 53
49 1669 0.3 1 1593 76
50 30838 0.3 1 29419 1419
Thanks in advance!
According to the log file, cutadapt found the adapter in almost all of the reads and wrote every read to the output file (21,331,325 reads, 515,957,093 bp). Are you sure the output is empty?
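One quick way to check is to count the reads in the output file directly. The following is a minimal sketch, assuming the output is gzip-compressed FASTQ with exactly four lines per record; `count_reads` is a hypothetical helper name used here for illustration, not a cutadapt feature.

```shell
# count_reads: hypothetical helper (not part of cutadapt).
# Assumes a gzip-compressed FASTQ file with 4 lines per read.
count_reads() {
    local lines
    # Decompress to stdout and count lines without writing a temporary file.
    lines=$(gzip -dc "$1" | wc -l)
    # Each FASTQ record is 4 lines: header, sequence, '+', qualities.
    echo $(( lines / 4 ))
}
```

Running `count_reads results/sRNA-seq/cutadapt/SRR8857862_cutadapt.fq.gz` should print a number close to 21,331,325 if the file matches the log, since the summary reports that many reads written (515,957,093 bp in total, roughly 24 bp per read after trimming). A result of 0 would mean the file really is empty, for example because something modified it after cutadapt finished.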
First of all, thank you very much for your reply.
It turns out the output file had been overwritten by a later step in my analysis, so cutadapt itself was working correctly.
The problem is solved now.