Hi friends
I used Trimmomatic to quality-trim my paired-end RNA-seq files. The output looks odd: the two resulting FASTQ files have different sizes in bytes, L1 = 9275244535 and L2 = 9238052265.
Why does this happen?
I used this command:
java -jar trimmomatic-0.36.jar PE L1.fq.gz L2.fq.gz paired_L1.fastq unpaired_L1.fastq paired_L2.fastq unpaired_L2.fastq LEADING:20 TRAILING:20 MINLEN:140
I did not trim the first bases, but the first 12 bases look unbalanced in the FASTQ file, and there is also duplication in that first 12-base region.
Could you clarify what f1 and f2 are?
f1 (or L1) is the size of the first file in bytes, and f2 (or L2) is the size of the second file in bytes. I have updated the post.
I don't see the relevance of the size in bytes; the number of lines would be more informative (wc -l yourfile.fastq). That said, it is quite possible that one read of a pair didn't 'survive' the trimming and its mate became 'unpaired'. Edit: such reads should then end up in the separate unpaired files, thanks to @mastal511 for pointing this out.
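As a rough sketch of that check: a FASTQ record normally spans four lines, so the line count divided by four gives the number of reads. The helper name count_reads is made up for illustration, and the sketch assumes uncompressed, non-wrapped FASTQ files such as the Trimmomatic outputs named in the question.

```shell
# count_reads is a hypothetical helper name, not part of Trimmomatic.
# Assumes an uncompressed FASTQ file with exactly 4 lines per read.
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Example usage with the output names from the question:
#   count_reads paired_L1.fastq
#   count_reads paired_L2.fastq
```

If the two paired files report the same number, every read still has its mate in the other file.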
The line counts were the same: both files have 102449352 lines.
Sounds like nothing to worry about then :p
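For anyone landing here with the same question: 102449352 lines is 102449352 / 4 = 25612338 reads per file. Equal read counts with different byte sizes usually just means the reads in the two files have different lengths after trimming, since LEADING and TRAILING clip each read on its own. A minimal sketch for comparing the total number of bases per file, assuming uncompressed 4-line FASTQ records (count_bases is a made-up helper name):

```shell
# count_bases is a hypothetical helper name, not part of Trimmomatic.
# Sums the length of every sequence line (line 2 of each 4-line record).
count_bases() {
    awk 'NR % 4 == 2 { total += length($0) } END { printf "%.0f\n", total }' "$1"
}

# Example usage with the output names from the question:
#   count_bases paired_L1.fastq
#   count_bases paired_L2.fastq
```

A difference in these totals would be consistent with the difference in file size.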