8.7 years ago
nikelle.petrillo
Hello everyone,
I have run my reads through Trimmomatic. I would like to know the number of reads in my raw files and the number of reads left after quality trimming/filtering in Trimmomatic. I have generated "trimlogs", but they are of no help because I do not quite understand them. Does anybody know how (or which code/command I can use) to count the raw reads and then count the reads left after filtering?
Thanks! Nikelle
Trimmomatic works much like cutadapt: you are using a dedicated tool to remove the adapter sequences. When you received your raw reads, you should have gotten fastq.gz raw files along with FastQC report files, and you can check those for the number of reads in the raw sequences. Once you have the Trimmomatic output, you can run FastQC again; genomax2 is correct in saying that. However, for pre/post fastq or fastq.gz files you can always do this
if compressed
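The commands from this answer did not survive in the thread, so here is only a minimal sketch of the usual approach, not necessarily what was posted. It assumes standard FASTQ with exactly four lines per record (no wrapped sequence lines); `count_reads` and the demo file are hypothetical names made up for illustration.

```shell
# Hypothetical helper: count reads in a FASTQ file, plain or gzip-compressed.
# Assumes 4 lines per record, so reads = total lines / 4.
count_reads() {
    case "$1" in
        *.gz) gzip -dc "$1" | awk 'END { print NR / 4 }' ;;
        *)    awk 'END { print NR / 4 }' "$1" ;;
    esac
}

# Tiny two-read demo file so the sketch can be run as-is.
demo_dir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$demo_dir/demo.fastq"
gzip -c "$demo_dir/demo.fastq" > "$demo_dir/demo.fastq.gz"

count_reads "$demo_dir/demo.fastq"      # plain file
count_reads "$demo_dir/demo.fastq.gz"   # compressed file
```

Run it once on the raw file and once on the Trimmomatic output; the difference is the number of reads dropped.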
Thank you!
Is it correct if I run this command:
[npetrill@trogdor AMMA_transcripts]$ awk '{s++}END{print s/4}' R1V29_B1F9St_CGATGT_L001_R1_001.fastq
31118843
On a fastq file that looks like this?
Will that command give me the correct number of reads?
Looks like it did give you the number of reads:
31118843
Confirm with the command @vchris_ngs gave in the post above yours.
Great, that works. Thank you both!
Do both commands give you the same read count?
31118843
They do, thanks so so much!
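For anyone wanting to repeat this kind of cross-check, a rough sketch is below. It compares the awk count used above with a plain line count from `wc -l`, and also checks that the file has a multiple of four lines. The demo file and variable names are made up for illustration, and it again assumes four lines per read.

```shell
# Three-read demo file standing in for a real FASTQ.
fq_dir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCA\n+\nIIII\n' > "$fq_dir/demo.fastq"

# Method 1: the awk one-liner from the thread.
awk_count=$(awk '{s++} END {print s/4}' "$fq_dir/demo.fastq")

# Method 2: total line count divided by 4.
lines=$(wc -l < "$fq_dir/demo.fastq")
wc_count=$((lines / 4))

# The two counts should match, and the line count should divide evenly by 4.
if [ "$awk_count" -eq "$wc_count" ] && [ $((lines % 4)) -eq 0 ]; then
    echo "counts agree: $awk_count reads"
else
    echo "counts differ or file is not a multiple of 4 lines" >&2
fi
```

A line count that is not divisible by four usually points to a truncated or malformed file rather than a counting problem.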
If you have FastQC installed, then run it on the pre- and post-trimming sets of files.
There would be grep (and other ways of doing this), but we would need to know what machine identifier your read headers have to make it foolproof. Or this: How to count fastq reads
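To illustrate why the machine identifier matters, here is a small sketch. Quality lines may legally begin with `@`, so counting every line that starts with `@` can overcount. Anchoring the pattern on the instrument ID at the start of the header makes a false match far less likely. The ID `M00123` and the demo file are hypothetical; substitute whatever your own headers start with.

```shell
demo_dir=$(mktemp -d)
# Two reads; the second record's quality string starts with '@'.
printf '@M00123:1:A\nACGT\n+\nIIII\n@M00123:1:B\nTTGA\n+\n@III\n' > "$demo_dir/demo.fastq"

# Naive pattern: also matches the quality line that starts with '@'.
naive=$(grep -c '^@' "$demo_dir/demo.fastq")

# Anchored on the (hypothetical) instrument ID: matches header lines only.
anchored=$(grep -c '^@M00123:' "$demo_dir/demo.fastq")

echo "naive=$naive anchored=$anchored"   # prints: naive=3 anchored=2
```

For compressed files, the same patterns should work with `gzip -dc file.fastq.gz | grep -c ...`.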
Thanks genomax,
I have the FASTX-Toolkit installed. Could I use it to count reads?