Hello community,
I've read about different ways of using Linux commands such as diff to compare the contents of text files, and I'm trying to apply these methods to FASTQ files. I want to compare the fastq.gz files created by two separate BCL conversion methods and see whether they produce the same FASTQ files.
Steps
- Obtain BCL files from Illumina Sequencer
- Convert BCL files to FASTQ (fastq.gz) format with:
  a. Conversion method 1 (bcl2fastq)
  b. Conversion method 2 (bcl-convert)
- Determine if files are valid with validfastq
- Compare fastq file content
Objective: Determine whether sequence and quality score lengths match between the files, and confirm that there are no mismatched labels between them.
Are there any current methods other than diff that can provide this? Should I sort the files first and then compare them with diff?
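To make the objective concrete, here is a minimal Python sketch of the kind of comparison I have in mind. It is only an illustration under several assumptions: both files use plain 4-line FASTQ records with no blank lines, reads appear in the same order in both files, and the "label" is the read ID before the first space in the header. The function names `read_fastq` and `compare_fastq` are hypothetical and do not come from any existing tool.

```python
import gzip
from itertools import zip_longest


def read_fastq(path):
    """Yield (read_id, sequence, quality) from a plain or gzipped FASTQ file.

    Assumes strict 4-line records (header, sequence, '+', quality).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header.strip():
                break  # end of file (or trailing blank line)
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line, ignored
            qual = handle.readline().rstrip("\n")
            # Assumption: the label is the text before the first space, minus '@'
            yield header[1:].split()[0], seq, qual


def compare_fastq(path_a, path_b):
    """Compare two FASTQ files record by record, in file order."""
    report = {
        "records_a": 0,
        "records_b": 0,
        "label_mismatch": 0,    # read IDs differ at the same position
        "length_mismatch": 0,   # sequence or quality lengths differ
        "content_mismatch": 0,  # same lengths, different bases or qualities
    }
    pairs = zip_longest(read_fastq(path_a), read_fastq(path_b))
    for rec_a, rec_b in pairs:
        if rec_a is not None:
            report["records_a"] += 1
        if rec_b is not None:
            report["records_b"] += 1
        if rec_a is None or rec_b is None:
            continue  # one file has extra records; the counts will show it
        id_a, seq_a, qual_a = rec_a
        id_b, seq_b, qual_b = rec_b
        if id_a != id_b:
            report["label_mismatch"] += 1
        if len(seq_a) != len(seq_b) or len(qual_a) != len(qual_b):
            report["length_mismatch"] += 1
        elif seq_a != seq_b or qual_a != qual_b:
            report["content_mismatch"] += 1
    return report
```

Calling `compare_fastq("method1_R1.fastq.gz", "method2_R1.fastq.gz")` (hypothetical file names) would return a dictionary of counts; all mismatch counts at zero and equal record counts would suggest the two conversions agree on that file. If the two tools do not emit reads in the same order, this positional comparison would report spurious label mismatches, so the records would need to be sorted or indexed by read ID first.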
Thank you in advance.
It would be good to use "Comparing sequence files generated by bcl-convert and bcl2fastq" as the title instead of "Comparing two files Linux". Consider editing the title.