Hi,
I am a beginner bioinformatician. I have paired-end fastq.gz files (file1 for R1 and file2 for R2) from Illumina. I want to use these files for downstream analysis, but I cannot because they don't have the same number of reads. (Read 1 and read 2 are distinguished by "_1" and "_2" in the files.)
I therefore used the repair.sh script from BBMap to remove reads that are present in only one of the two files.
repair.sh ran and found reads but could not find any pairs (see output), so the output files are empty.
I tried different parameters but nothing worked...
How can I repair my files easily? Or how can I make repair.sh work?
Thanks!!
At least two things are missing from your post: the exact repair command you used, and some content from your reads.
Your files may be mismatched, that is, not actually mate pairs. If that is the case, nothing can be done. A simple way of ruling that out is to type
head read1.fastq
and
head read2.fastq
and showing us the output of those commands.
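Since the files in the question are gzipped, plain head would print compressed bytes, so the reads need to be decompressed first. Below is a rough sketch that goes one step further than head and counts how many read names the two files share. The function names read_names and shared_names are made up for this example and are not part of BBMap. It assumes standard 4-line FASTQ records and that the mate suffix ("_1"/"_2" or "/1"/"/2") sits at the end of the first word of each header line, which may not match your files.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of BBMap.

# Print the names of the first n reads of a FASTQ file (plain or gzipped),
# with a trailing _1/_2 or /1 //2-style mate suffix removed.
# Assumes 4 lines per record and the suffix at the end of the first header word.
read_names() {
    # $1 = FASTQ file, $2 = number of reads to sample (default 1000)
    gzip -cdf "$1" | awk -v n="${2:-1000}" '
        NR % 4 == 1 {
            name = $1
            sub(/^@/, "", name)          # drop the leading @
            sub(/[_\/][12]$/, "", name)  # drop the mate suffix, if present
            print name
            if (++seen >= n) exit
        }'
}

# Count how many of the sampled read names occur in both files.
shared_names() {
    # $1 = R1 file, $2 = R2 file, $3 = number of reads to sample (optional)
    sn_tmp1=$(mktemp)
    sn_tmp2=$(mktemp)
    read_names "$1" "$3" | sort > "$sn_tmp1"
    read_names "$2" "$3" | sort > "$sn_tmp2"
    comm -12 "$sn_tmp1" "$sn_tmp2" | wc -l | tr -d ' '
    rm -f "$sn_tmp1" "$sn_tmp2"
}
```

For example, `shared_names file1.fastq.gz file2.fastq.gz` (with your own file names) prints the number of names found in both samples. A count close to the sample size suggests the files really are mates and only the naming is confusing the repair step. A count of 0 suggests either that the files are not mates or that the names follow a different convention than assumed here, in which case the output of read_names on each file is the thing to post.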