ChIP-seq reads trimming
7 weeks ago
SEJAL • 0
ERROR: sequence and quality have different length:
@SRR16684719.6426858 6426858/1
CCCCTTCCTTTCTTTTTTGAGTTGGAGTTTCACTCTTGTTGCCCAGTCTGA
+
FFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

I am trimming the FASTQ files with fastp, and it reports this error. The trimmed files are also much smaller than the raw, untrimmed FASTQ files. How can this be corrected? I have also tried aligning the untrimmed files with bowtie2, and the same error appears in the alignment output.

trimming chipseq fastp alignment bowtie2 • 467 views
7 weeks ago
GenoMax 148k

It looks like your data file is corrupt. Either download a fresh copy, if you can, or use a tool to remove these reads.


I have tried downloading the file again and rerunning the alignment, but it shows the same error. This also happens for many of the FASTQ files. Is there a tool to remove such reads from FASTQ files?


You can use one of the tools mentioned in this thread: "Is there a tool to recover corrupted fastq files"

If a large number of FASTQ records are being removed, you should be very careful: removing that many sequences could make the data invalid.
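
For illustration only, here is a minimal Python sketch of what "removing these reads" could look like. It is not one of the tools from the linked thread. It assumes strict four-line FASTQ records (header, sequence, `+`, quality) and that the corruption has not shifted the records out of frame; the names `check_records`, `open_text`, and `main` are made up for this example. It drops any record whose sequence and quality strings differ in length and reports how many were dropped, so you can judge whether the loss is acceptable.

```python
import gzip
import sys
from itertools import islice


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def check_records(lines):
    """Yield (ok, record) for each consecutive group of four lines.

    ok is True only when the group has an '@' header, a '+' separator,
    and sequence and quality strings of equal length.
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            return
        ok = (
            len(record) == 4
            and record[0].startswith("@")
            and record[2].startswith("+")
            and len(record[1]) == len(record[3])
        )
        yield ok, record


def main(in_path, out_path):
    """Copy well-formed records to out_path and count the rest."""
    kept = dropped = 0
    with open_text(in_path, "rt") as src, open_text(out_path, "wt") as dst:
        for ok, record in check_records(src):
            if ok:
                dst.write("\n".join(record) + "\n")
                kept += 1
            else:
                dropped += 1
    print(f"kept {kept}, dropped {dropped}", file=sys.stderr)


# Hypothetical usage: python filter_fastq.py in.fastq.gz out.fastq.gz
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

If the reads are paired-end (the `/1` suffix in the read name suggests they may be), the mate of every dropped read would also have to be removed from the other file to keep the pairs in sync; this sketch does not do that.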
