FASTQ file error
8.6 years ago
biolab ★ 1.4k

Dear All,

We are submitting our high-throughput sequencing data to NCBI GEO, but GEO has notified us of two errors in our gzip-compressed FASTQ files.

The first error is:

   sample1_R1.fq.gz
   Line number 1922118: File may be truncated

I used perl -ne '$i++; print if $i==1922118' sample1_R1.fq to extract that line:

AAAGTTGTTGCAGTTAAAAAGCTCGTAGTTGAACTTCTGTTCAGACTCATAACGACTCGTCGTGTGAAGCTGGACATACGTCTGCAAACTAAAATCGGCA

I can't see that it is truncated. What is wrong with this line?
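
GEO reported this error for the compressed file, while the line above was taken from the uncompressed copy, so it may be worth testing the .gz itself. The following is a minimal Python sketch, not GEO's actual validator: it assumes four-line FASTQ records, and the function name `check_gzip_fastq` is hypothetical. It reads the whole gzip stream and reports either a stream that ends early or a line count that is not a multiple of four.

```python
import gzip
import zlib


def check_gzip_fastq(path):
    """Return (n_lines, problem); problem is None if nothing suspicious was found.

    Assumption: every FASTQ record occupies exactly four lines.
    """
    n_lines = 0
    try:
        with gzip.open(path, "rb") as handle:
            for _ in handle:
                n_lines += 1
    except (EOFError, zlib.error, gzip.BadGzipFile) as exc:
        # EOFError is what Python raises when the gzip stream stops
        # before its end-of-stream marker, e.g. after a partial upload.
        return n_lines, f"gzip stream ended early or is corrupt: {exc}"
    if n_lines % 4 != 0:
        return n_lines, "line count is not a multiple of 4 (incomplete last record)"
    return n_lines, None
```

Calling `check_gzip_fastq("sample1_R1.fq.gz")` on the file that was actually uploaded would show whether the archive decompresses completely and whether the last record is whole.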

The second error is:

   sample1_R2.fq.gz
   Line number 1985252: quality length does not match sequence length

I used the same command to extract that line:

C@@FFFBBDBFAFIIGGHDFECFAHEHFHGIGADHGEGGH?DF<DF?DB?B?<FFFGHGCHCHFEHIGGFA?B2<?CCDDDDCDCD@>CAC:ACDC:A@A

It is exactly 100 characters, which matches the sequence length. What is the problem?
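
Looking at one line in isolation can be misleading: if a line is missing or duplicated earlier in the file, every later line lands in the wrong slot of its four-line record, so a quality line that looks fine may be compared against the wrong sequence. The sketch below is one way to find the first malformed record. It assumes strict four-line records, and `first_bad_record` is a hypothetical helper, not part of any GEO tool.

```python
import gzip


def first_bad_record(path):
    """Return (line_number, message) for the first malformed record, or None.

    Assumption: each record is exactly four lines
    (@header, sequence, + separator, quality).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        line_no = 0  # number of lines consumed by complete records so far
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0]:
                return None  # clean end of file
            if not all(record):
                return line_no + 1, "incomplete record at end of file"
            header, seq, plus, qual = (s.rstrip("\r\n") for s in record)
            if not header.startswith("@"):
                return line_no + 1, "header does not start with '@'"
            if not plus.startswith("+"):
                return line_no + 3, "separator does not start with '+'"
            if len(seq) != len(qual):
                return line_no + 4, (
                    f"quality length {len(qual)} != sequence length {len(seq)}"
                )
            line_no += 4
```

Running `first_bad_record("sample1_R2.fq.gz")` returns the line number where the record structure first breaks, which may be well before the line GEO reported.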

Could you please give me some suggestions? I would highly appreciate your help!


Do both FASTQ files from sample1 have the same number of reads? Also, the quality string you pasted has 110 bases...
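
A quick way to compare the two files is to count reads in each. This is a rough sketch that assumes four lines per read; `count_reads` and `same_read_count` are hypothetical names chosen for illustration.

```python
import gzip


def count_reads(path):
    """Count reads in a FASTQ file, assuming four lines per read."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def same_read_count(r1_path, r2_path):
    """Return (match, n_r1, n_r2) for a pair of FASTQ files."""
    n_r1 = count_reads(r1_path)
    n_r2 = count_reads(r2_path)
    return n_r1 == n_r2, n_r1, n_r2
```

For example, `same_read_count("sample1_R1.fq.gz", "sample1_R2.fq.gz")` returns `False` plus both counts if the pair is out of sync.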


Hi fanli.gcb, thank you for your reply. I now realize that the two FASTQ files do not have the same number of lines. This is probably the problem. I will check further. Thanks again!
