5.6 years ago
m.t.lorenc
Hi,
I downloaded RNA-Seq data from NCBI and converted it to FASTQ with
fastq-dump --gzip --skip-technical --readids --read-filter pass --dumpbase --split-3 --clip SRR1043177.sra
However, the headers on the @ and + lines of each record are different:
> zcat SRR1043177_pass_1.fastq.gz | head
@HWI-ST960:133:C1FJJACXX:6:1101:1708:2209/1
TNAAACTTAAAGGAAAAACATGGAATTTGTTTCTATGTTCTGCTTATTTGCGATTGTTTCTTTCTCTCTTCNNNNNNNNNATTCNNNNTNCNNNNTCNTG
+SRR1043177.1.1 HWI-ST960:133:C1FJJACXX:6:1101:1708:2209 length=100
@#1=DDFFHHHHGIIJIJIJJHIIFIIJJCGIJJJJJIIDDGGHIJGIIJFHFBGHFHHFHGIGIGHGJJC#############################
This caused an error in trim_galore:
trim_galore -o /scratch/waterhouse_team/tmp/galore --cores 2 --paired NbSRR_WtR1.fastq.gz NbSRR_WtR2.fastq.gz
...
cutadapt: error: Error in FASTQ file at line 3: Sequence descriptions don't match ('HWI-ST960:133:C1FJJACXX:6:1101:1708:2209/1' != 'SRR1043177.1.1 HWI-ST960:133:C1FJJACXX:6:1101:1708:2209 length=100').
The second sequence description must be either empty or equal to the first description.
How can I fix the FASTQ file?
Thank you in advance,
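
Since the error message says the second description may be empty, one possible workaround is to blank out every + line. The following is a minimal sketch, assuming the files are strict four-line FASTQ records (no wrapped sequences), as in the head output above. The function name strip_plus_header and the output file name are hypothetical, chosen only for illustration.

```shell
# Replace the separator line of each record (line 3 of every 4) with a bare "+".
# Lines are selected by position, not by pattern, because a quality string
# may itself start with "+" or "@" (as in the record shown above).
strip_plus_header() {
    awk 'NR % 4 == 3 { print "+"; next } { print }'
}

# Example usage (output file name is hypothetical); repeat for the _2 file:
# zcat SRR1043177_pass_1.fastq.gz | strip_plus_header | gzip > SRR1043177_pass_1.fixed.fastq.gz
```

Both mates of a pair would need the same treatment before rerunning trim_galore. The @ lines, sequences, and quality strings pass through unchanged.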