I am using the fastx_clipper tool to remove the adapter sequence from my data files, and I run it as follows:
fastx_clipper -a AGATCGGAAGAGCACACTCTCGTATGCCGTCTTCTGCTTG -Q 33 -i input_file.fastq -o output_file.fastq
It works for all my files except one, which gave me the following error:
fastx_clipper: Error: invalid quality score data on line 60451008 (quality_tok = "=@<DDB'8;>?B?C?CCBCCC>CC<?8?(2?</50316-?@?(++3>C99AC993+>:9><@8:(:@CC?:>:>AC:8(2:C:>(4?@@5?###"
Then I tried to use sed to remove the last 4 lines (the last read) of this file as follows:
sed '60451005,60451008 d' .fastq > new.fastq
but the output file new.fastq was much smaller than the original file (I don't know what happened to the file format), and it also did not work with the fastq_quality_trimmer tool. I need someone to help me figure out what the problem is.
Hi Philipp, I used the above command with 7 other files and it worked well. Also, as you can see above, I used -Q 33, so I think there is no problem with this version. Now I need to know how to use sed correctly to remove the 4 lines, because when I checked the last line, which is the quality string of the last read, I found that the sequence length is 100 bp but the quality string length is 74.
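The length mismatch described above can be checked across a whole file with a short awk pass. This is only a sketch: it assumes strict 4-line FASTQ records (no wrapped sequence lines), and demo.fastq and mismatches.txt are hypothetical file names used for illustration.

```shell
# Build a tiny demo FASTQ whose last record has a truncated quality line
# (demo.fastq is a hypothetical name; substitute your own file).
printf '@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII\n' > demo.fastq

# For each 4-line record, remember the sequence length (line 2 of the record)
# and report the record if the quality line (line 4) has a different length.
awk 'NR % 4 == 2 { seqlen = length($0) }
     NR % 4 == 0 && length($0) != seqlen {
         printf "line %d: sequence length %d, quality length %d\n", NR, seqlen, length($0)
     }' demo.fastq > mismatches.txt

cat mismatches.txt
```

If only the final record is reported, the file was most likely cut off while it was being written or copied.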
Ah OK, to get rid of the last four lines you can also use head:
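A minimal sketch of the head approach, assuming the broken read occupies exactly the last four lines of the file. The file names demo.fastq and trimmed.fastq are hypothetical. The line count is computed with wc so the command does not depend on GNU-specific options.

```shell
# Hypothetical demo file: two complete records plus a truncated final record.
printf '@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nACGT\n+\nII\n' > demo.fastq

# Count the lines, then keep everything except the last four.
total=$(wc -l < demo.fastq)
head -n "$((total - 4))" demo.fastq > trimmed.fastq

# With GNU coreutils the same result is available as: head -n -4 demo.fastq
wc -l < trimmed.fastq
```

Unlike a sed range with hard-coded line numbers, this does not require knowing the exact line count in advance.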
But looking at your file again, it looks like whatever generated it crashed. Did you copy it from somewhere else? Or did another program generate it and crash while doing so?
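If the file was copied from somewhere else, one way to test for a truncated copy is to compare checksums of the source and the copy. The sketch below uses the POSIX cksum utility. original.fastq, copy.fastq and verdict.txt are hypothetical names, and the truncation is simulated with head -c.

```shell
# Hypothetical stand-ins for the original file and a copy cut short in transit.
printf '@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n' > original.fastq
head -c 20 original.fastq > copy.fastq

# Reading from stdin, cksum prints the CRC and the byte count (no file name),
# so the two strings are equal only if both content and size match.
orig_sum=$(cksum < original.fastq)
copy_sum=$(cksum < copy.fastq)

if [ "$orig_sum" = "$copy_sum" ]; then
    echo "copy matches original" > verdict.txt
else
    echo "copy differs from original" > verdict.txt
fi
cat verdict.txt
```

If the checksums differ, copying the file again is safer than trimming the damaged one, since trimming only hides the reads that were lost.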