Producing the Reverse-complement of each sequence in fastq files
7.8 years ago

I want to produce the reverse complement of each sequence in my FASTQ files. I tried fastx_reverse_complement from the FASTX-Toolkit, but got the following error message: fastx_reverse_complement: Invalid quality score value (char '#' ord 35 quality value -29) on line 4

Is something wrong here? Is there any software that can produce the reverse complement of each sequence in a FASTQ file?


I think you can write your own script to get the reverse complement using Biopython.
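For anyone who wants to see what such a script boils down to, here is a minimal sketch. It uses plain Python rather than Biopython, so it runs without extra installs. The function names are made up for illustration, and it assumes strict four-line FASTQ records (no wrapped sequence lines) and only the bases A, C, G, T and N. The important detail is that the quality string has to be reversed along with the sequence so that each score stays attached to its base.

```python
# Complement table for DNA bases (upper and lower case). Other IUPAC
# ambiguity codes are left unchanged in this sketch.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(header, seq, plus, qual):
    """Reverse-complement the bases and reverse the qualities to match."""
    return header, seq.translate(COMPLEMENT)[::-1], plus, qual[::-1]


def reverse_complement_fastq(lines):
    """Yield reverse-complemented records from an iterable of FASTQ lines.

    Assumes every record is exactly four lines long; blank lines are skipped
    and an incomplete trailing record is silently dropped.
    """
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    # zip over the same iterator four times groups the lines into records.
    for header, seq, plus, qual in zip(stripped, stripped, stripped, stripped):
        yield reverse_complement_record(header, seq, plus, qual)


if __name__ == "__main__":
    example = ["@test\n", "AAACCTGG\n", "+\n", "III#IIEE\n"]
    for record in reverse_complement_fastq(example):
        print("\n".join(record))
```

To process a real file, pass an open file handle instead of the example list, for instance the handle returned by open("reads.fq") or gzip.open("reads.fq.gz", "rt") (the file names here are placeholders).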


Maybe your version of FASTX-Toolkit is too old? With version 0.0.14 the same '#' quality character is accepted:

$ cat toto.fq 
@test
AAACCTGG
+
III#IIEE

$ fastx_reverse_complement -i toto.fq 
@test
CCAGGTTT
+
EEII#III

$ fastx_reverse_complement -h
usage: fastx_reverse_complement [-h] [-r] [-z] [-v] [-i INFILE] [-o OUTFILE]
Part of FASTX Toolkit 0.0.14 by A. Gordon (assafgordon@gmail.com)

Edit: link to the commit that deprecated -Q33.


Can you explain to me why you are after the reverse complement of the FASTQ sequences please? Thanks.

7.8 years ago

You could try seqkit (v0.4.5 or later; run seqkit version to check), which provides precompiled executables for Linux, Windows and OS X. Just download, decompress and run.

$ seqkit seq t.fq.gz 
@K00137:236:H7NLVBBXX:6:1126:29721:23241 1:N:0
TGGTAGGGAGTTGAGTAGCATGGGTATAGTATAGTGTCATGATGCCAGATTTTAAAAAAAATACTGGAGA
+
```eeiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

$ seqkit seq -r -p  t.fq.gz 
@K00137:236:H7NLVBBXX:6:1126:29721:23241 1:N:0
TCTCCAGTATTTTTTTTAAAATCTGGCATCATGACACTATACTATACCCATGCTACTCAACTCCCTACCA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiee```

All in one:

seqkit seq -r -p t.fq.gz | gzip -c  > new.fq.gz  # faster

or

seqkit seq -r -p t.fq.gz -o new.fq.gz

But it seems nobody reverse-complements FASTQ sequences.


Best solution, and it worked.


Thanks shenwei356 for your answer! You said that nobody reverse-complements FASTQ sequences, but I'm wondering: does it make sense to reverse-complement mate-pair reads that are in a reverse-forward orientation when you want to feed them into an assembler that normally expects reads in a forward-reverse orientation? That's what is done in https://thegenomefactory.blogspot.com/2012/09/using-velvet-with-mate-pair-sequences.html.

7.8 years ago
theobroma22 ★ 1.2k

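The error says you have an invalid quality score: negative values such as the -29 in your message are not allowed. Note that '#' is ASCII 35 and 35 - 64 = -29, so the tool appears to be decoding your qualities with an offset of 64, while your file probably uses an offset of 33 (where '#' would be Q2). Hope this helps.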
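To see where a value like -29 comes from, the quality line can be decoded under both common ASCII offsets. This is a minimal sketch in Python; the helper names are made up for illustration, and the quality string is the one from the toto.fq example earlier in the thread. The threshold of 59 is an assumption based on the lowest character used by the old offset-64 encodings (Solexa scores go down to -5, which is ';', ASCII 59).

```python
def phred_scores(quality, offset):
    """Decode a FASTQ quality string into integer scores for an ASCII offset."""
    return [ord(ch) - offset for ch in quality]


def rules_out_offset_64(quality):
    """Return True if the string contains a character too low for offset 64.

    Crude check: any character below ';' (ASCII 59) cannot occur in the
    offset-64 encodings, so the string must use offset 33.
    """
    return min(ord(ch) for ch in quality) < 59


if __name__ == "__main__":
    quality = "III#IIEE"
    print(phred_scores(quality, 64))     # [9, 9, 9, -29, 9, 9, 5, 5]
    print(phred_scores(quality, 33))     # [40, 40, 40, 2, 40, 40, 36, 36]
    print(rules_out_offset_64(quality))  # True
```

Decoded with offset 64, the '#' gives exactly the -29 from the error message, while offset 33 gives sensible scores throughout. If that matches your data, an older FASTX-Toolkit may need to be told the offset explicitly; as far as I know, that is what the -Q33 option mentioned in the edit note above was for.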
