Error in RNA-Seq QC analysis step
10.0 years ago
David_emir ▴ 500

Hello all,

I was trying to run QC with fastx_quality_stats to calculate the nucleotide distribution. The command is as follows:

[Ateeq@BIO-DT-415 NEW_NGS]$ /usr/local/bin/fastx_quality_stats -i SRR1604991.fastq -o SRR1604991.txt-Q33

But I got the following error:

fastx_quality_stats: Invalid quality score value (char '#' ord 35 quality value -29) on line 4

I used -Q33 as well, but that did not resolve it. Please help me out, I am new to this. Thanks a ton.

A few lines of my FASTQ file for reference:

[Ateeq@BIO-DT-415 NEW_NGS]$ head SRR1604991.fastq
@SRR1604991.1 NRUSCA-WDL30516:178:H9C98ADXX:1:1101:1405:1847 length=200
NTCCTACTCTTCTTAGCGCCTACCCTCATACCTATCTCCCTCCTCCCATCTCCTAGGGGACTGGCGCCAAATGGTCTCTCCCTGCCAATTTTGGTATNTTCCGGGACCAAAATAAAGAGCAAGCAGGCCCCCTTCACTGAGGTGCTGGGTAGGGCTCAGTGCCACATTACTGTGCTTTGAGAAAGAGGAAGGGGATTNGT
+SRR1604991.1 NRUSCA-WDL30516:178:H9C98ADXX:1:1101:1405:1847 length=200
#1=DFFFFHHHHHJJJJJJJJJJJJJJJIJJJJJJJIIJJIJJHIJJJIJGIJIJJJJJIIHHHHFFDDDDDDDCCDDDDDDDDDDDDDDDDDD<C####CCCFFFFFHHHHHJJJJIJJJJJJJJJJJJJJHIJIJIJIII=FGGIIJ=FHIIJEHHH;@DDDDEECEEEE>CDDDDDDDDDDDCCDDDDDDDBDD###
@SRR1604991.2 NRUSCA-WDL30516:178:H9C98ADXX:1:1101:1496:1889 length=200
NTGAGTAGCACTCTCTGAGAGCTCCAATTTCATCCGTCTGCCATCGGCGCCATCCTGCAATCTAAGCCACAATGGTGCGCATGAATGTCCTGGCAGATGCCTCTTTTCGGCATTGTTGATACTCTTGAGAGCATCTGCCAGGACATTCATGCGCACCATTGTGGCTTAGATTGCAGGATGGCGCCGATGGCAGACGGATG
+SRR1604991.2 NRUSCA-WDL30516:178:H9C98ADXX:1:1101:1496:1889 length=200
#4=DDDDFHHHHHHJJJJJJJJJJJJJJJJJJJJJJHHHIIJJJJJJIGIJIIIJHHHHHGFDFFFFEEE;65;>;@CDDDDDDDEDDEEDDDDDDDDDDCCCFFFFFHHHHHJJJJJJJJJIJJJJJJJJJJJJHIIJJJJJJJJJJJIIHIJGIJIJJJEHGHHFFFFFFEEEEEDDDDDDDDDDDDDDDDDDDDDDD
@SRR1604991.3 NRUSCA-WDL30516:178:H9C98ADXX:1:1101:1498:1946 length=200
GCCACCACGGAGCGAGAAGCCCAGATAGACGCCCCGGCGGCCCCGGGTCCTGGAGTCCCGCCGCCTGCTGCCCGGCCGAGGACCCCACCCCGCCTGCCGCCCTGTCCATGGCGGGCCCCACTGCAAGCATCGGGCGGCAGGCGGGGTGGGGTCCTCGGCCGGGCAGCAGGCGGCGGGACTCCAGGACCCGGGGCCGCCGG
[Ateeq@BIO-DT-415 NEW_NGS]$ /usr/local/bin/fastx_quality_stats -i SRR1604991.fastq -o SRR1604991.txt-Q33
fastx_quality_stats: Invalid quality score value (char '#' ord 35 quality value -29) on line 4

-Ateeq Khaliq

RNA-Seq • 2.5k views

If you specify -Q33 it should work, since "#" is a valid symbol in Phred+33 encoding (ASCII 35, i.e. quality 2), as you can see from this page.

I tried it with your example and it works fine for me.

I initially thought it was a typo, but did you check that there is a space before -Q33 in your command?
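
To make the arithmetic behind the error message concrete, here is a minimal Python sketch of how a quality character maps to a score under a given ASCII offset. The helper name decode_quality is only illustrative and is not part of fastx. The "quality value -29" in the error matches what you get when "#" is decoded with an offset of 64, which suggests the -Q33 option was not being picked up in that run.

```python
def decode_quality(qual, offset=33):
    """Convert a FASTQ quality string to a list of integer scores.

    offset=33 corresponds to Phred+33 (Sanger / Illumina 1.8+),
    offset=64 to the older Phred+64 Illumina encoding.
    """
    return [ord(ch) - offset for ch in qual]


# A few characters taken from the quality line of the first read.
print(decode_quality("#1=DFJ", offset=33))  # [2, 16, 28, 35, 37, 41]

# The same '#' decoded with offset 64 gives the value from the error message.
print(decode_quality("#", offset=64))  # [-29]
```

With offset 33, "#" is a perfectly valid low-quality call (2), so the file itself looks fine.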

I have no clue why it is giving an error :( Now it shows the same error for 'J'. Please have a look :(

[Ateeq@BIO-DT-415 NEW_NGS]$ /usr/local/bin/fastx_quality_stats -i SRR1604991.fastq -o SRR1604991.txt -Q33
fastx_quality_stats: Invalid quality score value (char 'J' ord 74 quality value 41) on line 4

Edit: I checked, space is making no difference :(

10.0 years ago
Martombo ★ 3.1k

OK, then apparently it is using the old Phred+33 (Sanger) scores, which were defined only up to 40, i.e. "I", whereas the newer Illumina scheme also allows "J" (41). As I said, I ran your command on your example reads and it gave me no error. Maybe you are using an old version of fastx? Mine is 0.0.13.2. What is yours?

Alternatively, you could use another tool such as FastQC, which gives similar output.
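
If you want to check the file independently of any tool version, a rough Python sketch like the one below reports the lowest and highest quality characters and guesses the offset. It assumes plain four-line FASTQ records (no wrapped sequence lines). The function names and the tiny in-memory sample are only illustrative; in practice you would pass an open file handle for your own FASTQ.

```python
import io


def quality_range(handle):
    """Return (lowest, highest) ASCII code seen on the quality lines.

    Assumes four lines per record, so every 4th line is a quality string.
    """
    lo, hi = None, None
    for i, line in enumerate(handle):
        if i % 4 != 3:
            continue
        codes = [ord(ch) for ch in line.rstrip("\n")]
        if not codes:
            continue
        lo = min(codes) if lo is None else min(lo, min(codes))
        hi = max(codes) if hi is None else max(hi, max(codes))
    return lo, hi


def guess_offset(lo, hi):
    """Heuristic: characters below ';' (59) cannot occur in +64 encodings."""
    if lo < 59:
        return 33
    # Only high characters were seen: probably +64, but not certain.
    return 64


# Tiny made-up sample with the same extreme characters as the thread's reads.
sample = "@r1\nNTCC\n+\n#1=D\n@r2\nACGT\n+\nJJJI\n"
lo, hi = quality_range(io.StringIO(sample))
offset = guess_offset(lo, hi)
print(chr(lo), chr(hi), offset, hi - offset)  # '#' 'J' 33 41
```

A top score of 41 under offset 33 is what the "J" character means, so a tool that only accepts scores up to 40 will reject such a file even though it is valid.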


Thanks a lot, I resolved this. It was a version issue! Thanks again, this helped me a lot!
