Question

read quality of sequence by fastqc

8.4 years ago

nora ▴ 40

Hello, please help me. When I tried to check sequence quality with FastQC in the Galaxy interface, I received a message that says: No known encodings with chars < 33 (the FASTA file was downloaded from GenBank and UniProt). Thank you in advance.

software error • 2.6k views

ADD COMMENT • link updated 8.4 years ago by Ido Tamir 5.2k • written 8.4 years ago by nora ▴ 40

FASTA files don't have quality lines. Can you paste an example of the input file?

ADD REPLY • link 8.4 years ago by Asaf 10k

Well, the file was downloaded via this website: http://www.ncbi.nlm.nih.gov/genome/?term=Lactobacillus%20fermentum%5BOrganism%5D&cmd=DetailsSearch

ADD REPLY • link 8.4 years ago by nora ▴ 40

There are no FASTQ format files on this page (hence you can't use FastQC, as indicated by @Asaf). If you are actually using a FASTQ formatted file, then provide a direct link to it.

ADD REPLY • link 8.4 years ago by GenoMax 148k

I converted the FASTA file to FASTQ (tabular lines were written as FASTQ reads).

ADD REPLY • link 8.4 years ago by nora ▴ 40

See "convert FASTA into FASTQ using linux" for information about why converting FASTA to FASTQ will not give meaningful quality scores (which is most of what FastQC operates on). Additionally, most of the FastQC metrics are only useful if you have more than one read, or sequence, in your file. It sounds like you have one FASTA file and maybe want some information about its nucleotide content?
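
If nucleotide content is what is wanted, it can be computed straight from the FASTA file without any quality scores. The following is a minimal sketch in plain Python, not a tool mentioned in this thread; the function names `base_composition` and `gc_fraction` and the example record are made up for illustration.

```python
from collections import Counter

def base_composition(fasta_lines):
    """Count residues over all sequence lines, skipping '>' header lines."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        counts.update(line.upper())
    return counts

def gc_fraction(counts):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    total = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / total if total else 0.0

# Hypothetical two-line record, only for demonstration.
example = [">seq1 hypothetical record", "ACGTGGCC", "AATT"]
counts = base_composition(example)
print(dict(counts), gc_fraction(counts))
```

For a real file, the same function can be given an open file handle, for example `base_composition(open("genome.fasta"))`, where the file name is a placeholder.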

ADD REPLY • link 8.4 years ago by Matt Shirley 10k

I wanted to know the meaning of the message No known encodings with chars < 33, because when I used the full file I did not get a result, but when I used just a part of the sequence the FastQC tool worked.

ADD REPLY • link 8.4 years ago by nora ▴ 40

Did you see the link to the Wikipedia article on the FASTQ format that @Ido provided in his answer below?

BTW: How did you convert the FASTA to FASTQ (did you use your own code)? Tools I know of (reformat.sh from BBMap) generally set all Q-scores to a fixed fake value for all bases.
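
To illustrate what "a fixed fake value" means, here is a minimal sketch of such a conversion in plain Python. It is not how reformat.sh or the Galaxy tools are implemented; the helper names `read_fasta` and `fasta_to_fastq` and the choice of `I` as the fake quality character are assumptions for the example. In the Sanger (Phred+33) encoding, `I` has ASCII value 73 and so stands for Q40.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs, joining wrapped sequence lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def fasta_to_fastq(lines, fake_quality_char="I"):
    """Write each FASTA record as a 4-line FASTQ record with one repeated quality char."""
    out = []
    for name, seq in read_fasta(lines):
        out.append(f"@{name}\n{seq}\n+\n{fake_quality_char * len(seq)}\n")
    return "".join(out)

# Hypothetical record with a wrapped sequence.
fastq_text = fasta_to_fastq([">seq1", "ACGT", "AC"])
print(fastq_text, end="")
```

Every base gets the same score, so the per-base quality plots in FastQC carry no information for a file produced this way.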

ADD REPLY • link 8.4 years ago by GenoMax 148k

I converted the FASTA file to tabular and then to FASTQ in the Galaxy interface.

ADD REPLY • link 8.4 years ago by nora ▴ 40

score 1 · Answer 1 · 2016-08-01

The Galaxy Q&A is https://biostar.usegalaxy.org/
Check if it's really a FASTQ file by looking at it and comparing it to https://en.wikipedia.org/wiki/FASTQ_format (maybe it's FASTA?).
Unfortunately for beginners, you might have to set the correct FASTQ format for some tools. I think it's enough to go to "edit attributes" (pencil symbol), then "datatype", and then I think it's fastqsanger. There is also the FASTQ Groomer tool, but I think it's not necessary. And FastQC normally copes with all formats, so maybe it's really not FASTQ.
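
One way to "look at it" programmatically is to find the lowest character on the quality lines. In the encodings described in the Wikipedia article, quality characters start at ASCII 33 (`!`), so a character below 33 (a space, a tab, or another control character) would explain the message. The sketch below is plain Python, assumes strict 4-line FASTQ records, and uses a made-up function name `lowest_quality_char`. A stray space or tab left over from the tabular step is only one possible cause, not something confirmed in this thread.

```python
def lowest_quality_char(fastq_lines):
    """Return (char, ASCII value) of the lowest quality character, or None.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    lowest = None
    for index, line in enumerate(fastq_lines):
        if index % 4 == 3:  # every fourth line is the quality line
            for ch in line.rstrip("\r\n"):
                if lowest is None or ch < lowest:
                    lowest = ch
    return (lowest, ord(lowest)) if lowest is not None else None

# Hypothetical records: the second one has a space inside its quality line.
good = ["@r1\n", "ACGT\n", "+\n", "IIII\n"]
bad = ["@r2\n", "ACGT\n", "+\n", "II I\n"]

print(lowest_quality_char(good))
print(lowest_quality_char(bad))
```

If the reported value is below 33, the file is not valid FASTQ as it stands, and the conversion step is the place to look.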