Looking for a free command-line quality score tool for Sanger sequencing
7.4 years ago
hfan22 ▴ 40

Dear all,

I got some 16S (forward and reverse) Sanger sequencing data (from an ABI 3730xl DNA Analyzer) from our on-campus sequencing facility. The files came in ABI format, but without quality scores. When I asked for quality scores, the staff at the sequencing facility used KB Basecaller but warned me that its scores are inflated. They recommended Staden, which gives quality scores close enough to true Phred scores, but it is not command-line based and therefore not suited to batch processing. Phred is command-line based but it is not free. It seems the Phred score calculation is patented (https://www.google.ch/patents/US6681186); maybe that is why it is hard to find a tool?

Ideally I would like to have quality scores applied to those trace files so I can start trimming and merging. I have only worked with next-generation sequencing data and was always given fastq files, so I was also wondering whether it is normal to be given trace files without quality scores.

Any advice is appreciated.

sanger sequencing quality score fastq abi • 3.0k views

"Phred is command-line based but it is not free."

Only if you are a commercial user.

Out of curiosity, how many files do you have, and is a command-line tool a must? You may already have access, via your institution, to one of the several commercial programs that can handle .ab1 files (e.g. DNASTAR, Vector NTI, Sequencher).


Thank you h.mon! Yes, it's free for academic use. I should have read the Phred page more carefully (http://www.phrap.org/consed/consed.html#howToGet). Is there anything I could do to minimize future misunderstanding, such as editing or deleting my original post?

I don't have a lot of samples, 50-ish, but I prefer not to do things manually.


Please use ADD COMMENT or ADD REPLY to respond to earlier comments, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see the result is not optimal. Adding an answer should only be used to provide a solution to the question asked.


Sorry, WouterDeCoster. I will follow the instructions next time.

7.4 years ago
h.mon 35k

You can use Staden from the command line, with the -nowin flag. Scavenging through my old stuff, I found this:

pregap4 -nowin -config ~/seqs/pregap.conf -fofn "$NAME.files" > "$NAME.output"

If my memory is correct, I ran pregap once with the Tk GUI, configured it as needed and saved the configuration. After that, I could run it with the above command line, editing the conf file as needed.
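For the 50-ish files mentioned in the question, the command above could be wrapped in a small batch loop. This is only a sketch under assumptions: the one-directory-per-sample layout, the function names (`make_fofn`, `run_pregap`, `process_samples`) and the `PREGAP` dry-run switch are hypothetical. Only the `-nowin`, `-config` and `-fofn` flags come from the command above, and whether pregap4 expects bare file names or paths in the file of file names should be checked against your own setup.

```shell
#!/bin/sh
# Hypothetical layout: one directory of .ab1 traces per sample.
# CONF is assumed to be a configuration saved once from the pregap4 GUI.
CONF="${CONF:-$HOME/seqs/pregap.conf}"
PREGAP="${PREGAP:-pregap4}"   # set PREGAP=echo to print the command instead of running it

# Write NAME.files (a file of file names) listing every trace in DIR.
make_fofn() {
    dir=$1
    name=$2
    : > "$name.files"
    for trace in "$dir"/*.ab1; do
        [ -e "$trace" ] || continue      # skip when the glob matches nothing
        printf '%s\n' "$trace" >> "$name.files"
    done
}

# Run pregap4 without the GUI on one fofn, logging to NAME.output.
run_pregap() {
    name=$1
    $PREGAP -nowin -config "$CONF" -fofn "$name.files" > "$name.output"
}

# Process every sample directory passed as an argument.
process_samples() {
    for dir in "$@"; do
        name=$(basename "$dir")
        make_fofn "$dir" "$name"
        run_pregap "$name"
    done
}
```

After sourcing the file, something like `process_samples traces/*/` would write one `.files` and one `.output` per sample directory into the current directory; trying it first with `PREGAP=echo` shows the commands without running them.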

Keep in mind that, while Staden base-calling is reasonable, Phred beats it by a fairly large margin, and Phred is free for academic use.

Edit: besides, I do not think the claim that KB Basecaller produces inflated scores is correct, either from my experience or from the literature: "A direct comparison of the KB™ Basecaller and phred for identifying the bases from DNA sequencing using chain termination chemistry".
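As background for the "inflated" question: a Phred-scaled quality Q is defined as Q = -10 * log10(P), where P is the estimated probability that the base call is wrong. Scores would be inflated if the error rate actually observed at a given Q were higher than the P it implies. A minimal sketch of the conversion, with a hypothetical function name `phred_to_prob`:

```shell
# Convert a Phred-scaled quality score to its implied error probability,
# using the standard definition Q = -10 * log10(P), i.e. P = 10^(-Q/10).
phred_to_prob() {
    LC_ALL=C awk -v q="$1" 'BEGIN { printf "%.4f\n", exp(-q / 10 * log(10)) }'
}
```

For example, `phred_to_prob 20` prints `0.0100` (one expected error in 100 calls) and `phred_to_prob 30` prints `0.0010`.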
