Is there any software out there that can export raw traces from .ab1 or .scf files as coordinates? By this I mean extracting a table of amplitude at each horizontal pixel (raw X-Y data) so that I can reconstruct the trace in Excel or some other graphing program. Thanks in advance for your insight.
As answered by lh3, io_lib is probably the best option. Applied Biosystems are one of those companies that like to deal in proprietary formats that no-one else can read easily.
You may also be able to use abiview from the EMBOSS suite, using the -graph data option:
abiview -graph data -outseq myseq.fa myseq.ab1
When I run this on a sample .ab1 found on my hard drive, it creates 4 ASCII files, named abiviewN.dat, where N is 1-4. They seem to contain raw coordinates (a few sample lines):
Your abiview suggestion is fantastic. The Windows port (mEMBOSS) even lets me do it right at the Windows command prompt. The four files correspond to each of the dye terminators (for A, C, G, and T). Thanks again!
@Gregory: How do we interpret the output from abiview above? What order are the 4 data files in (GATC or ACGT)? Any assistance or links to resources for this would be useful.
The order is G,A,T,C by default. You can change the selection using the -bases option.
(written 10.5 years ago by Gregory; updated 4.9 years ago by Ram)
Anyone able to give any more information on the file format of these four files output from abiview? I've searched online and can find nothing. I compared the content of these four files with a trace, but found nothing helpful. Do we even know if the content of these four files is "correct"?
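For anyone who wants to check the files numerically rather than by eye, here is a minimal Python sketch for pulling X-Y pairs out of one of the abiviewN.dat files. It assumes, without my having verified it against the EMBOSS documentation, that each data line holds two whitespace-separated numbers and that everything else (header or annotation lines) can be skipped. The function name read_xy_pairs is made up for this example.

```python
def read_xy_pairs(lines):
    """Collect (x, y) pairs from lines that contain exactly two numbers.

    Assumption: data lines are "x y" (tab or space separated); any line
    that does not parse as two floats is treated as a header and skipped.
    """
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            x, y = float(fields[0]), float(fields[1])
        except ValueError:
            # Non-numeric line, e.g. a "##..." style header.
            continue
        points.append((x, y))
    return points


if __name__ == "__main__":
    import sys

    # Usage: python read_dat.py abiview1.dat
    with open(sys.argv[1]) as handle:
        for x, y in read_xy_pairs(handle):
            print(x, y, sep="\t")
```

Plotting the recovered pairs next to the chromatogram from a trace viewer is one way to judge whether the files hold what you expect.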
# install the perl modules:
> sudo cpan Bio::Trace::ABIF Array::Transpose
# perl one-liners (which arguably should be in a script):
> perl -MBio::Trace::ABIF -MArray::Transpose -e 'BEGIN{$abif=Bio::Trace::ABIF->new($_); $,="\t",$\="\n"}; $abif->open_abif(@ARGV) or die "can not open"; print @$_ foreach transpose ([map {[$_,$abif->raw_trace($_)]} qw(A T G C)])' my.ab1 > my_ab1_trace.tab
Your solution is considerably more elegant and efficient than running abiview and manually merging the four output files, to be sure. All the more reason to learn perl...
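If you then want the table in something other than Excel, a short Python sketch for reading it back could look like the following. It assumes the one-liner above wrote a header row (A, T, G, C) followed by one tab-separated row of integer amplitudes per scan point. The function name load_trace_table is hypothetical.

```python
import csv


def load_trace_table(handle):
    """Read a tab-separated trace table into {base: [amplitudes]}.

    Assumption: the first row names the columns (e.g. A, T, G, C) and
    every following row holds one integer amplitude per column.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    columns = {base: [] for base in header}
    for row in reader:
        for base, value in zip(header, row):
            if value != "":  # tolerate missing cells in short channels
                columns[base].append(int(value))
    return columns


if __name__ == "__main__":
    with open("my_ab1_trace.tab", newline="") as handle:
        traces = load_trace_table(handle)
    for base, values in traces.items():
        print(base, len(values), "points")
```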
If you want to roll your own .abi file parser, this technical examination of the format by Clark Tibbetts could be helpful, if a little dated (circa 1995):
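As a rough starting point for such a parser (and separate from the reference above), here is a Python sketch based on my understanding of the publicly documented ABIF layout: a 4-byte "ABIF" signature, a 2-byte version, then a 28-byte big-endian root directory entry at offset 6 that points to a table of 28-byte entries. Items of 4 bytes or fewer are stored inline in the offset field. The function names are made up, and which DATA numbers hold raw versus processed channels (commonly given as 1-4 and 9-12) should be checked against your own files.

```python
import struct

# name(4s) number(i) elem_type(h) elem_size(h) n_elems(i) data_size(i)
# data_offset(i) data_handle(i): 28 bytes, big-endian, no padding.
DIR_ENTRY = struct.Struct(">4sihhiiii")
INLINE_FIELD_OFFSET = 20  # position of data_offset inside an entry


def read_abif_directory(buf):
    """Return {(tag_name, tag_number): (elem_type, n_elems, raw_bytes)}."""
    if buf[:4] != b"ABIF":
        raise ValueError("missing ABIF signature")
    # The root entry sits right after the signature and 2-byte version.
    _, _, _, _, count, _, dir_offset, _ = DIR_ENTRY.unpack_from(buf, 6)
    entries = {}
    for i in range(count):
        pos = dir_offset + i * DIR_ENTRY.size
        name, number, etype, _, n, size, offset, _ = DIR_ENTRY.unpack_from(buf, pos)
        # Small items live inside the entry itself instead of at an offset.
        data_pos = pos + INLINE_FIELD_OFFSET if size <= 4 else offset
        entries[(name.decode("ascii"), number)] = (etype, n, buf[data_pos:data_pos + size])
    return entries


def read_trace(entries, number):
    """Decode a DATA item as a list of signed 16-bit integers."""
    etype, n, raw = entries[("DATA", number)]
    if etype != 4:  # 4 = 16-bit signed integer in the ABIF type table
        raise ValueError("DATA %d is not a short-integer array" % number)
    return list(struct.unpack(">%dh" % n, raw))


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        directory = read_abif_directory(handle.read())
    for amplitude in read_trace(directory, 9):
        print(amplitude)
```

This only lists amplitudes for one channel; mapping channels to bases needs the dye order stored in the file, which the sketch does not read.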
There are a lot of tools (even free ones) for manually checking traces; you would most likely be better off using one of those than trying to do your analysis in Excel.
Probably the easiest way is to go with the Sequence Scanner Software directly from Applied Biosystems (for Windows, free). It can print, edit, and export your chromatograms.
Thanks for your suggestions. The software that I looked into, including AB's Sequence Scanner Software, exports traces only in graphical form (jpg, pdf, etc.). My reason for wanting the raw coordinates actually has little to do with base calling; I want to calculate peak areas. I agree that few people would actually want the coordinates, but I'm really hoping that something out there offers this feature.
I have the same question, as I have not managed to export raw abi data to Excel. My goal is to calculate the area under each peak, so any more suggestions are appreciated.
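Once the amplitudes for one channel are in a plain list, the area under a peak can be estimated with the trapezoidal rule. The Python sketch below assumes unit spacing between scan points and that you pick the peak boundaries (start and end indices) yourself. The function name peak_area and the flat baseline parameter are illustrative choices, not something taken from any of the tools mentioned in this thread.

```python
def peak_area(amplitudes, start, end, baseline=0.0):
    """Trapezoidal area under amplitudes[start..end] (inclusive).

    Assumptions: scan points are evenly spaced one unit apart, and a
    constant baseline is subtracted from every point before integrating.
    """
    if not 0 <= start < end < len(amplitudes):
        raise ValueError("need 0 <= start < end < len(amplitudes)")
    window = [a - baseline for a in amplitudes[start:end + 1]]
    return sum((window[i] + window[i + 1]) / 2.0 for i in range(len(window) - 1))


if __name__ == "__main__":
    trace = [0, 2, 4, 2, 0]
    print(peak_area(trace, 0, 4))
```

Choosing the boundaries and baseline is the hard part with overlapping peaks, so treat the number as an estimate.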
Good to hear. I thought the files might be A,C,G,T but was a little confused by their content - glad you worked it out.