To understand what the data looks like in an AB1 file, you will want to refer to http://www6.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
If you are comfortable in R, you might reach for http://bioconductor.org/packages/release/bioc/html/sangerseqR.html and work with the output of peakAmpMatrix (or traceMatrix).
Or, in Perl, there is http://search.cpan.org/~vita/Bio-Trace-ABIF-1.05/lib/Bio/Trace/ABIF.pm
However, the specifics of what you want to compute are not entirely clear to me.
Going by the definitions in the vignette http://bioconductor.org/packages/release/bioc/vignettes/sangerseqR/inst/doc/sangerseq_walkthrough.pdf:
P1AM.1: Amplitude of primary basecall peaks.
P2AM.1 (optional): Amplitude of the secondary basecall peaks.
Then "the ratio of the peak value of nucleotide with the strongest signal at one position to the peak value of the nucleotide with the second strongest signal" is simply P1AM.1 / P2AM.1, which can be computed for the demo AB1 file as follows:
> x <- read.abif(system.file("extdata", "heterozygous.ab1", package = "sangerseqR"))
> x@data$P1AM.1 / x@data$P2AM.1
Note, however, that this value is a ratio of peak amplitudes, not of areas under the peaks. It is usually greater than one, but not always. You may want to reconsider exactly what you are trying to determine. Good luck.
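As a follow-up to the caveat above, here is a minimal sketch of computing the ratio defensively. It uses made-up amplitude vectors in place of x@data$P1AM.1 and x@data$P2AM.1, and the helper name peak_ratio is hypothetical, not part of sangerseqR. It assumes you want NA wherever the secondary amplitude is zero or missing, and that positions with a ratio below one deserve a second look.

```r
# Ratio of primary to secondary peak amplitude, one value per basecall.
# p1, p2: numeric vectors such as x@data$P1AM.1 and x@data$P2AM.1.
peak_ratio <- function(p1, p2) {
  if (is.null(p1) || is.null(p2)) {
    stop("amplitude field missing; check names(x@data)")
  }
  stopifnot(length(p1) == length(p2))
  ratio <- as.numeric(p1) / as.numeric(p2)
  # A zero or missing secondary peak gives Inf, NaN or NA; report all as NA.
  ratio[!is.finite(ratio)] <- NA_real_
  ratio
}

# Made-up amplitudes for four basecalls (illustration only).
p1 <- c(1200, 800, 450, 300)
p2 <- c(100, 0, 500, 150)

r <- peak_ratio(p1, p2)
r

# Positions where the "secondary" peak is actually the stronger one.
which(r < 1)
```

With a real file you would pass the two fields from x@data instead of the toy vectors.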
Could I get the values for the 4 different bases, in the form of some parameter, at each position?
I modified my answer to more fully answer your original question. This comment poses a new related but different question. If you get to the point where you think my answer addresses your original question, you will probably have learned enough to answer this new one. Give it a go!
I need to import files from an external drive for which I modified it to
But I am getting an error message.
Any idea how to correct this? What else do I need to modify in the code?
You are getting quite far afield from your original question now... try:
I managed to import the file using the command you suggested, but I am unable to access its contents (the different parameters of the ab1 file, such as P1AM and P2AM like the ones given here) using the $ operator. I guess it doesn't work for this class. I managed to import different parameters using x@data$ but when I use x@data$P1AM.1 it gives the output 'NULL'. Any reason behind this? Am I using the wrong code?
I'm guessing: perhaps the basecaller was not run? Where did you get the file? Have you looked at it in a chromatogram viewer? If not, you should, perhaps with the free FinchTV.
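One quick check, sketched here with a made-up list standing in for x@data: in R, the $ operator on a list returns NULL for any name that is not present, so a NULL result most likely means the file simply has no P1AM.1 field. The helper name missing_fields and the toy list are hypothetical.

```r
# Report which of the wanted fields are absent from a named list.
missing_fields <- function(data, wanted) {
  setdiff(wanted, names(data))
}

# Toy stand-in for x@data from a file without secondary-peak fields.
fake_data <- list(
  DATA.9 = c(0, 5, 90, 6, 0),
  PLOC.2 = c(3L),
  PBAS.2 = "A"
)

# Asking for an absent element gives NULL, not an error.
is.null(fake_data$P1AM.1)

absent <- missing_fields(fake_data, c("P1AM.1", "P2AM.1", "PLOC.2"))
absent
```

On a real object, names(x@data) lists every field that was read from the file.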
I am using the files that I got from 1stBase sequencing. I have looked at the chromatograms in Geneious and I see signals for nucleotides.
I know this is a fairly old post, but I will try and ask it here anyways
I am looking to do some fairly specific QC by looking into the signal information of an abi file. I have the data read into R using sangerseqR and can access the S4 data just fine. My question is: What does it mean? Specifically, the traceMatrix and DATA.9-DATA.12. The trace matrix is much larger than my sequence, but is the same length as the DATA fields. Is there a standard window for interpreting these values?
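Not a definitive answer, but a sketch of one way to read these fields. It assumes, going by the ABIF format document linked at the top of the thread, that DATA.9 to DATA.12 hold one value per scan point for each of the four channels, and that PLOC.2 holds the scan index at which each base was called. That would explain why the trace is much longer than the sequence. I am not aware of a standard window, so half_window below is a hypothetical parameter, as is the helper name signal_at_calls, and the data is made up. Check whether your peak positions are 0-based or 1-based before indexing.

```r
# For each called base, take the maximum signal per channel within
# +/- half_window scan points of the peak position.
signal_at_calls <- function(trace, peak_pos, half_window = 0) {
  n <- nrow(trace)
  out <- t(vapply(peak_pos, function(p) {
    lo <- max(1, p - half_window)
    hi <- min(n, p + half_window)
    apply(trace[lo:hi, , drop = FALSE], 2, max)
  }, numeric(ncol(trace))))
  colnames(out) <- colnames(trace)
  out
}

# Toy trace: 10 scan points, columns in A, C, G, T order.
trace <- cbind(
  A = c(0, 5, 90, 6, 0, 0, 2, 10, 3, 0),
  C = c(0, 0, 2, 0, 0, 1, 20, 60, 15, 0),
  G = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  T = c(0, 0, 0, 0, 0, 0, 0, 30, 0, 0)
)
peak_pos <- c(3, 8)  # 1-based scan indices of two called bases

# One row per called base, one column per channel.
at_peaks <- signal_at_calls(trace, peak_pos)
at_peaks

# Strongest over second-strongest signal at each called base.
ratio <- apply(at_peaks, 1, function(v) {
  s <- sort(v, decreasing = TRUE)
  s[1] / s[2]
})
ratio
```

With sangerseqR objects, the trace and peak positions would come from traceMatrix and the first column of peakPosMatrix, rather than from the toy values here.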