I'm wondering if anyone knows how I can get the trace data for each base, formatted as a text file. I've tried EMBOSS abiview and Biopython to retrieve these data. EMBOSS generates four files, as suggested here, but the values don't seem to correlate with the chromatogram and trace values shown in the image. Biopython retrieves the data as dictionaries, but in the abif_raw dictionary the keys DATA9-12 are missing.
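For reference, here is a minimal sketch of how the per-base trace values could be pulled out of an abif_raw dictionary when the DATA9-12 keys are present. It assumes the usual ABIF layout: DATA9-12 hold the four processed channels, FWO_1 gives the base order of those channels, and PLOC2 holds the trace index of each called peak. The function names are made up for illustration, and whether your Biopython version exposes these keys is something to check rather than take for granted.

```python
def channel_order(raw):
    """Return the base order of DATA9-12 as a string, e.g. "GATC"."""
    fwo = raw["FWO_1"]
    # Biopython may return this tag as bytes; accept both forms.
    if isinstance(fwo, bytes):
        fwo = fwo.decode("ascii")
    return fwo


def traces_at_peaks(raw):
    """Build one row per called base with the four channel values at its peak.

    Assumes DATA9..DATA12 are the processed traces in FWO_1 order and that
    PLOC2 lists the trace index of every called peak.
    """
    order = channel_order(raw)
    channels = {base: raw[f"DATA{9 + i}"] for i, base in enumerate(order)}
    rows = []
    for basenum, pos in enumerate(raw["PLOC2"], start=1):
        row = {"basenum": basenum, "pos": pos}
        for base in order:
            row[base] = channels[base][pos]
        rows.append(row)
    return rows


def load_abif_raw(path):
    """Read an .ab1 file with Biopython and return its abif_raw dictionary."""
    from Bio import SeqIO  # imported lazily so the helpers above work without it

    record = SeqIO.read(path, "abi")
    return record.annotations["abif_raw"]
```

Something like `traces_at_peaks(load_abif_raw("sample.ab1"))` (file name hypothetical) would then give one dictionary per called base, which is easy to write out as tab-separated text. If DATA9-12 really are absent from the dictionary, this raises a KeyError.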
Chromatogram example with trace values:
This is great, @trausch. I was able to create the outputs, and it appears the .tsv file is what I'm looking for. I tested with one .ab1 file, which has a length of 434 bp; however, there are 10,495 lines in the .tsv file.
Here are the first lines of the .tsv:
If I use GUI software to compare the calls per position, I can see that multiple lines in the .tsv belong to position one in the chromatogram view.
If you look at the basenum column, it should go from 1 to 434. The trace is longer than the called sequence, and every called peak has a basenum value. In the GUI, the x-ticks correspond to the basecall positions.
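To get one line per called base, you can filter the .tsv down to the rows that carry a basenum. Here is a rough sketch. It assumes the file is tab-separated with a header row containing a basenum column, and that rows between peaks hold something non-numeric there (the exact placeholder isn't shown in this thread, so the check is simply "is it an integer"). The function name is made up.

```python
import csv


def called_rows(lines, column="basenum"):
    """Keep only the rows whose basenum column holds an integer.

    `lines` can be an open file or any iterable of text lines.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    kept = []
    for row in reader:
        value = (row.get(column) or "").strip()
        # Rows between called peaks are assumed to have a non-numeric value here.
        if value.isdigit():
            kept.append(row)
    return kept
```

Used as `with open("sample.tsv", newline="") as fh: rows = called_rows(fh)` (file name hypothetical), the 434 bp example above should come back with 434 rows if the assumption about the column holds.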