11.3 years ago
lgbi
After converting a 454 SFF file to FASTQ, I get sequences like this:
tcagAGTACGCTATGTGAATCATCGAATCTTTGAACGCACATTGCGCCCTCTGGTATTCCGGGGGGCATGCCTGTTCGAGCGTCATTATAACCACTCAAGCTCTCGCTTGGTATTGGGGCTCGCGGTTTCGCGGCTCCTAAAATCAGTGGCGGTGCCTATCGGCTCTACGCGTAGTAATACTCCTCGCGATTGAGTCCGGTAGGTCTACTTGCCAGCAACCCCTAATTTTTTTAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGAACTTAAGCATATCAATAAGCGGAGGactgagactgccaaggcacacaggggataggnn
The key sequence, barcode, and forward primer are all OK, but the last nucleotide of (the reverse complement of) the reverse primer is always lower case: GCATATCAATAAGCGGAGGa.
Because of this, that base gets deleted when I do a trimmed conversion from SFF to FASTQ, and the read is then lost when my software looks for primers and barcodes. This has been happening for several 454 runs over the last couple of months.
I can of course easily work around this, but I would still like to know what is causing it. Does anybody have an idea?
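To make the symptom easy to check on other reads, here is a minimal Python sketch. It assumes that lowercase letters in the untrimmed output mark the clipped bases; the function names `split_by_case` and `primer_clip_status` are made up for this illustration and are not part of any Roche tool.

```python
def split_by_case(read):
    """Split a read into (left_clip, kept, right_clip).

    Assumption: lowercase letters mark clipped bases, uppercase letters
    mark the bases that survive a trimmed conversion.
    """
    upper_idx = [i for i, c in enumerate(read) if c.isupper()]
    if not upper_idx:
        return read, "", ""
    start, end = upper_idx[0], upper_idx[-1] + 1
    return read[:start], read[start:end], read[end:]


def primer_clip_status(read, primer):
    """Locate the primer case-insensitively and count how many of its
    trailing bases are lowercase (i.e. would be clipped).

    Returns (position, matched_text, n_clipped) or None if not found.
    """
    pos = read.upper().find(primer.upper())
    if pos == -1:
        return None
    found = read[pos:pos + len(primer)]
    n_clipped = len(found) - len(found.rstrip("acgtn"))
    return pos, found, n_clipped


if __name__ == "__main__":
    # Shortened stand-in for a read; the primer tail is taken from the post.
    read = "tcagACGTGCATATCAATAAGCGGAGGactgagnn"
    primer = "GCATATCAATAAGCGGAGGA"
    print(split_by_case(read))
    print(primer_clip_status(read, primer))
```

Running `primer_clip_status` over all reads and tallying `n_clipped` should show whether it is always exactly one primer base that ends up in the clipped region.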
What tool do you use to convert the SFF to FASTA?
Very good question - if the conversion tool had an off-by-one error, that would explain the 'extra' base being clipped.
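To illustrate what such an off-by-one error would look like, here is a rough Python sketch. It assumes, as far as I understand the SFF layout, that clip positions are stored as 1-based inclusive coordinates with 0 meaning "no clip". Both functions are hypothetical and are not taken from any real converter.

```python
def apply_clip(bases, clip_left, clip_right):
    """Trim a read using 1-based inclusive clip coordinates.

    Assumption: 0 means 'no clip set' on that side.
    """
    start = clip_left - 1 if clip_left > 0 else 0
    end = clip_right if clip_right > 0 else len(bases)
    return bases[start:end]


def apply_clip_off_by_one(bases, clip_left, clip_right):
    """Hypothetical buggy variant that treats clip_right as exclusive,
    so the last good base is dropped."""
    start = clip_left - 1 if clip_left > 0 else 0
    end = clip_right - 1 if clip_right > 0 else len(bases)
    return bases[start:end]


if __name__ == "__main__":
    bases = "tcagACGTAactg"
    print(apply_clip(bases, 5, 9))             # keeps the final A
    print(apply_clip_off_by_one(bases, 5, 9))  # loses the final A
```

If the clip values stored in the SFF file already exclude the last primer base, the converter is not at fault and the clipping was decided upstream.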
I'm using the sffinfo tool provided by Roche.
Have you asked your sequence provider if they know about this issue?