I am currently analyzing a set of sequences obtained from a 454 pyrosequencing run. The sequences sent for sequencing were amplicons resulting from a subtraction experiment aiming to target differentially methylated fragments.
Some of the reads are chimeric: some form of adapter-to-adapter ligation has concatenated two sequences together, i.e.:
Adapter-(Seq-1-NNNNNNNN)-Adapter-(Seq-2-NNNNNNN)-Adapter
The tools I've found for identifying chimeric sequences all seem to require that some kind of "reference sequences" be supplied, which I do not have. I am therefore looking for a tool that would look for mid-sequence adapters and split the reads it identifies accordingly.
Does anyone know of such a tool? If none exist, I will probably code some form of pipeline which BLASTs the adapter sequences and uses the results to split the sequences.
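For illustration, the splitting step of such a pipeline could look roughly like the sketch below. It is only a minimal sketch under several assumptions: it uses a plain sliding-window comparison instead of BLAST, it tolerates substitutions only (no indels, which matters for 454 homopolymer errors), it searches one strand only, and the function names and the adapter sequence are hypothetical placeholders.

```python
def find_adapter_hits(read, adapter, max_mismatches=1):
    """Return start positions where `adapter` occurs in `read`
    with at most `max_mismatches` substitutions (no indels)."""
    hits = []
    n, m = len(read), len(adapter)
    i = 0
    while i <= n - m:
        mismatches = sum(1 for a, b in zip(read[i:i + m], adapter) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
            i += m  # jump past this hit so overlapping hits are not reported
        else:
            i += 1
    return hits


def split_on_adapter(read, adapter, max_mismatches=1, min_len=1):
    """Cut `read` at every adapter hit and return the non-adapter
    fragments that are at least `min_len` bases long."""
    fragments = []
    prev_end = 0
    for start in find_adapter_hits(read, adapter, max_mismatches):
        if start - prev_end >= min_len:
            fragments.append(read[prev_end:start])
        prev_end = start + len(adapter)
    if len(read) - prev_end >= min_len:
        fragments.append(read[prev_end:])
    return fragments


# Hypothetical adapter sequence, for demonstration only.
ADAPTER = "CTGAGACTGC"
chimera = ADAPTER + "A" * 12 + ADAPTER + "T" * 12 + ADAPTER
fragments = split_on_adapter(chimera, ADAPTER)
```

Here `fragments` holds the two inserts with all three adapter copies removed. A read with no adapter hit comes back as a single fragment, so the same function can be run over every read in the .fna file.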
My 454 data is not paired-end, and I do not have access to Roche's software. Sequencing was performed off-facility, and all I received was an .sff file along with .fna and .qual files.
Not sure. But 454 paired-end sequencing does read both ends in a single read, right? Was the 454 data paired-end? If so, the splitting should have been done automatically for you by the 454 data provider, but you should most likely be able to do it yourself with the Roche software.