How To "Assemble" Proteins Using Amino Acids, How Do I Align Fragments Of A Protein To Get The Best Overall Protein?
12.7 years ago
Gimly_Gloin ▴ 70

I've just been attempting a frameshift correction of my DNA sequences using fasty36. The output is a set of protein sequences, hopefully in the correct frame. I have a huge number of duplicates. I've tried to reduce them by clustering, which has worked somewhat, but I still have a large number of sequences where the hits to my reference sequences give me different translations for the same read. (My reference, the protein database I used, seems to have its own ambiguities/frameshifts that change the protein sequence in places. I could scrutinize this, but I don't want to, because these may be biologically significant.)

I want to align the candidate translations of each read (now that I have many possibilities for the same read) against each other AND choose the best fit to the reference sequences (well over 2000 sequences for the MSA). So I was wondering if there is a way to align them end to end, similar to a DNA assembly?
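To make "end to end" concrete, here is a minimal sketch of the kind of merge I have in mind, assuming exact suffix/prefix matches between two protein fragments. The function names and the `min_overlap` threshold are hypothetical, and a real solution would need to tolerate mismatches and score against the reference:

```python
def overlap_len(a: str, b: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `a` that exactly equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_overlap` residues exists.
    `min_overlap` is an illustrative threshold, not a recommended value.
    """
    max_len = min(len(a), len(b))
    # Try the longest possible overlap first, then shrink.
    for n in range(max_len, min_overlap - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge_pair(a: str, b: str, min_overlap: int = 3):
    """Join `a` and `b` end to end if they overlap, else return None."""
    n = overlap_len(a, b, min_overlap)
    return a + b[n:] if n else None


if __name__ == "__main__":
    # Toy fragments, not real data.
    print(merge_pair("MKTAYIAK", "YIAKQRQ"))
```

This only handles one pair and exact matches; it is meant to illustrate the question, not to stand in for an assembler.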

In fasty, if I set the e-value cutoff to be very small, I lose a lot of sequences.

So far I've done:

1) Ran fasty36 with my DNA reads against a protein reference sequence

2) Extracted the "corrected" sequences containing "/", "\", "*", "-"...

3) Removed those characters for the next step

4) Ran fasta36 with my translated DNA against my protein reference sequence, using a high e-value to collect the "perfect" alignments (probably an unneeded step)
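For reference, step 3 plus the exact-duplicate removal can be sketched roughly as below. This is a simplified illustration rather than my actual script: the helper names are made up, and it assumes the sequences have already been pulled out of the fasty output as plain strings. As far as I understand the fasty output, "/" and "\" mark frameshifts, "*" a stop, and "-" an alignment gap.

```python
# Marker characters listed in step 2 (assumed meaning: "/" and "\" are
# frameshifts, "*" is a stop codon, "-" is an alignment gap).
MARKERS = "/\\*-"
_STRIP_TABLE = str.maketrans("", "", MARKERS)


def strip_markers(seq: str) -> str:
    """Remove the fasty annotation characters from one sequence."""
    return seq.translate(_STRIP_TABLE)


def dedupe(seqs):
    """Strip markers, then drop empty and exactly duplicated sequences.

    Keeps the first occurrence of each sequence, preserving input order.
    """
    seen = set()
    unique = []
    for raw in seqs:
        cleaned = strip_markers(raw)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique


if __name__ == "__main__":
    # Toy sequences, not real data.
    print(dedupe(["MK/T", "MKT", "A-L"]))
```

Exact-match deduplication like this only removes identical strings; near-duplicates still need clustering.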

Thoughts, advice, and alternative solutions/suggestions would be much appreciated.

fasta alignment reference assembly • 4.0k views