Question

Manual Assembly and Protein Translation, HELP, assignment revision

11 months ago

samRayne ▴ 20

Question 9.1 The following short DNA sequences have all been derived from the same longer contiguous stretch of sequence (or "contig"):

TGCATGATGG

ATGCGCTGC

ATGATGGATACCCC

Assemble them into the original contig.

Worked this part out (my alignment for genome assembly)

......TGCATGATGG

ATGCGCTGC

.........ATGATGGATACCCC

Result -> ATGCGCTGCATGATGGATACCCC
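A minimal sketch of how this manual assembly could be checked in code. It looks for the longest suffix of one sequence that matches a prefix of the next, then tries every read order until all neighbours overlap. The function names and the minimum overlap of 3 bases are assumptions chosen for this toy example; brute-forcing every order is only practical for a handful of reads.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if none)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(reads, min_len: int = 3):
    """Try every read order; return the first contig where each step overlaps."""
    for order in permutations(reads):
        contig = order[0]
        for read in order[1:]:
            k = overlap(contig, read, min_len)
            if k == 0:
                break  # this order does not chain together, try the next one
            contig += read[k:]  # append only the non-overlapping tail
        else:
            return contig
    return None


reads = ["TGCATGATGG", "ATGCGCTGC", "ATGATGGATACCCC"]
contig = assemble(reads)
print(contig)
```

The order it finds is ATGCGCTGC, then TGCATGATGG (3-base overlap, TGC), then ATGATGGATACCCC (7-base overlap, ATGATGG).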

These additional sequences have been derived from a larger contig that incorporates the original contig.

GGTCGCTTCGCGGCC

GCGGCCGCTAATCGGGG

Question 9.2

Assemble them into the larger sequence. (This part I am quite stuck on and is holding me back a little)

GGTCGCTTCGCGGCC

.........GCGGCCGCTAATCGGGG

I think the alignment is there which results in the sequence: GGTCGCTTCGCGGCCGCTAATCGGGG

This then leads on to the assembly of the larger contig. From my background reading, I think the new sequence has to be reverse complemented before it will overlap the original contig?

ATGCGCTGCATGATGGATACCCC

...................CCCCGATTAGCGGCCGCGAAGCGACC (reverse complement)

Final sequence : ATGCGCTGCATGATGGATACCCCGATTAGCGGCCGCGAAGCGACC
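A rough sketch of the orientation check, assuming the new sequence may have been read from either strand. It tries the sequence as given and as its reverse complement, on either end of the contig, and keeps the first that overlaps. The helper names and the 4-base minimum overlap are assumptions for illustration; an overlap that short would be weak evidence in real data.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap(a: str, b: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if none)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def extend(contig: str, seq: str, min_len: int = 4):
    """Try `seq` in both orientations on both ends of `contig`."""
    candidates = (("forward", seq), ("reverse complement", reverse_complement(seq)))
    for orientation, cand in candidates:
        k = overlap(contig, cand, min_len)  # cand extends the right end
        if k:
            return orientation, contig + cand[k:]
        k = overlap(cand, contig, min_len)  # cand extends the left end
        if k:
            return orientation, cand + contig[k:]
    return None, contig


original = "ATGCGCTGCATGATGGATACCCC"
new_seq = "GGTCGCTTCGCGGCCGCTAATCGGGG"
orientation, final = extend(original, new_seq)
print(orientation)
print(final)
```

Here the forward orientation finds no overlap of 4 or more bases on either end, while the reverse complement starts with CCCC, which matches the end of the original contig.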

How does this task differ from the first task?

The task in question 9.1 involves assembling short DNA sequences into the original contig, which is essentially reconstructing the original contiguous stretch of DNA from fragmented sequences. This requires aligning the sequences based on overlapping regions and identifying the correct order to reconstruct the original sequence.

In contrast, question 9.2 involves assembling additional sequences into a larger contig that incorporates the original contig. This task builds upon the first one by introducing additional sequences that extend beyond the boundaries of the original contig. Here, the challenge is to align and integrate these new sequences with the existing contig, considering potential overlaps and ensuring continuity in the final assembled sequence.

Furthermore, question 9.2 requires considering reverse complementation, which adds another layer of complexity. Sequencing reads can come from either strand of the DNA, so a sequence read from the opposite strand will only overlap the contig after it has been reverse complemented. Here the merged new sequence ends in GGGG, and its reverse complement begins with CCCC, which matches the end of the original contig.

Overall, while both tasks involve sequence assembly, question 9.2 introduces the complexity of extending the contig with additional sequences and potentially involves reverse complementation to achieve accurate alignment and assembly.

Translate the DNA sequence into a protein sequence

Sequence(3s): ATG CGC TGC ATG ATG GAT ACC CCG ATT AGC GGC CGC GAA GCG ACC

Codons: ATG CGC TGC ATG ATG GAT ACC CCG ATT AGC GGC CGC GAA GCG ACC

Amino Acids: M R C M M D T P I S G R E A T

Reading frame(expasy) : Met R C Met Met D T P I S G R E A T - 5' to 3' reading frame 1

What do you find that is notable?

The sequence begins with the start codon "ATG", encoding methionine (M), which initiates protein synthesis. The 45 bases of the final sequence split into exactly 15 codons in frame 1, and none of them is a stop codon (TAA, TAG or TGA), so the reading frame stays open to the end of the contig.
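The frame 1 translation can be cross-checked with a short script. This is a minimal sketch using the standard genetic code (NCBI translation table 1) built from its usual TCAG-ordered string; the function names are assumptions, and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq: str, frame: int = 0):
    """Split `seq` into complete triplets starting at offset `frame` (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def translate(seq: str, frame: int = 0) -> str:
    """Translate every complete codon in the chosen frame, one letter per residue."""
    return "".join(CODON_TABLE[c] for c in codons(seq, frame))


seq = "ATGCGCTGCATGATGGATACCCCGATTAGCGGCCGCGAAGCGACC"
triplets = codons(seq)
protein = translate(seq)
print(len(seq), len(triplets))
print(" ".join(triplets))
print(protein)
```

Splitting the sequence programmatically is a useful check on a hand split, since a single inserted or dropped base shifts every downstream codon. Passing `frame=1` or `frame=2` gives the other two forward frames.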

My questions: Is my methodology correct in the way I have approached this? What reason would I use to justify the decision for reverse complementing in the 2nd question (as I initially guessed since some old assemblers used to do this... I think)? What other notable distinctions can be made from the AA sequence?

sequence university assembly protein genomics • 484 views

updated 11 months ago by Philipp Bayer 8.8k • written 11 months ago by samRayne ▴ 20

score 0 · Answer 1 · 2024-05-13

You asked the same question before, and I told you about the reverse complementing (Contig assembly task, errors). Now you're back, having incorporated my answer into the question without linking to my previous answer (or me). It also switched from an 'amplitude test' to an 'assignment', so it was homework from the start.

This is very strange.