8.7 years ago
matthew.m.hernandez
I am looking for a way to create a consensus sequence in which each individual sequence is weighted by how often it occurs for a given tag. For example, I have the following FASTA sequences in one file for a given tag (i.e. CTAGGCAC). The number in each header is that sequence's occurrence count, and the final record (>Consensus) is the result I would like to obtain.
>A_2795
TCAGAAAGAACCTC
>B_10
TCAGAAAGCACCTC
>C_3
TCTGAAAGCACTTC
>Consensus
TCAGAAAGAACCTC
Doing this manually is not a problem. However, it would be great to be able to use Perl or bash (including bio packages) to create the consensus based on the occurrence count of each sequence. I would appreciate anyone's thoughts or help.
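To make the idea concrete, here is a minimal sketch in Python (not Perl or bash) of an occurrence-weighted, column-by-column majority vote. It assumes that every header ends in `_<count>`, that all sequences are already aligned and of equal length, and that ties can go to whichever base was seen first. The function names `parse_fasta` and `weighted_consensus` are made up for illustration.

```python
from collections import Counter


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def weighted_consensus(records):
    """Return the per-column majority base, weighting each sequence by its count.

    Assumption: each header looks like NAME_COUNT (e.g. A_2795) and all
    sequences have the same length, so no alignment step is needed.
    """
    records = list(records)
    if not records:
        raise ValueError("no sequences given")
    length = len(records[0][1])
    columns = [Counter() for _ in range(length)]
    for header, seq in records:
        if len(seq) != length:
            raise ValueError(f"sequence {header!r} has a different length")
        # The occurrence count is the part after the last underscore.
        weight = int(header.rsplit("_", 1)[1])
        for i, base in enumerate(seq):
            columns[i][base] += weight
    # most_common(1) picks the heaviest base; ties go to the first one seen.
    return "".join(col.most_common(1)[0][0] for col in columns)


example = """\
>A_2795
TCAGAAAGAACCTC
>B_10
TCAGAAAGCACCTC
>C_3
TCTGAAAGCACTTC
"""

print(weighted_consensus(parse_fasta(example)))  # prints TCAGAAAGAACCTC
```

With the three records above, sequence A outweighs B and C at every position, so the result is identical to A. If the sequences differed in length, they would first have to be aligned, which this sketch does not attempt.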
Thanks!
My apologies for being long overdue, but thanks Pierre! This is exactly what I was looking for.