Hello,
Sorry to ask so many questions, but this is related to a problem I am trying to solve with PAML. I have no prior experience with PAML other than the few test analyses I've run, but I would like to use it to calculate pairwise dN/dS ratios for a large number (1000+) of orthologs or gene fragments. I was hoping I could submit a single batch file to PAML containing all the aligned pairs, with the number of sequences and the alignment length given above each pair (PHYLIP format). Because each aligned pair would have its own header information, I thought PAML would give me an output file with omega values for each pair. But so far I have not been able to batch process these orthologs, and it seems I will have to split this massive file into individual files, one per alignment.
The question I am left with is: how are dN/dS ratios usually calculated, in PAML or other programs, by large genome projects that have far too many coding sequence alignments to read in by hand? Once my file is split into individual alignments, I assume I'll have to write some sort of loop, but PAML will require the control file to be updated for each of the 1000 orthologs I wish to analyze. So I am a bit confused about how to analyze such a large amount of data. If anyone has any tips, they would be very useful.
Thanks,
Zach
2 1011
seq11
CCCCGTCTGCTAAATGGGATGAGGATTGCAGCTGCGTAGCCTGATTTTTGTGGGCGAGATAATACATCAGTGAGCAAAGCCTGGCAACATGGTCCTTAAAAGCCAGGAGATCTTTGCT-GCTCGCGAGACACTCGCAGCACACC-GCAGCGTGAGGAAT--AAAAGCGGGCGTGTGGGACTTTCTAGGAGATTTTTCTTGGCAATGAGTCTTGCTTCAGACGTAAAACCGGTAGCTGTGCCACTGGTAGGGCTCCAGGTCGCTGTATGTAGAGCAACGCATGTCTGATGCTCTGACCTACAGGCAAGAAGGAGGAGAGGAACCAGGAAAACCACGTTTGTCTGTACCTTTGGGCCTGCACAGGGCCACATTGATGG-CAACACAG-ACCCGTGTTTCTGGCAGGGCAGAAGCCATGGGTGGGGATTTGCACGA-CGGGAGCTCCAGGCAGTTTAGGATTTGGCACAGCGATCTCAGAGAGGAGAACTCCTCTCCAAGAAGAGCAGCCCTTTGACAAGGTTACCCCTCATTTCCT-CAGTCGAACTGCCCTTTGCAAGGAGCATC-CCTCAGCTGCCACAGCCTATCGCATCTTCCTCCTGAAATCTGATGCTACCCTGCTTTGATGGTGATCAGTTGTTGACTGCAGAAAAGCTGAATCATGAAACAGAGGTTATAAACAAGGTTATTTTTAACGAAGGAAATAAATCCCACATCACGCAGAAGTC-GGAGCAAGGGACTGTCAGCTAGGGCTCGCAGATCTCACTCCTGCCCTGATATTCTCCATGTCCTGGAGCAAATTACAATCTCTCTAACCTTTAATTTACCTGCTTGTAACATGGACAAAGCAGCAGACTCCAGTTGTGCTTTCTTTGTACAGCAAAGCTCCTGGTGTGACAGTAAAGCGGTGCTGGGAGCCGTGCGATGCCA-GGG--G-CT-GC-CTGGC----G-C-C-TTGCAGAGCTATTTTCAGCCACACAAAATGATCGTACACGGTATTTGAACAATGTTCATAGACGTTTTGAATGCAGAGAGAGACTACCCAGTCTGATTGCC-CTCTCCCGTATTTCACAGTCTTACATTTCAATTCCCACATGCAGCCCTGTAACTTCTGCTTCATTTTGCTCACTGAAAACTCAAAACAGACAAGTTTTGTAGCCCAGAGTCTCACG-TCGTATCCTTCAGGCACTGACGTGCTGCAGGGTGTGTGTTGTAGATATCCAGCCCAGGGGGAGGGAAGATGAGTCTGGTACACCTACAGCTA-ACTGGGAAATTTCCTGATTGTGCCACTTCTTATCTCTCTCTCATACCTAAATTCTTCTCA-GAAAAGCGTATTTCATTTTCCCTGGGAAAGATGCCTCAACTCAGGGAAGGAGGCGATTGAGTCTGGGACTGGCCAGGCTTCCCTGTGCTGCCTGAGGAATACGAACAACCCCACAGAGAGACCCTTGCAAACCCATCTTCTCTGCT-CTACGCCCAAGTTAACAAGTGCTTTGGGTGTGTCT-CC-AACTTTACCCAGATAACGACTAAGGTTAAAGAACTCTAGCTTTTTTGAAA-GTATATTGTAAATAT-CTGTTTTTTTAAATATATATACATAAATGTACACACACACATAAATTTATGAAAACTTCAGGAGACTGGAAACTTTTCCTGCCGTACTTTATTTACAGCCCTGCTAA-G-GCTCTTT-GCTGCTGAAGGAGCTTACCTTCCTCGCTCAAGTTTTCTTTGAGCTAAATAGTGATTTCCACAAAGCAAATATCAAATAAAAACCAAGGCATCGATGGATTGGAT-CCAGAGTGGGTTCTGCTCCCTCTGTCCATTCACTGGTGCAAAGCCCTGGCCCACCTTCACCTGCCTGCTGTGGAAGCAGACACCAAACCTGCACACGCTCGGGACACACAGCAGCATT-TTGGCACAGGTGCCCGGGCACCTCCAGCGC-GGCCGTGAGAAGTGGAACAGGACCAACCTTGCAAGCTAGTTCC
GTCTTGCAGGTGCCTGCAAGGACACATGGCTGCGTGCCAGCGGATGGAATAAGCATCGCCCCGTGCTAGTTATGGGGTAGGGATGCGGCAAGGCT--GCGTGAACTAGTGCTTGAGACGCCTCTGTGATCTGTAGATGAAAGGTGATTTACAGCTGCAAATTATTGCTCTTCCT-CATGTGAAGTCTGTTATTTTCTGGTGTCCCTCATTTTGACTCATGCCACGTCCCGTATTTGGCGCGGGGCAGGCTCCTCTCTATG-GGTGGCTGTATAGGTTCTTCACATCTCCAGCTGGCTGCATGCTTTAGAAGCGTGTGATCTGCTCACAGCATCCATTCGAGGCGGCCCGGCAGAGCCCATGCTTGGGTAACTTGGGCTGGGATTTGCACAAAACTGCAGCACTT-AGCAGATTGCTTAGCCGGTGGGTTTAGAGGGCCTGTATTTGAG-GCTGGATACCCAGCTGGGTTGTGCACTCAACTCTGTGAACAGGCAGAGGACCAGCTGGGAGAACCTTAAATCCTGTGGCAGAGAGGAGAAACAGGAATATCCCGAGCCTCTGTGCTTGGGGCTGCCCCAGGGAAGGGGAGGAAGCCAGAGAGCACTAGACTGGGGGCC-CTCAGTCTGCAGCTTCTCATCTGCGTCGCAAACCTCCTGGGCAGGAATTACAAAGCAGGAGAGGTTTGTATCTTTGGCAGGCGTCTGGAGGGAAGGGGTATTAGGGTGGTTTTGTGAACAGCCCATGGACAGCAGAAGGAGCCGGTAGATTCGATGTCCTTCCCGCATTATGCACCATGGCATGCTCACCCCATCAGGCTGCAGAGAGCAAGCGACCTTTTGCTCTTTACCGCTAAATAAAACAGCAGAAAAT-AC-CATGTG-GTATAAGATAGTTATAACGATGGAGGAAGAAATATCCCTGATGCTC
seq12
CCCAGTCTGCTAAACGGCATGAGGATTGCAGCTGTGTAGCCTGATTTTTGCAGGCAACATAATGCAGCAGTTAGCAAACCCTCACAACATGGTCATTAGAAGCCTGGGGAGCTGTGCTGGCT-GTGAGACA--CG--TCACACCAG-AGTGCGAGGAATAAAAAAGTGGGCGTGTGGGACTTTCTAGGAGATTTTTCTTGGCAATGAGTCTTGCTTTAAATGTAGAACTGGTAGCTGTGCCACTGGTAGGGCCCCAGGTCGCCGCATGTAGAGCAGCGCATGTCTGATGCTCTGACCTACAGGCAAGAAGGAGGAGAGGAACCAGGAAAACCACGTTTGTCTGTAGCTTTGGGCCTGCACGGGGCCACATTCATGGTGATGA-GGCACTGGTGTTTCTGGCAGGGCAGAAGCCATGGGTGGGGATTTGCAC-AGCAGGAACTCCAGGTAGTGTAGGATTTGGCGCAGCAATTTCAGACAGGAGAACTCCTCTCCAAAAAGAGCAGCCCTTTGACAAGG-CATCCCTCATTTCCTCCA-TCAAACTGCCCTTTGCAAGGAGCA-CGCCTGAGCTGCAACAGCCTATTGCATCTTCCTCCTGAAATCTGATGCTGCCCTGCTTTGATGGCGATCAGTTGATGACTGCAGAAAAGCTGAATCATGAAAGAGAGGTTATAAACAAGGTTATTTTTAACAAAGGAAATGAATCCCACATCATTCAGAAGTCAGG-GCAAGGGACTGTCAGCTAGGGCTTGCAGGTCCCACTTCTGCCCCGACGTTCTTTATGTCCTGGAGCAAATTAAAATCTCTCTAGCCTTTAATTTACCTCCTTGTAGCATGGACAATGCAGCAGACTCCAGTTGTGCTTTCTTTGTACAGCAAAGCTCCTGGTGTGACAATAAAGCGATGCTGGGAGCTGTGTGATGCCAAGGGCTGGCTTGCACTGCCACCCGGCACCTTGCAGAGCTATTTTCAGCCATGCAAAACGATCGCACGCAGTATTTCAACAATGTTCATAGACATTTCAAATGCAGAGAGAGACTAACCAGTCTGATT-CCTCTCTCCCATATTTCACAGTCTTACATTTCAATTCCCACATGAAGCCCAGTAACTTCTGCTTCATTTTGCTCACTGTACACTCAAAATAGACAAGTTTTGCAACCCAGAGACTCATGCT-GTGTCCTTCAGGCACTGGGGTGCTGCTGA-TGTGA-TA-TAGGTAACCAGCCCAGGGGGAAGGAAGACGGGTCTGGTTTACCTACAGCTACA-TGGGAAATTTCCTGA--GT---A-TACT-A-CTTTCTCGCATT--TAAATTCTTC-CAAGAAAAGCATATTTCATTTTTCCTACAAAAGATGACTCCACTCAGAGAAGCAGGAGATTGAGGCTGGGGCTGGCCAGGCTTCCAAG---------A--AATACGAACAACTCCACATAGAGATCCTTGCAAACCTGTCTTCTCTGCTTC-ATGCCCAAGTTAACAAGTGTTTTGGGTGTGTCTTCCCAACATTACAGAGATAACTACCAAGGTTAAAGAACTCTAGATTTTTTTTTATGT-T-TTGTAAATATTC-GTATTTT-AA-TATATACACATG--TGTA----------T---TT-ACAAAAACTTTAGGAGATTGTAAACTTC-CCTGCCATACTT-ACTCAAAATGCTGTTAAAGAGGACTTTTGCTCCTGGAGGAGCTCACTTTGCTCACTCAAGTTTTCTTTAAG-TAG---G-GACTTCCACACAGAAAATATCAGATAGAACCCAAAGCAGAAATGGGTTGGATGCCA-AGTA--T---G-T---TCTGT---T-C-C-----C----CC-TGGCCCACCTTCACCTGCCTGCTGTGGAAGCAGACACCAAACCTGCACTGGCTTGGGGGACACAGCAG-AGCCTTAGCACAGGTGCCCAGGAACCTCCAGC-CTGGTTGGGAGAAGTGGAACAGGACCAACCCTGCAAGCCAGTTCC
CTCTTGCTGGTGCCTGCAAGGACATGTGGCTGTGTGCCAGCAGATGGAATAACCATTGCCCCGTGCTAGTTATGGGGCAGGGATGCTGCACGGCTCTGC-T-AATTAGTGCCTGAGACGCCTCTGTGATCTGTAGATGAAAGGTGATTTACAGCTGCTAATTATTGCTCT-CCTTCATGTGAAGTCTATTATTTCCTGGTGTCCCTCATTTTGACTCATGCCACATCCCGTATTTGGCACAGGGCAGGTTCCTCA-TA-GAGGTGG-TGC-T---TTG--CA--TCTCCAGCTGCCTGCACACTGAACAAGCCTGAGATCTGCTCTCAGCGTCCATTTGAGGCAGCCTGGCTGTGCCCATGCTTGGGAAACTTGGCCTGGAATTTGCATAAAGTTACAACATTTGAGCAGGT-GCTTAGCTAGTGGATTGAGGGGGC-TGCATTC-AGTGCTGGGTACTCAGCTGGGCTGTGCACTCAACTCTGTGA---G-CAGAGGACCAGCTGGGTGAACC--A------G----AGAG-G-AGAAACAGGAATATCCCGAACTTCTCTGCTCGGGACTGCCCCAGGGAAGGGGAAGAAGCCAGAGAGGACCAGACTAGGG-CCTCTTAGTCTATGGCTTCTCATCTGTGTCCCAAACCTCCTGGGCAGGAGTTACAAAGCAGGAGAGGTTCGTGTCTCTGGCAGGTGTCTGGAGGGAAGGGATATTAACGTGGTTTTCTGAACAGCCCATGGATAGCAGAAGGAACAAGTATTTTCAATGTCCTTCCCATGTTATGCCCCATGGCA--C--AT---A-CAGGTTGCAAA-AGCAAGTGACATTTTGCTCTT-ATTGCAAAATAAAAGAGCACAAAGTGACACATG-GCGTA-AGG-TAGTTGTAACGACAGAGGATGAAATATCCCTGATACTC
2 1015
seq21
GTCCCTGTAGCTTATAGCAAAGCATGGCACTGAAGATGCCAAGACGGTTGCCTTC-ATCATACCCAGGGACAAAAGACTTAGTCCTAACCTTACAGTTAATTCTTGCTAAACATATACATGCAAGTATCCGCGCACCAGTGTAAATGCCCTCAATCTCTTGCTTGCAAGACAAAGGAGCGGGTATCAGGCACACCTGTAATTGAACCGTAGCCCAAGACGCCTTGCTTAGCCACACCCCCACGGGTATTCAGCAGTAGTTAACATTAAGCAATAAGTGTAAACTTGACTTAGTTATAGCAACACTCAGGGTCGGTAAATCTTGTGCCAGCCACCGCGGTCACACAAGAGGCCCAAATTAACCGTATACACGGCGTAAAGAGTGGTACCATGCTATCCCATCAACTAGGATCAAAGTGCAACTGAGCTGTCGTAAGCCCAAGATGCATTAAAAGCCACCCTCAAGACGATCTTAGCACCCCCGATCAATTGAACCCCACGAAAGCTGGGACACAAACTGGGATTAGATACCCCACTATGCCCAGCCCTAAATCTTGATGCTTACCCCACTGAAGCATCCGCCTGAGAACTACGAGCACAAACGCTTAAAACTCTAAGGACTTGGCGGTGCCCCAAACCCACCTAGAGGAGCCTGTTCTGTAATCGATAACCCACGATACACCCAACCGTCCCTTGCCACAGCAGCCTACATACCGCCGTCGCCAGCTCACCTCTACCTGAGAGTGCA-A-CAGTGAGCACAATAGCCCTAC-G-C--CGCTAACAAGACAGGTCAAGGTATAGCTCATGGGGCGGAAGAAATGGGCTACATTTTCTAAG-ATAGAAAACACGAAAAGGGGTATGAAACTACCCCTGGAAGGCGGATTTAGCAGTAAAGCGGGACAATAAAGCCCCCTTTAAGTCGGCCCTGGAGCACGTACATACCGCCCGTCACCCTCCTCATAAGCCCCTATTGCTCATAACTAATACACCTACCAGCTGAAGATGAGGTAAGTCGTAACAAGGTAAGTGTACCGGAAGGTGCACTTAGCACACCAAGATGTAGCTAAACGTAAAGCATTCAGCTTACACCTGAAAGATATCTGCC-TCTTACCGGATCATCTTGAAG-CCAACTCTAGCCCAACCATATTACTAATAGAGCACACCA-AAAAAATCCACTCCACC-ACCAAATTAAAACATTTTTTCCACAACTTAGTATAGGCGATAGAAAAGATACTTTGGCGCTATAGAGATATTTGTACCGCAAGGGAAAGATGAAATAACAATGAAAAACTCAAGCAACAAATAGCAAAGATAAGCCCTTGTACCTTTTGCATCATGATTTAGCAAGAACCACCAAGCAAAATGAATTTTAGCTTGCCACCCCGAAACCTGAGCGAGCTACTTACAAGCAGCTATCCTAGAGCGAACCCGTCTCTGTTGCAAAAGAGTGGGAAGACTTGCCAGTAGAGGTGAAAAGCCTACCGAGCCAGGTGATAGCTGGTTGCCTGTGAAACGAATCTAAGTTCCCTCTTAATTTTCCTCTACGGACCCCACCCAACCCCCAACGTAGTGAATCAAGAGCTATTTAAAGGGGGTACAGCCCCTTTAAAGAAGGACACGCCTTCCCTAGCGGATAACTTACCCAACCCCACCCCCTAAACTTGTAGGCCCTTAAGCAGCCATCAGCAAAGAGTGCGTCAAAGCTCCACAC--CCCAAAAATCTGAAGACTGTACGACTCCCTTACCACCAACAGGCCAACCTATAACAATAGAAGGATTAATGCTAAAATAAGTAACTAGGGCCTCTCACCCTCTCAGGCGCAAGCTTACATGATTCCATTATTAACAGGCTAACTAATACCGCAACTTTGACAAGACAAAATATTGAACCCGTC-CTGTTAACCCAACTCAGGAGCGCCCATAAGAAAGATTTAAATCTGCAGAAGGAACTAGGCAAACCCAAGGCCCGACTGTTTACCAAAAACATAGCCT
TCAGCCAACCAAGTATTGAAGGTGATGCCTGCCCAGTGACCCCACGTTCAACGGCCGCGGTATCCTAACCGTGCGAAGGTAGCGCAATCAATTGTCCCATAAATCGAGACTTGTATGAATGGCTAAACGAGGTCTTAACTGTCTCCTGTAGATAATCAGTGAAATTGATCTTCCTGTGCAAAAGCAGGAATAGGCACATAAGACGAGAAGACCCTGTGGAACTTAAAAATCAGCGGCCACCACACATTTA-ACTCCTAAGCCTACTAGGCCCGCACACCCCC-TCCAAACACTGGCCCGCATTTTTCGGTTGGGGCGACCTTGGAGAAAAACGAATCCTCCAAAAATAAGACCACACCTCTTAACCAAGAGCAACATCTCAACGTACCAACAGTAACCAGACCCAGCACAAGCCTGACTAATGGACCAAGCTACCCCAGGGATAACAGCGCAATCTCCTTCAAGAGCCCATATCGACGAGGAGGTTTACGACCTCGATGTTGGATCAGGACATCCTAATGGTGCAGCCGCTATTAAGGGTTCGTTTGTTCAACGATTAACAGTCCTACGTGATCTGAGTTCAGACCGGAGCAATCCAGGTCGGTTTCTATCTATGAC-GAACTTTTCCTAGTACGAAAGGACCGGAAAAGTAGAGCCAATACTACAAGCATGCCCTCCCTCTAAGCAGTGAATCCAACTAAACTGCCAAAAGGACACCCACAA-CCCC-TACATCCTAGAAAAGGACCGCTAGCGTGGCAGAGCTCGGCAAATGCAAAAGGCTTAAGCCCTTTACCCA
seq22
GTCCCTGTAGCTTACAGCAAAGCATGGCACTGAAGATGCCAAGACGGTTGTC-TCTATCATACCCAAGGACAAAAGACTTAGTCCTAACCTTACAGTTAATTCTTGCTAGACATATACATGCAAGTATCCGCGCACCAGTGTAAATGCCCTCAATCTCTTGCTTGCAAGACAAAGGAGCGGGCATCAGGCACACCCATGATTAAATCGTAGCCCAAGACGCCTTGCTTAGCCACACCCCCACGGGTATTCAGCAGTAATTAACATTAAGCAATAAGTGTAAACTTGACTTAGTTATAGCAGCCCTTAGGGTCGGTAAATCTTGTGCCAGCCACCGCGGTCACACAAGAGACCCAAATTAACTGTA-ATACGGCGTAAAGAGTGGCATCATGTTATCCCACCAACTAAGATCAAAGTGCAACTGAGCTGTCACAAGCCCAAGATGCATTAAAAACCACCCTCAAGACGGTCTTAGCACTCACGATCGATTGAATCCCACGAAAGCTGGGGCACAAACTGGGATTAGATACCCCACTATGCCCAGCCCTAAATCTTGATGCTTACCCTACTGAAGCATCCGCCTGAGAACTACGAGCACAAACGCTTAAAACTCTAAGGACTTGGCGGTGCCCCAAACCCACCTAGAGGAGCCTGTTCTATAATCGATAACCCACGATACACCCAACCATCCCTTGCCACAGCAGCCTACATACCGCCGTCGCCAGCTCACCTCTACCTGAGA--GCATAGCAGTGAGCGCAATAGCCCAACAGACATCGCTAACAAGACAGGTCAAGGTATAGCCCATGGGACGGAAGAAATGGGCTACATTTTCT-AGAATAGAAAACACGAAAAGGGGTGTGAAACTACCCCTGGAAGGCGGATTTAGCAGTAAAGCGGGACAATAAAGCCCCCTTTAAGTTGGCCCTGGGGCACGTACATACCGCCCGTCACCCTCCTCATAAGCCCCCATTACTTATAACTAATACATTTACAAGCTGAAGATGAGGTAAGTCGTAACAAGGTAAGTGTACCGGAAGGTGCACTTAGCACACCAAGATGTAGCTAAACATAAAGCATTCAGCTTACGCCTGAAAGATATCTACCATC-TATCGGATCATCTTGAAGCCCAACTCTAGCCCGACCATATCAATAA-CGAG-ACA-CACTAAGAAGCTACTCC-CCTACCAGATTAAACCA-TTTTTCCACAACTTAGTATAGGCGATAGAAAGGACACTTTGGCGCGATAGAGATATCTGTACCGCAAGGGAAAAATGAAATAATAATGAAAAACTCAAGCAACAAACAGCAAAGATAAACCCTTGTACCTTTCGCATCATGATTTAGCAAGAACAACCAAGCAAAATGAATTTTAGCCTGCCATCCCGAAACCTGAGCGAGCTACTTACAAGCAGCTACCCCAGAGCGAACCCGTCTCTGTTGCAAAAGAGTGGGAAGACTTGCCAGTAGAGGTGAAAAGCCTACCGAGCCAGGTGATAGCTGGTTGCCTGTGAAATGAATCTAAGTTCCCTCTTAATTTTCCTCTACGGAGCCCACCTAA-CCCCAACGTAGTGAATCAAGAGCTATTTAAAGGGGGTACAGCCCCTTTAAAAAAGGACACACCTCCCCTAGCGGATAA-TTACCCAACCTTACGTCCT-AACTTGTAGGCCCTTAAGCAGCCACCAGCAAAGAGTGCGTCAAAGCTCCACACATCAAAAAAATCTGAAAACCACATGACTCCCTTACCACTAACAGGCCAACCTATAACAATAGGAGAATCAATGCTAGAATAAGTAACTAGGGCCCCTCACCCTCTCAGGCGCAAGCTTACATCATTATATTATTAACAGACCAACTAATACCACAACTTTAACAAGATAGAATATTAAACCC-ACTCTGTTAACCCAACCCAGGAGCGCCCATAAGAAAGATTTAAATCTACAAAAGGAACTAGGCAAACCCAAGGCCCGACTGTTTACCAAAAACATAGCCT
TCAGCCAACCAAGTATTGAAGGTGATGCCTGCCCAGTGACCCCACGTTTAACGGCCGCGGTATCCTAACCGTGCGAAGGTAGCGCAATCAATTGTCCCATAAATCGAGACTTGTATGAATGGCTAAACGAGGTCTTAACTGTCTCCTGTAGATAATCAGTGAAATTGATCTTCCTGTGCAAAAGCAGGAATAAACACATAAGACGAGAAGACCCTGTGGAACTTAAAAATCAGCAGCCACCACACAAC-AGACTCCCAAGCCTACCAGGCCCACATACCCCCCTCCAAACACTGGCCTGCATTTTTCGGTTGGGGCGACCTTGGAGAAAAACGAATCCTCCAAAAACAAGACCACACCTCTTAACCAAGAGCAACACCTCGACGTACTAACAGTACCCAGACCCAGCACAAGTCTGACCAATGGACCAAGCTACCCCAGGGATAACAGCGCAATCTCCTTCAAGAGCCCATATCGACAAGGAGGTTTACGACCTCGATGTTGGATCAGGACATCCTAATGGTGCAGCCGCTATTAAGGGTTCGTTTGTTCAACGATTAACAGTCCTACGTGATCTGAGTTCAGACCGGAGCAATCCAGGTCGGTTTCTATCTATGACAGA-CTTTTCCTAGTACGAAAGGACCGGAGAAGTAGGGCCAATGCTGCAGGTACGCCCTCCCCC-AAGCAATGAATCCAACTAAACCGCTAAAAGGACACACATAAACCCCGTACATCCTAGAAAAGGATCGCTAGCGTGGCAGAGCTCGGCAAATGCAAAAGGCTTAAGCCCTTTACCCA
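
For reference, the split-and-loop approach described above might look roughly like the following. This is only a minimal sketch under several assumptions: the combined file is laid out like the sample data above (a "number-of-sequences length" header line starting each pair), codeml is run in pairwise mode (runmode = -2), and the codeml executable is on the PATH. The function names, the pair_0001-style file names, and the control-file values are placeholders chosen for illustration, not recommended settings.

```python
import re
import subprocess
from pathlib import Path

# A PHYLIP header line holds exactly two integers, e.g. "2 1011".
HEADER = re.compile(r"^\s*\d+\s+\d+\s*$")

# Illustrative codeml control file; values are placeholders, not advice.
# In PAML control files, "*" starts a comment.
CTL_TEMPLATE = """\
      seqfile = {seqfile}
      outfile = {outfile}
      runmode = -2   * pairwise comparison
      seqtype = 1    * codon sequences
    CodonFreq = 2
        model = 0
      NSsites = 0
        icode = 0
    fix_kappa = 0
        kappa = 2
    fix_omega = 0
        omega = 0.5
"""


def split_alignments(text):
    """Return one string per alignment, each starting at its header line."""
    blocks, current = [], []
    for line in text.splitlines():
        if HEADER.match(line):
            if current:
                blocks.append("\n".join(current) + "\n")
            current = [line]
        elif current and line.strip():
            # Lines before the first header and blank lines are skipped.
            current.append(line)
    if current:
        blocks.append("\n".join(current) + "\n")
    return blocks


def render_ctl(seqfile, outfile):
    """Fill the control-file template for one alignment."""
    return CTL_TEMPLATE.format(seqfile=seqfile, outfile=outfile)


def write_jobs(text, outdir):
    """Write one .phy and one .ctl file per alignment; return the .ctl paths."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    ctl_paths = []
    for i, block in enumerate(split_alignments(text), start=1):
        stem = f"pair_{i:04d}"
        (outdir / f"{stem}.phy").write_text(block)
        ctl = outdir / f"{stem}.ctl"
        ctl.write_text(render_ctl(f"{stem}.phy", f"{stem}.out"))
        ctl_paths.append(ctl)
    return ctl_paths


def run_codeml(ctl_paths, exe="codeml"):
    """Run codeml once per control file (assumes `exe` is on the PATH)."""
    for ctl in ctl_paths:
        subprocess.run([exe, ctl.name], cwd=ctl.parent, check=True)
```

With a combined file called, say, all_pairs.phy (a made-up name), this would be used as run_codeml(write_jobs(Path("all_pairs.phy").read_text(), "jobs")). The omega values would then still have to be collected from each .out file, which is not shown here.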