5.9 years ago
sunnykevin97
Hi, I have CDS sequences for several genome datasets.
How can I convert FASTA to a .nuc format file (used for running PAML)?
FASTA format:
>chr9:94400443-94400608
CTGCTGCCCCACTTGATGCCCAGCTCCTGGCCATAGTTGTCCCCATACGAGACCAGCAGCTCACAGCCAGGCTGGATGGTGCGGCAAGTCCAGTAGAAGATCTGCCTGTGGTACTGAAAGGCCACTAGGTTCTGCTCCTCCTTGTCCCGGGTGCAGTTCACATAC
>chr9:94401663-94401731
CTCATCTGGTTGGCCCAAGGCTCGTCCTTTCCATCCACATACTTATAGCAGTTCCTCCTCTTGGTGAT
>chr9:94402051-94402320
CAGCCAGGAGTAGCCACTGTTGGCTGCCTCGTCTTCTGTCACCTGGCCCTTATAGGGGCCAAAGTGCAGACCCAGTAGCTGCTCGGACACCTCATACCATACTCTGAGCCCAGCCTCAGGGATGCTGGATGGCCCAATTCTCAGTCCAGGGGGCAGAGTGAGGGCTGAGTGATTGGGGTGCCCCCTGTCTACAGGGCTGTCCTTTACAAACATTGGGGCCCTATGAACTGCACAGGTGTCGATGAAAAAGTTCTGGCACTTCTCACAAT
>chr9:94402490-94402563
AAAGGTAGTCATCATTCTGGGGCTCACCGACCTCCTGGTACTCGAGGCCCTTTCTTTCTCGCAGGCTGTACAT
.nuc format:
head brown.nuc
5 895
Human
AAGCTTCACCGGCGCAGTCATTCTCATAATCGCCCACGGACTTACATCCTCATTACTATT
CTGCCTAGCAAACTCAAACTACGAACGCACTCACAGTCGCATCATAATCCTCTCTCAAGG
ACTTCAAACTCTACTCCCACTAATAGCTTTTTGATGACTTCTAGCAAGCCTCGCTAACCT
Monkey
CGCCTTACCCCCCACTATTAACCTACTGGGAGAACTCTCTGTGCTAGTAACCACGTTCTC
CTGATCAAATATCACTCTCCTACTTACAGGACTCAACATACTAGTCACAGCCCTATACTC
CCTCTACATATTTACCACAACACAATGGGGCTCACTCACCCACCACATTAACAACATAAA
gorilla
ACCCTCATTCACACGAGAAAACACCCTCATGTTCATACACCTATCCCCCATTCTCCTCCT
ATCCCTCAACCCCGACATCATTACCGGGTTTTCCTCTTGTAAATATAGTTTAACCAAAAC
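For reference, the .nuc layout shown above is a header line with the number of sequences and the number of sites, followed by each name on its own line and then its sequence in wrapped lines. A minimal Python sketch of the conversion follows. It assumes the input FASTA already holds one aligned sequence per species, all of the same length, which is not the case for the per-exon records in the FASTA example. The function names `read_fasta` and `to_nuc` and the 60-column wrap are illustrative choices, not part of PAML.

```python
import io


def read_fasta(handle):
    """Return a list of (name, sequence) pairs read from a FASTA handle."""
    records, name, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            name = (line[1:].split() or ["unnamed"])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_nuc(records, width=60):
    """Format aligned (name, sequence) pairs in the sequential .nuc layout."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    length = lengths.pop()
    # Header: number of sequences, then number of sites.
    lines = [f"{len(records)} {length}"]
    for name, seq in records:
        lines.append(name)
        lines.extend(seq[i:i + width] for i in range(0, length, width))
    return "\n".join(lines) + "\n"


# Tiny made-up alignment, used only to show the output layout.
demo = io.StringIO(">Human\nATGAAA\nCCC\n>Monkey\nATGAAGCCC\n")
nuc_text = to_nuc(read_fasta(demo), width=6)
print(nuc_text, end="")
```

For real data, open the aligned FASTA file in place of the `io.StringIO` demo and write `nuc_text` to a file such as `out.nuc` (a placeholder name).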
What have you attempted so far?
I called variants from aligned BAM files. Based on the reference genome annotation, I modified the FASTA sequences for each genome and generated CDS for all genomes (~18). What would be the next step?
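One possible check before building the alignment, assuming the goal is a codon-based analysis: codon models generally expect each CDS to have a length that is a multiple of three and to contain no stop codons inside the reading frame. The sketch below flags sequences that break either rule. The helper name `check_cds` is hypothetical, and the stop-codon set assumes the standard genetic code.

```python
# Stop codons of the standard genetic code (an assumption; adjust for
# mitochondrial or other codes).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(seq):
    """Return a list of problems found in one CDS (empty list if none)."""
    problems = []
    seq = seq.upper()
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The final codon may legitimately be a stop, so only earlier ones count.
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"internal stop codon {codon} at codon {index}")
    return problems


# Made-up sequences, used only to show what the check reports.
print(check_cds("ATGAAATAA"))
print(check_cds("ATGTAAGGG"))
```

A terminal stop codon passes this check but typically still has to be trimmed from every sequence before a codon-model run.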