I have a dna.fasta file that contains 6 sequences. I need to write a program that prints the following information for each sequence: name, description, length, and the reverse complement of the sequence.
One sequence, with gene ID CUL3B, also needs to be translated into its protein sequence and printed on the screen.
DNA.FASTA:
>gi|334186320|ref|NM_001203732.1| Arabidopsis thaliana cullin 1 (CUL1) mRNA, complete cds
AAAAGGTAAGAAATTAAATTTGTTTAATATTATACTAATATATTAGTTGACATAAAATAAAAATAATTAA
ATCTATAAGACCTGAAAATAATAGAATAGAAGAGAACGAGTTTGTTTCCTGAGATATCTTTTCACCGCCC
TTCTCATAGCAGCTCTCGCCGATCCTGTGTAATCGGGAATCGGTAATCGTCAGGGTCTGTTTGGGGTTGA
GGAGAGCTTGAAAAGTTGTTTGTCAACATGGAGCGCAAGACTATTGACTTGGAGCAAGGATGGGACTATA
TGCAGACTGGAATTACTAAGCTGAAACGGATTCTGGAAGGATTGAATGAGCCAGCATTTGACTCTGAGCA
ATATATGATGCTTTATACGACTATCTACAACATGTGCACCCAGAAACCTCCTCATGATTACTCACAGCAG
CTTTATGACAAGTATCGGGAAGCATTTGAGGAATATATTAACTCAACAGTTTTGCCTGCGTTGAGGGAGA
AGCATGATGAGTTTATGCTGCGGGAGCTATTTAAAAGATGGTCAAACCATAAAGTAATGGTCAGATGGCT
ATCCCGCTTCTTCTACTACCTTGACCGTTACTTCATTGCTCGGAGATCACTTCCACCACTGAATGAAGTT
GGCCTGACATGCTTCCGTGACCTGGTTTATAATGAGCTACATTCTAAGGTCAAACAAGCTGTAATAGCTC
TTGTTGATAAAGAACGGGAGGGCGAGCAGATTGATAGGGCCCTGCTGAAAAACGTATTAGATATCTATGT
AGAGATTGGAATGGGGCAGATGGAGAGGTATGAAGAAGATTTTGAAAGCTTCATGCTTCAAGATACTTCT
TCGTATTATTCTCGCAAGGCATCAAGCTGGATTCAGGAAGATTCTTGCCCTGATTACATGTTGAAGTCTG
AAGAATGTCTAAAGAAGGAGAGGGAGAGAGTGGCTCACTACCTACACTCAAGCAGTGAGCCAAAGCTGGT
TGAGAAAGTACAACATGAATTGCTGGTTGTGTTTGCAAGTCAGCTTCTAGAAAAAGAACACTCAGGGTGC
CGTGCATTGCTAAGAGATGACAAGGTGGATGATCTCTCCAGGATGTACAGGCTTTACCATAAAATTTTGC
GAGGCTTGGAACCTGTTGCAAACATCTTTAAGCAGCATGTCACAGCAGAGGGTAACGCTCTTGTCCAACA
GGCCGAAGACACGGCTACTAATCAGGTTGCAAATACTGCTAGCGTCCAGGAACAGGTTCTTATCAGAAAA
GTGATTGAACTTCATGATAAATACATGGTATATGTCACCGAGTGTTTCCAGAACCACACCCTCTTCCATA
AGGCTTTGAAAGAGGCATTTGAGATTTTTTGTAACAAAACGGTTGCTGGAAGTTCAAGTGCAGAACTACT
TGCAACATTTTGCGACAATATTCTCAAAAAGGGGGGAAGTGAAAAGCTGAGTGATGAAGCTATCGAAGAT
ACGCTTGAGAAGGTTGTCAAATTGCTTGCATACATAAGTGACAAGGATCTTTTCGCTGAGTTCTACAGGA
AGAAGCTGGCCCGTAGGCTCTTATTTGATCGCAGTGCTAATGATGATCATGAGAGAAGTATCCTGACAAA
GCTCAAGCAACAATGTGGTGGACAGTTTACTTCTAAGATGGAGGGCATGGTGACGGATTTGACACTGGCA
AGAGAAAACCAAAACAGTTTCGAGGATTATCTAGGCAGTAACCCTGCTGCAAACCCAGGGATTGACTTGA
CCGTCACTGTTCTTACCACTGGTTTCTGGCCAAGTTACAAATCATTTGACATAAATCTACCCAGTGAAAT
GATCAAGTGTGTTGAAGTCTTCAAAGGGTTTTATGAAACGAAAACGAAACACAGGAAGCTTACGTGGATC
TATTCACTGGGAACTTGTCACATAAACGGGAAGTTTGATCAAAAGGCCATCGAGTTAATAGTGTCTACTT
ACCAGGCTGCTGTGCTTCTACTCTTTAACACAACTGACAAGTTAAGTTACACTGAGATCTTGGCTCAACT
GAACCTAAGCCATGAAGATCTAGTTAGGTTGCTTCATTCCTTGTCATGTGCTAAGTACAAGATACTCCTT
AAGGAGCCAAACACCAAGACTGTCTCCCAGAATGATGCCTTTGAGTTCAACTCCAAATTCACCGATAGAA
TGCGCAGAATCAAGATCCCTCTTCCCCCAGTTGATGAAAGGAAGAAAGTCGTTGAAGATGTCGATAAAGA
CAGAAGATATGCAATTGATGCTGCCATTGTCAGGATCATGAAGAGCAGGAAAGTATTGGGACATCAACAA
CTTGTTTCTGAGTGTGTTGAGCAACTTAGCCGAATGTTCAAGCCTGATATCAAAGCGATCAAGAAGCGTA
TGGAGGATTTAATAACCAGAGATTATTTGGAGAGGGACAAGGAGAATCCTAACATGTTTAGGTACTTGGC
TTAGGGCAAAAAAACAACAACTATGGAAGTGGTTGGCTCATGAAAGGAATCTGCTTGTATATTTAGAAGT
CCATATGGAGACTGTCCTAAAACAAATTTATCGCTTCATTTTCGCTATTTTTCTCTTTTAAAAAATATTC
GGTCTGTGCTTTTTTTTTGGGATGCAAATTTGCCTTGTGGATTTTTGTTTCTTAAATATTGAATGGAGAT
GGAGAAATGGCCTTAATGAATGAATCTCTGCTTTCTAATATATTTATCCTTGATCATATTATTTAAGTTT
TATGTATCTCTGTGTTATATTGACGGATGGGAAAGTCGTAACAAATAATATGAGATTTCTTAT
>gi|334183952|ref|NM_001198486.1| Arabidopsis thaliana F-box/LRR-repeat protein 5 (SKP2B) mRNA, complete cds
GTCGTCGAATTTGGAATATAATTTGTAATAGTACTGTATTCTCCTGTCAGTTTTAGACACGTGGCAGTTC
ACGTGTCATATAGTCATAACCCGTACGTTTACTCTTTGCCTCTTTCCCTTTTATATTCAAAACTCCTTTT
TGATTTTGTCTATCTTATCTCGTGAATCGTTAATTCGTTATCAAAAGAGCTTAAAAAGCTTTAAAAAATT
AACGGATTAGTAATAATTCAACCGAAGAGAAACCCAAGGCACCGAAGAACACGATTTCAGAGAATCAAAG
AAACCGCTTCAAGGATGGTGAGTGAAGGAGCAACAAGAAAAGAACTTAACCTCTGTTTCGAGAATATGAA
GATGGAAGGAGTTTTGATCTCTGAGTGGAAAGATATCCCTGTGGAGCTTCTCATGAAGATTTTAAACCTT
GTTGATGATCGGACTGTGATCATTGCTTCTTGTATTTGTAGTGGCTGGAGAGATGCTGTTTCCCTTGGCC
TCACTCGCCTCTCCCTCTCTTGGTGCAAGAAGAATATGAACAGTTTGGTTCTATCTCTTGCTCCCAAATT
CGTAAAGCTTCAGACTTTAGTACTGCGACAGGACAAACCGCAGCTTGAGGACAACGCGGTGGAAGCCATA
GCAAATCACTGTCATGAGCTACAAGATTTGGACTTAAGCAAAAGCTCGAAAATCACTGACCATTCCCTAT
ATTCACTTGCTCGTGGTTGTACTAACCTGACTAAACTCAACCTTAGCGGCTGCACTTCGTTCAGCGACAC
TGCTCTTGCGCATTTGACAAGATTTTGCAGGAAGCTCAAAATTCTGAATCTTTGTGGTTGTGTTGAAGCT
GTATCTGACAATACATTGCAGGCTATTGGAGAAAACTGCAATCAGTTGCAGTCACTAAACTTGGGATGGT
GTGAGAATATAAGTGATGATGGAGTTATGAGTTTAGCTTATGGTTGTCCTGATTTAAGAACTCTTGATCT
TTGTAGCTGTGTTCTAATCACAGATGAGAGTGTTGTGGCTTTGGCGAATCGGTGCATTCACTTGAGGTCA
TTGGGGTTATACTACTGCAGAAACATTACAGACAGAGCAATGTACTCTTTAGCTCAGAGCGGAGTCAAGA
ACAAACACGAGATGTGGCGAGCGGTAAAGAAAGGAAAATTCGATGAAGAAGGACTAAGAAGCCTTAACAT
TAGTCAATGCACTTACCTAACACCTTCAGCTGTTCAAGCTGTCTGTGATACATTCCCTGCTCTCCACACT
TGTTCAGGCAGACATTCACTTGTCATGAGCGGTTGTTTGAATCTACAATCTGTTCATTGTGCTTGTATCC
TTCAAGCTCACCGCACTCACACCGTTTACCCTCACCCGGCGCATTGAAACGGTGTGTGAGCCAGAGGGTC
TACTACTCTCTAGTATGTGTGTACATACATATAACCATATGGTGTTAATAAAGCTTCTTTGAGTTCCTTC
TTTGTCTTTGATGCAATCTTAAGATTTTAACATTACCTAGTCTTGAAAATCTTGTAATGAATCGCGAAAT
ACTTATTTCTTCTAACAATTTGTTTAAGTTGCATCCATCAATCAATAATCATATCATTA
>gi|186494183|ref|NM_105635.4| Arabidopsis thaliana cullin 3B (CUL3B) mRNA, complete cds
ATGAGTAATCAGAAGAAGAGAAATTTCCAGATTGAAGCGTTTAAGCAACGAGTCGTCGTTGATCCAAAAT
ACGCCGATAAAACTTGGAAGATCCTTGAACATGCGATTCATGAGATTTACAATCACAACGCTAGTGGTCT
CAGTTTCGAAGAGCTTTACAGAAACGCATACAACATGGTTCTACATAAGTATGGTGATAAGCTTTATACT
GGACTTGTTACCACTATGACATTTCATCTCAAAGAGATATGTAAGTCTATTGAAGAAGCTCAAGGAGGAG
CATTTTTAGAATTGCTTAATAGGAAATGGAATGATCATAACAAAGCGTTGCAAATGATTAGGGATATTCT
CATGTATATGGATCGTACTTACGTTTCTACTACTAAGAAAACTCATGTTCATGAGCTTGGACTTCATCTC
TGGAGAGATAATGTTGTGTATTCGAGTAAGATTCAGACTAGGCTATTGAATACGCTTCTTGATTTAGTTC
ATAAGGAACGGACTGGTGAAGTTATAGATAGGGTGTTGATGAGGAATGTGATTAAGATGTTTATGGATTT
AGGTGAATCTGTTTATCAAGATGATTTTGAGAAGCCGTTTTTGGAAGCTTCTGCTGAGTTTTATAAGGTT
GAGTCAATGGAGTTTATTGAGTCTTGTGATTGTGGTGAGTATTTGAAGAAAGCTGAGAAGCCTTTAGTGG
AAGAAGTCGAAAGGGTTGTGAATTATTTGGATGCTAAGAGCGAAGCGAAGATTACTAGTGTGGTTGAAAG
AGAGATGATTGCTAACCATGTGCAGAGACTAGTTCATATGGAGAATTCAGGTTTGGTTAATATGCTTTTG
AATGATAAGTATGAGGATATGGGTAGAATGTATAGTTTGTTCCGTAGGGTTGCTAATGGTCTTGTAACGG
TTAGAGATGTTATGACTTTGCATCTTAGGGAAATGGGAAAACAGTTGGTTACTGATCCAGAGAAATCAAA
GGATCCGGTTGAATTTGTGCAGCGTCTATTGGATGAGCGGGATAAGTATGACAGAATTATCAACATGGCA
TTTAACAACGATAAGACGTTCCAGAATGCGCTAAATTCTTCGTTTGAGTATTTCGTCAACTTGAACACAC
GTTCTCCTGAGTTTATCTCCCTGTTTGTTGATGATAAGCTACGAAAAGGACTAAAAGGTGTGGGAGAGGA
GGATGTCGATCTTATTCTTGATAAGGTGATGATGTTGTTTCGCTACTTACAGGAGAAAGACGTTTTTGAG
AAATATTACAAACAGCATCTGGCGAAAAGGCTTTTATCTGGAAAAACTGTGTCGGATGATGCAGAGAGGA
ATCTGATAGTGAAACTGAAGACGGAATGTGGGTATCAATTCACTTCGAAACTTGAAGGTATGTTCACTGA
CATGAAGACCTCACACGACACGCTGCTAGGATTTTATAATAGCCATCCCGAGCTTTCAGAAGGACCTACA
CTTGTTGTTCAGGTTCTCACAACCGGTTCTTGGCCCACACAGCCAACCATACAATGTAACCTACCTGCAG
AAGTTTCTGTTCTGTGTGAGAAGTTCAGGTCATATTACCTCGGGACTCACACCGGTAGGAGATTGTCTTG
GCAAACGAATATGGGAACAGCAGATATCAAAGCAGTGTTTGGAAAGGGTCAGAAGCATGAACTAAACGTT
TCGACTTTCCAAATGTGTGTCCTTATGTTGTTCAACAACTCTGATCGACTAAGCTACAAAGAGATCGAAC
AGGCAACTGAAATCCCCACCCCAGACCTAAAGCGTTGCTTGCAGTCAATGGCGTGTGTAAAAGGTAAAAA
CGTGCTAAGAAAAGAACCAATGAGCAAGGAGATAGCAGAGGAGGACTGGTTTGTTGTGAACGACAGGTTC
GCAAGCAAGTTCTACAAAGTGAAGATAGGAACTGTGGTGGCACAAAAGGAGACAGAACCAGAGAAGCAAG
AGACAAGACAGAGAGTAGAAGAAGACAGAAAACCTCAGATCGAAGCAGCCATCGTGAGGATAATGAAGTC
TAGACGAGTGTTGGATCACAACAACATAATCGCAGAGGTCACCAAACAGTTGCAGACGCGGTTCTTGGCA
AACCCAACAGAGATAAAGAAGAGAATTGAATCACTCATTGAGCGTGATTTCTTGGAGAGGGATAATACAG
ACCGGAAACTTTACCGCTATCTAGCGTAAAAAAGTCTGGATTGATTACACGGTCCCTCTGTTTATTTCGC
ATCGTTTCTTCTGTTAGTCAGCATTTCTTATTTGTTCTGTAGTCTGGTAAGTTATAAACATTTTGTTTCC
GTTTTGAAAAGAAAATATTGATTTGCC
>gi|238478497|ref|NM_001160872.1| Arabidopsis thaliana uncharacterized protein (AT1G15860) mRNA, complete cds
GATTATTTCCCCATCTAGAAGCTCTCTCTCGACTCTCTCGTCTGTTTCTATCTTTCGTGGTACCTCTTCT
CTTCTCTTTCCTTTTCTGAGTTTCTGTTAATTTTACTCTCTCTTTTTTTTTTTCTTCTCTGTTGCTTATA
TAATAAGATTTGTCTTTCTTTTCCAAAAACTCGTTTTCTCTAATTCTTCTCTGCGATTCTAATCAAATTC
CGTATAGATGCGTCGCTCTTCATCAAAGAAGAAATCAGGTCAATCAACTGAATCAGTCACCACGGATCTC
TTTCGCTCAGCTTCGAGCAAGGCCTCGAATAAAGAGATGGATCGAATTGATCACTTATTTAATCAGTATG
CCAATAAATCTTCCAGCCTGATTGATCCTGAAGGAATAGAGGAACTATGCTCCAATTTGGAAGTGTCACA
TACTGATATCAGAATCTTGATGCTTGCTTGGAAGATGAAAGCTGAGAAACAAGGTTACTTTACACATGAG
GAGTGGAGAAGAGGCCTCAAGGCTTTAAGAGCTGATACGATCAATAAGTTGAAGAAAGCCCTTCCGGAGC
TTGAGAAAGAGGTCAGGAGGCCATCAAATTTTGCAGATTTCTATGCTTATGCCTTCTGTTATTGTTTAAC
AGAGGAAAAACAGAAGAGCATAGACATAGAGACTATATGTCAACTCCTAGAGATCGTCATGGGATCTACA
TTTCGAGCCCAAGTTGACTACTTTGTTGAGTATTTAAAGATCCAAAACGACTACAAAGTGATAAACATGG
ACCAATGGATGGGACTTTACCGGTTCTGTAACGAGATAAGTTTCCCGGACATGGGGGACTATAATCCAGA
GCTTGCATGGCCATTGATTCTTGACAATTTTGTTGAGTGGATTCAAGAAAAACAAGCCTGAAATCATTTC
TGAGTCCCCTCAAGTCGAAGCTTCAAATCTCTGCAGGATGATCAGTGGGCTCTCTCATCAAACAGATTCA
GCACATTTTTACTTCAGTTTTCATCTTTCAAACATTAAAAAAAGACACATTATATGATTCTTGTTACATG
TGATTAACTTCAATAGAGGGAACACATAATGTTTGATTTATTACATCAAGTTCTGTTAGTAGTAACCAAT
GATTTCGAATTAGCTTGTAAACACGTTGTTACCAAATTTATAACCATCAGATTCATTCTGAA
>gi|238478760|ref|NM_103467.2| Arabidopsis thaliana cullin-like protein (AT1G43140) mRNA, complete cds
ATGGCTACAATCTTGTTCAAGGTCATAATGATGAAGGAGTTAATCCTATTGGAGGAAGGATGGTCTGTCA
TGAAGACTGGTGTTGCAAAGCTACAAAGGATTCTAGAAGATTTGTCTGAGCCACCGTTTGACCCCGGTCA
ATATATCAATCTGTACACGATTATCTACGATATGTGTCTCCAACAACCTCCTAATGATTACTCACAAGAG
CTTTATAATAAGTATCGTGGAGTGGTTGATCATTACAATAAAGAAACTGTTTTGCCGTCTATGAGGGAGA
GGCATGGTGAATATATGCTGAGAGAGCTTGTTAAGAGATGGGCTAACCATAAAATTCTGGTTAGATGGTT
ATCTCGCTTCTGCTTTTATCTTGACCGTTTCTATGTTGCTCGGAGAGGTCTTCCAACACTGAATGATGTT
GGCTTCACATCCTTTCACGACCTAGTTTATCAAGAGATACAGTCCGAGGCCAAAGATGTGCTACTAGCAC
TTATTCATAAAGAACGTGAAGGCGAACAGATTGATAGAACACTAGTGAAAAACGTAATAGATGTCTATTG
TGGGAATGGGGTTGGACAGATGGTAATATACGAAGAGGATTTTGAAAGCTTCTTGCTTCAAGATACTGCA
TCTTACTATTCTCGCAAGGCCTCAAGGTGGAGCCAGGAGGATTCTTGTCCTGATTACATGCTAAAGGCTG
AAGAGTGTCTTAAATTGGAGAAGGAAAGAGTCACTAACTACCTTCATTCTACCACTGAGCCCAAACTAGT
TGAGAAAGTACAAAATGAATTGTTGGTAGTGGTTGCAAAACAGCTTATAGAAAATGAGCACTCTGGGTGC
CTTGCATTGTTAAGAGATGACAAGATGGGTGATCTCTCGAGGATGTACAGGCTTTATCGTCTAATCCCGC
AAGGGTTGGAACCTATTGCAGACTTATTCAAGCAGCATGTTACTGCAGAAGGAAATGCCCTTATCAAACA
AGCCGCCGACGCAGCTACTAATCAAGATGCAAGTGCTAGTCAGGTGCTTGTCAGAAAAGAGATTGAACTA
CACGATAAATACATGGTCTATGTAGATGAGTGTTTTCAGAAACACAGCCTCTTCCATAAGCTATTAAAAG
AGGCGTTTGAAGTCTTCTGTAACAAAACAGTGGCTGGAGCGTCCAGTGCAGAAATACTTGCAACCTATTG
TGATAATATCCTCAAGACCAGAGGTGGAAGTGAGAAGCTTAGTGATGAAGCTACTGAAATTACGCTTGAG
AAAGTAGTTAATTTGCTTGTTTATATAAGTGACAAGGATCTTTTCGCCGAGTTTTACAGGAAGAAACAAG
CTCGTCGGCTCTTATTTGATCGCAGCGGAATCATGAAAGAAGTGACGGATATAACATTGGCAAGAGAACT
CCAAACCAACTTCGTGGATTATTTATCAGCAAACATGACAACAAAGCTGGGGATTGATTTTACTGTCACT
GTTCTTACTACTGGTTTTTGGCCAAGTTACAAAACAACAGACCTTAATCTACCCACTGAAATGGTCAACT
GTGTTGAAGCTTTTAAGGTCTTTTATGGAACAAAAACCAATTCCAGGAGACTTTCATGGATTTATTCTCT
TGGAACTTGTCACATTCTTGGAAAATTCGAGAAAAAAACAATGGAGTTAGTCGTTTCCACGTACCAGGCT
GCTGTGCTTTTGCTCTTCAACAACGCAGAGAGATTAAGCTACACCGAGATTTCAGAGCAGCTAAACCTCA
GCCATGAAGATCTTGTCAGGCTGCTTCATTCACTGTCATGCTTAAAGTACAAGATTCTTATAAAGGAACC
AATGTCGAGAACCATCTCGAAAACCGATACTTTCGAATTCAACTCCAAGTTCACAGATAAGATGCGGAAG
ATTAGGGTGCCTTTGCCTCCAATGGATGAGAGGAAGAAAGTAGTTGAAGATGTTGATAAAGATAGACGCT
ATGCAATAGATGCAGCTCTTGTTCGGATCATGAAGAGTAGAAAAGTGTTGGCGCATCAACAGTTAGTCTC
TGAATGTGTTGAGCATCTTAGCAAAATGTTCAAGCCTGATATAAAGATGATAAAGAAACGGATTGAGGAC
TTGATCAATAGAGATTATTTGGAGAGGGATACAGAAAATGCCAACACTTTCAAGTATGTAGCTTAG
>gi|334182604|ref|NM_001198077.1| Arabidopsis thaliana uncharacterized protein (AT1G15860) mRNA, complete cds
ATATAAGAAAGAAATCAATCGTATATCTTCCAATCAGGTGGCTTCGCCTTTCAGATTATTTCCCCATCTA
GAAGCTCTCTCTCGACTCTCTCGTCTGTTTCTATCTTTCGTGATGCGTCGCTCTTCATCAAAGAAGAAAT
CAGGTCAATCAACTGAATCAGTCACCACGGATCTCTTTCGCTCAGCTTCGAGCAAGGCCTCGAATAAAGA
GATGGATCGAATTGATCACTTATTTAATCAGTATGCCAATAAATCTTCCAGCCTGATTGATCCTGAAGGA
ATAGAGGAACTATGCTCCAATTTGGAAGTGTCACATACTGATATCAGAATCTTGATGCTTGCTTGGAAGA
TGAAAGCTGAGAAACAAGGTTACTTTACACATGAGGAGTGGAGAAGAGGCCTCAAGGCTTTAAGAGCTGA
TACGATCAATAAGTTGAAGAAAGCCCTTCCGGAGCTTGAGAAAGAGGTCAGGAGGCCATCAAATTTTGCA
GATTTCTATGCTTATGCCTTCTGTTATTGTTTAACAGAGGAAAAACAGAAGAGCATAGACATAGAGACTA
TATGTCAACTCCTAGAGATCGTCATGGGATCTACATTTCGAGCCCAAGTTGACTACTTTGTTGAGTATTT
AAAGGTTTGGATCACTCAAAAGTCTCACATTATCCAAAACGACTACAAAGTGATAAACATGGACCAATGG
ATGGGACTTTACCGGTTCTGTAACGAGATAAGTTTCCCGGACATGGGGGACTATAATCCAGAGCTTGCAT
GGCCATTGATTCTTGACAATTTTGTTGAGTGGATTCAAGAAAAACAAGCCTGAAATCATTTCTGAGTCCC
CTCAAGTCGAAGCTTCAAATCTCTGCAGGATGATCAGTGGGCTCTCTCATCAAACAGATTCAGCACATTT
TTACTTCAGTTTTCATCTTTCAAACATTAAAAAAAGACACATTATATGATTCTTGTTACATGTGATTAAC
TTCAATAGAGGGAACACATAATGTTTGATTTATTACATCAAGTTCTGTTAGTAGTAACCAATGATTTCGA
ATTAGCTTGTAAACACGTTGTTACCAAATTTATAACCATCAGATTCATTCTGAA
I tried the following script to get the CUL3B protein sequence:
from Bio.Seq import Seq
from Bio import Seq
fastaFile=open("dna_1.fasta")
for Line in fastaFile:
    header = fastaFile.readlines()
    if Line.find("CUL3B"):
        a= print(Line)
        print(a)
        Translate= Seq.translate(Line)
        print(Translate)
But it gives me the error:
TranslationError: Codon '>GI' is invalid
My main problem is this: how can I skip the header lines and work only with the sequence of each record? Kindly help me.
I get no output when I run this script, although I have one gene with the ID CUL3B.
OK, sorry. The function parses the FASTA header and keeps only the first "word", which in this case is
gi|186494183|ref|NM_105635.4|
So when the script checks whether the ID contains the gene name, it finds nothing, because "CUL3B" is not part of this ID. That is why nothing is printed. You might want to try the following code:
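A minimal sketch of such a script, assuming Biopython is installed. The function name translate_matching is hypothetical, and the sketch assumes the coding sequence starts at the first base of the record (the CUL3B record above begins with ATG) and uses the standard genetic code:

```python
from Bio import SeqIO


def translate_matching(source, gene="CUL3B"):
    """Return (header, protein) for every record whose header mentions `gene`.

    `source` can be a file name or an open text handle.
    """
    hits = []
    for record in SeqIO.parse(source, "fasta"):
        # record.id holds only the first word of the header;
        # record.description holds the whole header line without ">".
        if gene in record.description:
            # Translate from the first base and stop at the first stop codon,
            # so the 3' untranslated region is not translated.
            protein = record.seq.translate(to_stop=True)
            hits.append((record.description, str(protein)))
    return hits
```

Calling translate_matching("dna.fasta") and printing each (header, protein) pair shows the result on the screen. If the sequence length is not a multiple of three, Biopython may warn about a partial codon; with to_stop=True the translation still ends at the first stop codon.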
This code reads and parses a FASTA file. If a header contains "CUL3B", it prints the header, then translates the sequence and prints the protein as well.
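The same parsing loop could also cover the first part of the question (name, description, length and reverse complement for every record). This is only a sketch under some assumptions: the helper name summarize is hypothetical, "name" is taken to mean the first word of the header, and "description" the rest of the header line:

```python
from Bio import SeqIO


def summarize(source):
    """Yield (name, description, length, reverse complement) per FASTA record.

    `source` can be a file name or an open text handle.
    """
    for record in SeqIO.parse(source, "fasta"):
        name = record.id  # first word of the header
        # Remainder of the header after the id (may be empty).
        description = record.description[len(record.id):].strip()
        revcomp = str(record.seq.reverse_complement())
        yield name, description, len(record.seq), revcomp
```

Looping over summarize("dna.fasta") and printing the four fields gives one block of output per sequence, without ever passing a header line to the translation step.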
I hope this answers your question.
António