I need to write a program that extracts only the CDS entries from such a file. The CDS entries are described in the FEATURES section, and each one must be followed by its corresponding nucleotide sequence, selectively extracted from the ORIGIN section. The result is a new .txt file with a much simpler structure: it contains, in order, only the CDS descriptions, each followed by its nucleotide sequence.
Example of the output I should get:
CDS 110679..111596
/gene="ENSG00000176695.8"
/protein_id="ENSP00000467301.1"
/note="transcript_id=ENST00000585993.3"
/db_xref="CCDS:CCDS32854"
/db_xref="Uniprot/SWISSPROT:Q8NGA8"
/db_xref="RefSeq_peptide:NP_001005240"
/db_xref="RefSeq_mRNA:NM_001005240"
/db_xref="Uniprot/SPTREMBL:A0A126GWN0"
/db_xref="UCSC:ENST00000585993.3"
/db_xref="EMBL:AB065917"
/db_xref="EMBL:BC136848"
/db_xref="EMBL:BC136867"
/db_xref="EMBL:KP290649"
/db_xref="GO:0004888"
/db_xref="GO:0004930"
/db_xref="GO:0004930"
/db_xref="GO:0004930"
/db_xref="GO:0004984"
/db_xref="GO:0004984"
/db_xref="GO:0005886"
/db_xref="GO:0005886"
/db_xref="GO:0007165"
/db_xref="GO:0007186"
/db_xref="GO:0007186"
/db_xref="GO:0007186"
/db_xref="GO:0007186"
/db_xref="GO:0007608"
/db_xref="GO:0016020"
/db_xref="GO:0016021"
/db_xref="GO:0016021"
/db_xref="GO:0016021"
/db_xref="GO:0050896"
/db_xref="GO:0050911"
/db_xref="HGNC_trans_name:OR4F17-202"
/db_xref="protein_id:AAI36849"
/db_xref="protein_id:AAI36868"
/db_xref="protein_id:ALI87807"
/db_xref="protein_id:BAC06132"
/db_xref="Reactome:R-HSA-162582"
/db_xref="Reactome:R-HSA-372790"
/db_xref="Reactome:R-HSA-381753"
/db_xref="Reactome:R-HSA-388396"
/db_xref="Reactome:R-HSA-418555"
/db_xref="UniParc:UPI0000041E2A"
/translation="MVTEFIFLGLSDSQGLQTFLFMLFFVFYGGIVFGNLLIVITVVS
DSHLHSPMYFLLANLSLIDLSLSSVTAPKMITDFFSQRKVISFKGCLVQIFLLHFFGG
SEMVILIAMGFDRYIAICKPLHYTTIMCGNACVGIMAVAWGIGFLHSVSQLAFAVHLP
FCGPNEVDSFYCDLPRVIKLACTDTYRLDIMVIANSGVLTVCSFVLLIISYTIILMTI
QHRPLDKSSKALSTLTAHITVVLLFFGPCVFIYAWPFPIKSLDKFLAVFYSVITPLLN
PIIYTLRNKDMKTAIRQLRKWDAHSSVKF"
110679..111596
ATGGTGACTGAATTCATTTTTCTGGGTCTCTCTGATTCTCAGGGACTCCAGACCTTCCTATTTATGTTGTTTTTTGTATTCTATGGAGGAAT CGTGTTTGGAAACCTTCTTATTGTCATAACAGTGGTATCTGACTCCCACCTTCACTCTCCCATGTACTTCCTGCTAGCCAACCTCTCACTCA
TTGATCTGTCTCTGTCTTCAGTCACAGCCCCCAAGATGATTACTGACTTTTTCAGCCAGCGCAAAGTCATCTCTTTCAAGGGCTGCCTTGT
TCAGATATTTCTCCTTCACTTCTTTGGTGGGAGTGAGATGGTGATCCTCATAGCCATGGGCTTTGACAGATATATAGCAATATGCAAACC CCTACACTACACTACAATTATGTGTGGCAACGCATGTGTCGGCATTATGGCTGTCGCATGGGGAATTGGCTTTCTCCATTCGGTGAGCC
AGTTGGCCTTTGCCGTGCACTTACCCTTCTGTGGTCCCAATGAGGTCGATAGTTTTTATTGTGACCTTCCTAGGGTAATCAAACTTGCCTG TACAGATACCTACAGGCTAGATATTATGGTCATTGCTAACAGTGGTGTGCTCACTGTGTGTTCTTTTGTTCTTCTAATCATCTCATACACT ATCATCCTAATGACCATCCAGCATCGCCCTTTAGATAAGTCGTCCAAAGCTCTGTCCACTTTGACTGCTCACATTACAGTAGTTCTTTTGT TCTTTGGACCATGTGTCTTTATTTATGCCTGGCCATTCCCCATCAAGTCATTAGATAAATTCCTTGCTGTATTTTATTCTGTGATCACCCCT
CTCTTGAACCCAATTATATACACACTGAGGAACAAAGACATGAAGACGGCAATAAGACAGCTGAGAAAATGGGATGCACATTCTAGT TAAAGTTTTAG
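One possible way to approach the task described above is sketched below in Python, using only the standard library. It is a minimal sketch under several assumptions that are not stated in the question: the file holds a single record, it follows the usual GenBank column layout (feature keys indented by 5 spaces, qualifiers indented further), and a CDS location is either a plain range, a `join(...)` of ranges, or a `complement(...)` wrapping the whole location. The function names (`parse_origin`, `parse_cds_blocks`, `extract_sequence`, `format_cds`, `extract_cds`) are made up for this illustration.

```python
import re

# Matches a feature-key line such as "     CDS             110679..111596".
# Assumption: feature keys are indented by exactly 5 spaces (standard layout).
FEATURE_KEY = re.compile(r"^ {5}(\S+)\s+(\S.*)$")
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def parse_origin(lines):
    """Concatenate all bases listed after the ORIGIN line into one string."""
    parts, in_origin = [], False
    for line in lines:
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False
        elif in_origin:
            # Drop position numbers and spaces, keep only the letters.
            parts.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(parts).upper()


def parse_cds_blocks(lines):
    """Return one dict per CDS feature: its location and its qualifier lines."""
    blocks, current, in_features = [], None, False
    for line in lines:
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if not in_features or not line.strip():
            continue
        if not line[0].isspace():
            # A new top-level section (e.g. ORIGIN) ends the FEATURES table.
            in_features, current = False, None
            continue
        match = FEATURE_KEY.match(line)
        if match:
            current = None
            if match.group(1) == "CDS":
                current = {"location": match.group(2).strip(), "qualifiers": []}
                blocks.append(current)
        elif current is not None:
            text = line.strip()
            if text.startswith("/") or current["qualifiers"]:
                current["qualifiers"].append(text)
            else:
                # The location itself was wrapped over several lines.
                current["location"] += text
    return blocks


def extract_sequence(location, genome):
    """Cut the 1-based, inclusive ranges of a location out of the sequence."""
    ranges = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    seq = "".join(genome[int(start) - 1:int(end)] for start, end in ranges)
    if location.startswith("complement"):
        # Reverse complement; only handles complement(...) around everything.
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


def format_cds(text):
    """Build the simplified output: each CDS description, then its sequence."""
    lines = text.splitlines()
    genome = parse_origin(lines)
    chunks = []
    for block in parse_cds_blocks(lines):
        loc = block["location"]
        rows = ["CDS " + loc] + block["qualifiers"]
        rows += [loc, extract_sequence(loc, genome)]
        chunks.append("\n".join(rows))
    return "\n\n".join(chunks) + "\n"


def extract_cds(in_path, out_path):
    """Read the original file and write the simplified .txt file."""
    with open(in_path) as src:
        text = src.read()
    with open(out_path, "w") as dst:
        dst.write(format_cds(text))
```

It could be called as `extract_cds("input.dat", "cds_only.txt")`, where both file names are placeholders. Locations that mix `complement(...)` inside a `join(...)`, single-base locations, and files with several records are not handled by this sketch; for those cases an established parser such as Biopython's `SeqIO` is probably a safer choice.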