Adding Domains To CDS In GenBank Format
11.9 years ago
Lee Katz ★ 3.2k

Hi, what is the correct way to add protein domains to CDS entries in a GenBank file?

InterProScan annotates a CDS and reports where each protein domain might be (it may also annotate the whole gene, but that is not an issue for me). Suppose a CDS runs from coordinate 330 to coordinate 1178, one domain is found at 342..1170, and a second domain is found at 348..1164. How should this be represented in the GenBank file? Better still, is there a simple way to do it with BioPerl?

The record below shows how I currently write it. When I load it into the Apollo genome viewer, which is my benchmark for correctness, it doesn't look quite right: the interface groups everything into a single misc_feature, with all the features combined.

Thank you for your help!

LOCUS       NODE_80_length_3830_cov_32.131855         3952 bp    dna     linear   UNK
ACCESSION   unknown
FEATURES             Location/Qualifiers
     source          1..3952
                     /mol_type="genomic DNA"
                     /project="K5661"
                     /organism="XXXXXX"
     gene            330..1178
                     /locus_tag="K5661_draft_3226"
     CDS             330..1178
                     /locus_tag="K5661_draft_3226"
                     /product="Sulfate-binding protein sbp"
     misc_feature    342..1170
                     /locus_tag="K5661_draft_3226"
                     /evalue="1.2e-71"
                     /database_name="SUPERFAMILY"
                     /status="T"
                     /evidence=superfamily
                     /product="Sulfate-binding protein sbp"
                     /product="Periplasmic binding protein-like II"
                     /accession_num="SSF53850"
     misc_feature    348..1164
                     /locus_tag="K5661_draft_3226"
                     /evalue="1.9e-131"
                     /database_name="TIGRFAMs"
                     /status="T"
                     /evidence=HMMTigr
                     /product="Sulfate-binding protein sbp"
                     /product="3a0106s03: sulfate ABC transporter,
                     sulfate-bindin"
                     /accession_num="TIGR00971"

[etc, and ORIGIN with the sequence is correctly shown at the end]

Ok... no help on this exact question yet. Can anyone point me to documentation on subfeatures in a GenBank file? I cannot work out from the basic GenBank documentation how to add subfeatures. GFF3 can somehow do it but GenBank apparently cannot, which doesn't make sense to me.

11.9 years ago
Lee Katz ★ 3.2k

I guess the best answer I have so far is this related question:

How Can I Save Bioperl Sequence Nested Features In Genbank Or Embl Format?
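
In the meantime, here is a minimal sketch of one way to write each domain hit from the question as its own misc_feature entry. It is plain Python using only the standard library, not BioPerl, and the helper names (format_feature, domain_to_misc_feature) are hypothetical. Folding the InterProScan fields into a single /note qualifier is an assumption on my part: as far as I know, qualifiers such as /evalue and /database_name are not part of the standard feature table, whereas /note is accepted on misc_feature. This does not nest the domains under the CDS, and I have not checked how Apollo displays the result.

```python
import textwrap

LINE_WIDTH = 79
QUALIFIER_INDENT = " " * 21  # qualifiers start in column 22, as in the record above


def format_feature(key, start, end, qualifiers):
    """Render one feature-table entry as text (hypothetical helper)."""
    # Feature key starts in column 6; the location starts in column 22.
    lines = ["     " + key.ljust(16) + f"{start}..{end}"]
    for name, value in qualifiers:
        # Double any embedded quotes so the quoted value stays well-formed.
        text = '/{}="{}"'.format(name, value.replace('"', '""'))
        # Wrap long values onto continuation lines at the same indent.
        for chunk in textwrap.wrap(text, LINE_WIDTH - len(QUALIFIER_INDENT)):
            lines.append(QUALIFIER_INDENT + chunk)
    return "\n".join(lines)


def domain_to_misc_feature(locus_tag, hit):
    """Fold one domain hit into a misc_feature with a single /note."""
    note = "{db} {acc}: {desc} (E-value {evalue})".format(**hit)
    return format_feature(
        "misc_feature",
        hit["start"],
        hit["end"],
        [("locus_tag", locus_tag), ("note", note)],
    )


# The two hits from the record above; descriptions are copied as posted.
hits = [
    {"start": 342, "end": 1170, "db": "SUPERFAMILY", "acc": "SSF53850",
     "desc": "Periplasmic binding protein-like II", "evalue": "1.2e-71"},
    {"start": 348, "end": 1164, "db": "TIGRFAMs", "acc": "TIGR00971",
     "desc": "3a0106s03: sulfate ABC transporter, sulfate-bindin",
     "evalue": "1.9e-131"},
]

feature_block = "\n".join(
    domain_to_misc_feature("K5661_draft_3226", hit) for hit in hits
)
print(feature_block)
```

The printed block can be pasted into the FEATURES table after the CDS entry. Keeping the same /locus_tag on each domain is what ties it back to its gene, since the flat file has no explicit parent/child link between features.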
