Pragmatically, I would use Emacs org-mode to write a master document and 'tangle' the SQL DDL, LaTeX and HTML documentation out of the single source. The advantage is that it's very simple to do (and org-mode is included in Emacs 23).
First ensure that SQL is enabled for org-mode, in the .emacs file:
(org-babel-do-load-languages 'org-babel-load-languages
'((dot . t) ; Graphviz dot
(sh . t)
(python . t)
(perl . t)
(ruby . t)
(R . t)
(sql . t))) ; add this pair if necessary
Here's the text input:
* My example schema
Schema overview goes here.
** Organism table
Explain the purpose of the 'organism' table.
#+begin_src sql :tangle schema.ddl :comments org
CREATE TABLE organism (
id INTEGER PRIMARY KEY,
taxid INTEGER NOT NULL,
name varchar(512) NOT NULL
);
#+end_src
** Chromosome table
Explain the purpose of the 'chromosome' table.
#+begin_src sql :tangle schema.ddl :comments org
CREATE TABLE chromosome (
id INTEGER PRIMARY KEY,
name varchar(128) NOT NULL,
length INTEGER NOT NULL
);
#+end_src
*** Organism constraint
Explain the purpose of a foreign key.
#+begin_src sql :tangle schema.ddl :comments org
ALTER TABLE chromosome ADD COLUMN organism INTEGER NOT NULL;
ALTER TABLE chromosome ADD CONSTRAINT chromosome_organism_fk \
FOREIGN KEY (organism) REFERENCES organism(id);
#+end_src
I can now generate the DDL (in this case, including docs as SQL comments), HTML and PDF via LaTeX. Since you get all the org-mode features, you can insert or generate other elements such as diagrams, cross-references and so on. Here's the DDL generated in the schema.ddl file when running M-x org-babel-tangle.
-- Organism table
-- Explain the purpose of the 'organism' table.
CREATE TABLE organism (
id INTEGER PRIMARY KEY,
taxid INTEGER NOT NULL,
name varchar(512) NOT NULL
);
-- ** Chromosome table
-- Explain the purpose of the 'chromosome' table.
CREATE TABLE chromosome (
id INTEGER PRIMARY KEY,
name varchar(128) NOT NULL,
length INTEGER NOT NULL
);
-- *** Organism constraint
-- Explain the purpose of a foreign key.
ALTER TABLE chromosome ADD COLUMN organism INTEGER NOT NULL;
ALTER TABLE chromosome ADD CONSTRAINT chromosome_organism_fk \
FOREIGN KEY (organism) REFERENCES organism(id);
You can also go the other way and include executable SQL queries in the org-mode file which insert their results back into that file when they are run (instead of producing source code as above). This way real examples and results may be merged back into the documentation source.
Are there tools available that apply this idea?
I understand that, I've used Protege a lot. Ontology development is a specialised skill - if you're recommending this approach, I hope that I wouldn't have to first devise and test an ontology, then write a tool to map my relational model to OWL (or whatever) and then write a rendering tool to give human-readable documentation (in this case).
No, you don't. Ontologies can be complex, using upper ontologies, etc. But despite what some want people to believe, the can be small, simple, and just reflecting your schema. In fact, that is what it recommended by people in the Health Care and Life Sciences working group for the adoption of semantic web technologies (like OWL). KISS is perfectly fine in RDF. Just look at FOAF, or the RDF used by Google and Yahoo to enrich search results.
Well strictly speaking, RDF is completely schemaless and DOES NOT require ANY ontology....it just helps to find your way round...
The documentation of what is what in the relational database is done via an ontology, e.g. tables would be classes, rows could be instances or subclasses, and columns are predicates. There are many tools availale, Protege would be the best know to define the ontology used to describe the database schema.
Do you express the semantics of your RDB schema as a specialised ontology (created e.g. with Protege) and then use that to generate your SQL DDL and accompanying documentation? Perhaps I should open this as a question...
I added a very basic approach in my answer.
I added a very basic approach in my answer. No need for Protege, but some people like GUIs. The example is, in fact, not using OWL, but merely RDF Schema.
Thanks for your patience. That's a lot clearer to me now.
My apologies for being too brief to start with :)
Oh, also added a link to the D2RQ tool (first paragraph) which my might find interesting.
Yes, important point. RDF gives flexibility here.