"Linux Gazette...making Linux just a little more fun!"
Writing Documentation, Part III: DocBook/XML
By Christoph Spiel
To quote ``DocBook -- The Definitive Guide'' (see Further Reading at the end of this section), DocBook provides a system for writing structured documents using SGML or XML. In the following, I shall focus on the XML variant of DocBook, because the SGML variant is being phased out.
DocBook has been developed with a slightly different mindset than the systems I discussed in the two previous articles (POD article, LaTeX/latex2html article).
The particular features of DocBook mentioned imply uses of DocBook documents that are not possible, at least not easily, with POD or LaTeX documents.
Being general-purpose translators, neither tool is restricted to transforming DocBook documents. If you feed them the right style sheets, they will do other translations, too.
Syntax
The DocBook/XML syntax resembles HTML. The fundamental difference between the two is the strictness with which the syntax is enforced. Many HTML browsers are extremely forgiving about unterminated elements, and they often silently ignore unknown elements or attributes. DocBook/XML translators reject input that does not comply with the DTD, print detailed error messages, and refuse to produce any output in such cases.
DocBook/XML is spoken in several variants, which differ in how they interpret the closing tag of an element. The most verbose dialect always closes every element with an explicit closing tag.
Special characters are written with the ampersand-semicolon convention, as they are in HTML. The most frequently used special characters are those that would otherwise be read as markup: ``&lt;'' for ``<'', ``&gt;'' for ``>'', and ``&amp;'' for ``&''.
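As a minimal sketch of the ampersand-semicolon convention, the paragraph below uses the three entity references that XML predefines for its own markup characters. The sentence itself is made up for illustration.

```xml
<para>
    <!-- &lt; and &gt; stand for the angle brackets, &amp; for the ampersand -->
    The test <literal>a &lt; b &amp;&amp; c &gt; b</literal> is true
    only if b is larger than a and smaller than c.
</para>
```

Writing a bare ``<'' or ``&'' in the text instead would make the input ill-formed, and a strict translator would reject it.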
Comments are bracketed between ``<!--'' and ``-->''.
Document Structure
As already mentioned, DocBook documents must adhere to the structure that is defined in a DTD. Every document starts by selecting a particular DTD:
<!DOCTYPE (1)
book (2)
PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN" (3)
"/usr/share/sgml/db41xml/docbookx.dtd" (4)
[ ] (5)
>
where I have broken the expression (from ``<'' to ``>'') into several lines for easier analysis, and added numbers in parentheses for reference. Part (1) tells the system that we are about to choose our DTD.
Part (2) defines element ``book'' to be the root element of the document. Part (3) is the public identifier of the DTD, part (4) is the system identifier, here the location of the DTD file on the local system, and part (5) encloses the internal subset, which is empty here.
Now, we start the text with the root element, in our case ``book''.
What might look like a drag at first sight -- Rules? Rules suck! -- is the key to opening up the document to programmatic access. As the document complies with the DTD, all post-processing can rely on that very fact. Good for the programmers of the post-processors!
I have to admit that the number of elements and the elements' mutual relationships are tough to pick up. However, the relations are logical: a chapter contains one or more (introductory) paragraphs and one or more Level 1 sections. No section, on the other hand, contains a chapter; that would be nonsense. Having a copy of ``The Definitive Guide'' right next to the keyboard also helps to learn DocBook. Further down, there is a short compilation of commonly used tags.
Here comes a very short, but complete DocBook document.
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN"
    "/usr/share/sgml/db41xml/docbookx.dtd" []>
<book>
    <bookinfo>
        <title>XYZ (version 0.8.15) User's Manual</title>
    </bookinfo>
    <chapter id = "chapter-introduction">
        <title>Introduction</title>
        <para>
            This chapter provides a quick introduction to XYZ.
        </para>
        <sect1 id = "section-syntax">
            <title>Syntax</title>
            <para>
                In this section we present an outline of the
                syntax of the XYZ language.
            </para>
        </sect1>
        <sect1 id = "section-core-library">
            <title>Core Library</title>
            <para>
                Even if no additional libraries are loaded to a
                XYZ program, it has access to some core library
                functions.
            </para>
        </sect1>
    </chapter>
    <chapter id = "chapter-commands">
        <title>Commands</title>
        <sect1 id = "section-interactive-commands">
            <title>Interactive Commands</title>
            <para>
                ...
            </para>
            <sect2 id = "section-interactive-commands-argumentless">
                <title>Argumentless Commands</title>
                <para>
                    ...
                </para>
            </sect2>
        </sect1>
        <sect1 id = "section-non-interactive-commands">
            <title>Non-Interactive Commands</title>
            <para>
                ...
            </para>
            <sect2 id = "section-non-interactive-commands-argumentless">
                <title>Argumentless Commands</title>
                <para>
                    ...
                </para>
            </sect2>
        </sect1>
    </chapter>
</book>
Useful Tags
To help the aspiring DocBook writer make sense of the many elements the DocBook standard defines, I have compiled a list of useful tags that are used often.
Root Section Tags
Root section tags define the outermost element of any document.
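The complete example above uses ``book'' as its root. For shorter documents, ``article'' is another possible root element. The following is only a rough sketch, assuming the same DTD location as in the book example; the title and the section id are invented.

```xml
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN"
    "/usr/share/sgml/db41xml/docbookx.dtd" []>
<!-- the name after DOCTYPE must match the root element -->
<article>
    <articleinfo>
        <title>XYZ Quick Reference</title>
    </articleinfo>
    <!-- an article has no chapters; it starts directly with sections -->
    <sect1 id = "section-overview">
        <title>Overview</title>
        <para>
            ...
        </para>
    </sect1>
</article>
```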
Sectioning Tags
Sectioning elements divide the document into logical parts like chapters, sections, paragraphs, and so on.
List-Making Tags
These tags generate the three typical types of lists. The items or definitions are typically formed by one or more paragraphs, but they are allowed to contain program listings, too. The terms usually are one or more words, not paragraphs.
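The three list types can be sketched as follows, assuming the usual DocBook elements ``itemizedlist'', ``orderedlist'', and ``variablelist''. This is a fragment meant to be placed inside a section, not a complete document, and all item texts are invented.

```xml
<!-- bulleted list -->
<itemizedlist>
    <listitem>
        <para>First item.</para>
    </listitem>
    <listitem>
        <para>Second item.</para>
    </listitem>
</itemizedlist>

<!-- numbered list -->
<orderedlist>
    <listitem>
        <para>Unpack the archive.</para>
    </listitem>
    <listitem>
        <para>Run the installer.</para>
        <!-- an item may hold a program listing besides paragraphs -->
        <programlisting>./install.sh</programlisting>
    </listitem>
</orderedlist>

<!-- list of terms with their definitions -->
<variablelist>
    <varlistentry>
        <term>verbose</term>
        <listitem>
            <para>Print a message for every file processed.</para>
        </listitem>
    </varlistentry>
</variablelist>
```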
Inline Markup Tags
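Inline markup tags annotate words or phrases inside running text. As a small sketch, the paragraph below uses four inline elements of the DocBook DTD; the file name, command name, and option are invented for illustration.

```xml
<para>
    <!-- emphasis marks stressed text -->
    Do <emphasis>not</emphasis> edit
    <!-- filename marks the name of a file or directory -->
    <filename>/etc/xyz.conf</filename> by hand; run
    <!-- command marks an executable, option one of its switches -->
    <command>xyz-config</command> <option>--reset</option> instead.
</para>
```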
Cross References
Cross references refer to other parts of the same DocBook document or to other documents on the World Wide Web. Targets of the former are all elements that carry an ``id'' attribute.
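Both kinds of reference can be sketched with the DocBook elements ``xref'' and ``ulink''. The id ``section-syntax'' is taken from the example document; the URL is a placeholder, not a real address of the project.

```xml
<para>
    <!-- xref points at the element whose id equals linkend;
         the style sheet generates the link text -->
    The syntax is outlined in <xref linkend = "section-syntax"/>.
    <!-- ulink points at a document on the Web -->
    More information is available from the
    <ulink url = "http://www.example.org/xyz/">XYZ home page</ulink>.
</para>
```

Because ``linkend'' must name an existing id, a validating translator reports a dangling internal reference instead of silently producing a broken link.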
What I Have Left Out
Ugh, I left out tons of stuff, but only to give you a smooth, non-frightening introduction. DocBook handles some great things that I have not discussed.
Also left out is everything related to changing the DTD or changing the style sheets.
Pros and Cons
Further Reading
Next month: Texinfo
Chris runs an Open Source Software consulting company in Upper Bavaria, Germany.
Despite being trained as a physicist -- he holds a PhD in physics from Munich
University of Technology -- his main interests revolve around numerics,
heterogeneous programming environments, and software engineering. He can be
reached at
cspiel@hammersmith-consulting.com.
