Gentoo DevBook XML guide
DevBook XML design goals
The DevBook XML syntax is lightweight yet expressive, so that it is easy to learn yet also provides all the features we need for the creation of web documentation. The number of tags is kept to a minimum — just those we need. This makes it easy to transform DevBook XML into other formats, such as DocBook XML/SGML or web-ready HTML.
The goal is to make it easy to create and transform DevBook XML documents.
Basic structure
Let's start learning the DevBook XML syntax. We'll start with the initial tags used in a DevBook XML document:
<?xml version="1.0" encoding="UTF-8"?>
<devbook self="appendices/devbook-guide/">
<chapter>
<title>Gentoo DevBook XML guide</title>
On the first lines, we see the XML declaration that identifies this as an XML
document. Next, there's a <devbook> tag; the entire document is enclosed within
a <devbook> </devbook> pair. Its self attribute must point to the relative path
of the document from the root node; in the example above the path is
appendices/devbook-guide/. An exception is the root node itself, which has
<devbook root="true"> instead.
Next, there is a <chapter> tag. Every document must have exactly one chapter.
Its <title> is used to set the title for the entire document.
All elements must be closed of course, so the document ends with:
</chapter>
</devbook>
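For comparison, here is a minimal sketch of what a root document might look like, assuming it follows the same layout apart from the root attribute. The title and paragraph text are placeholders, not taken from the real manual.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The root node uses root="true" instead of a self attribute -->
<devbook root="true">
<chapter>
<title>Example manual</title>
<body>

<p>
Introductory text for the root node (placeholder).
</p>
</body>
</chapter>
</devbook>
```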
Sections and subsections
Once the initial tags have been specified, you're ready to start adding the structural elements of the document. Chapters are divided into sections; each section can hold zero or more subsections, which can contain zero or more subsubsections. Section, subsection and subsubsection elements must have a title. Here's an example section with a single subsection, consisting of a paragraph:
<section>
<title>This is my section</title>
<subsection>
<title>This is subsection one of my section</title>
<body>
<p>
This is the actual text content of my subsection.
</p>
</body>
</subsection>
</section>
Above, I set the section title by adding a child <title> element to the
<section> element. Then, I created a subsection by adding a <subsection>
element. If you look inside the <subsection> element, you'll see that it has
two child elements: a <title> and a <body>. While the <title> is nothing new,
the <body> is: it contains the actual text content of this particular
subsection. We'll look at the tags that are allowed inside a <body> element in
a bit.
<chapter>, <section>, <subsection> and <subsubsection> elements contain a
<body> and/or any number of section elements of the next lower level. Skipping
of levels is not allowed, e.g., a subsection cannot be directly below a chapter.
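To make the nesting rule concrete, here is a sketch that uses all four levels. All titles and paragraph text are made up for illustration. Note that the section has no body of its own, which the rule above permits because it contains a subsection.

```xml
<chapter>
<title>Example chapter</title>
<body>

<p>
A chapter may carry body text of its own.
</p>
</body>

<section>
<title>Example section</title>
<subsection>
<title>Example subsection</title>
<body>

<p>
This subsection has body text and also a subsubsection.
</p>
</body>
<subsubsection>
<title>Example subsubsection</title>
<body>

<p>
This is the lowest level available.
</p>
</body>
</subsubsection>
</subsection>
</section>
</chapter>
```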
Including sub-documents
The manual is organized as a tree. Each directory contains one document, which
can include multiple sub-documents using the <include href="foo/"/> tag. Note
that the trailing slash in the href value is mandatory.
A table of contents can be generated with <contents/>. Typically, this would be
the only element in its own section body, as in the following example:
<section>
<title>Contents</title>
<body>
<contents/>
</body>
</section>
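Putting the two together, a parent document might look like the sketch below. It assumes that <include> elements sit directly inside the <chapter> and that href is relative to the including document's directory; neither detail is spelled out above. The two sub-document names are taken from paths mentioned elsewhere in this guide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<devbook self="appendices/">
<chapter>
<title>Appendices</title>

<section>
<title>Contents</title>
<body>
<contents/>
</body>
</section>

<!-- Assumed placement: includes after the contents section -->
<include href="devbook-guide/"/>
<include href="todo-list/"/>
</chapter>
</devbook>
```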
An example <body>
Now, it's time to learn how to mark up actual content. Here's the XML code for
an example <body> element:
<p>
This is a paragraph. <c>/etc/passwd</c> is a file.
<uri>https://www.gentoo.org/</uri> is my favorite website.
Type <c>ls</c> if you feel like it. I <e>really</e> want to go to sleep now.
</p>
<pre>
This is text output or code.
# this is user input
</pre>
<codesample lang="sgml">
Make HTML/XML easier to read by using selective emphasis:
<foo>bar</foo>
</codesample>
<note>
This is a note.
</note>
<important>
This is important.
</important>
<warning>
This is a warning.
</warning>
<todo>
Text inside a <c>todo</c> element will appear in the
<uri link="::appendices/todo-list/"/>.
</todo>
Now, here's how the <body> element above is rendered:
This is a paragraph. /etc/passwd is a file.
https://www.gentoo.org/ is my favorite website.
Type ls if you feel like it. I really want to go to sleep now.
This is text output or code.
# this is user input
Make HTML/XML easier to read by using selective emphasis:
<foo>bar</foo>
Note: This is a note.
Important: This is important.
Warning: This is a warning.
Todo: Text inside a todo element will appear in the TODO list.
Body elements
We introduced a lot of new tags in the previous section; here's what you need
to know. The <p> (paragraph), <pre> (preformatted block), <codesample> (code
block), <note>, <important>, <warning> and <todo> tags all can contain one or
more lines of text.
Besides the <figure>, <table>, <ul>, <ol> and <dl> elements (which we'll cover
in just a bit), these are the only tags that should appear immediately inside a
<body> element. Another thing: these tags should not be stacked. In other
words, don't put a <note> element inside a <p> element. As you might guess, the
<pre> and <codesample> elements preserve their whitespace exactly, making them
well-suited for code excerpts. Both <pre> and <codesample> can have a caption
attribute:
<pre caption="Output of uptime">
# uptime
16:50:47 up 164 days, 2:06, 5 users, load average: 0.23, 0.20, 0.25
</pre>
Code samples and colour-coding
The <pre> tag does not support any syntax highlighting. When you need syntax
highlighting, use the <codesample> tag along with a lang attribute; usually you
want this to be set to "ebuild" to syntax highlight ebuild code snippets.
Currently, the following languages are supported:
- c
- ebuild
- make
- m4
- sgml
<codesample> blocks will appear in the displayed HTML page.
Sample <codesample lang="c"> block:
#include <stdio.h>

int main(void)
{
    /* This is a comment */
    printf("Hello, world!\n");
    return 0;
}
You can also specify numbering="lines" to enable line numbering, as in the
following example:
01: # Copyright 1999-2021 Gentoo Authors
02: # Distributed under the terms of the GNU General Public License v2
03:
04: EAPI=7
05:
06: DESCRIPTION="MicroGnuEmacs, a port from the BSDs"
07: HOMEPAGE="https://homepage.boetes.org/software/mg/"
08: SRC_URI="https://github.com/hboetes/${PN}/archive/${PV}.tar.gz -> ${P}.tar.gz"
09:
10: LICENSE="public-domain"
11: SLOT="0"
12: KEYWORDS="alpha amd64 arm hppa ppc ~ppc64 sparc x86"
13:
14: RDEPEND="sys-libs/ncurses:0=
15: >=dev-libs/libbsd-0.7.0"
16: DEPEND="${RDEPEND}"
17: BDEPEND="virtual/pkgconfig"
18:
19: src_install() {
20: dobin mg
21: doman mg.1
22: dodoc README tutorial
23: }
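The markup behind a numbered listing like the one above would look roughly like this. It is a shortened sketch that combines the lang and numbering attributes described in this section; only the first few lines of the ebuild are reproduced.

```xml
<codesample lang="ebuild" numbering="lines">
# Copyright 1999-2021 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2

EAPI=7

DESCRIPTION="MicroGnuEmacs, a port from the BSDs"
</codesample>
```

The line numbers are not typed into the source; they are added when the page is rendered.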
Figures
Here's how to insert a figure into a document: <figure link="mygfx.png"
short="my picture" caption="my favorite picture of all time"/>. The link
attribute points to the actual graphic image, the short attribute specifies a
short description (currently used for the image's HTML alt attribute), and the
caption attribute supplies the caption. Not too difficult :) We also support
the standard HTML-style <img src="foo.gif"/> tag for adding images without
captions, borders, etc.
Tables
DevBook XML supports a simplified table syntax similar to that of HTML. To
start a table, use a <table> tag. Start a row with a <tr> tag. However, for
inserting actual table data, we don't support the HTML <td> tag; instead, use
the <th> if you are inserting a header, and <ti> if you are inserting a normal
informational block. You can use a <th> anywhere you can use a <ti>; there's no
requirement that <th> elements appear only in the first row.
Besides, both table headers (<th>) and table items (<ti>) accept the colspan
and rowspan attributes to span their content across rows, columns or both.
Furthermore, table cells (<ti> and <th>) can be right-aligned, left-aligned or
centered with the align attribute.
This title spans 4 columns | | | |
---|---|---|---|
This title spans 6 rows | Item A1 | Item A2 | Item A3 |
 | Item B1 | Blocky 2x2 title | |
 | Item C1 | | |
 | Item D1..D3 | | |
 | Item E1..F1 | Item E2..E3 | |
 | | Item F2..F3 | |
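A table like the one above could be written as follows. This is a sketch reconstructed from the cell names, not the original source. The colspan and rowspan values are worked out from a four-column grid, and the align value "center" on the first header is an assumption about the accepted spelling.

```xml
<table>
<tr>
  <th colspan="4" align="center">This title spans 4 columns</th>
</tr>
<tr>
  <th rowspan="6">This title spans 6 rows</th>
  <ti>Item A1</ti>
  <ti>Item A2</ti>
  <ti>Item A3</ti>
</tr>
<tr>
  <ti>Item B1</ti>
  <th colspan="2" rowspan="2">Blocky 2x2 title</th>
</tr>
<tr>
  <ti>Item C1</ti>
</tr>
<tr>
  <ti colspan="3">Item D1..D3</ti>
</tr>
<tr>
  <ti rowspan="2">Item E1..F1</ti>
  <ti colspan="2">Item E2..E3</ti>
</tr>
<tr>
  <ti colspan="2">Item F2..F3</ti>
</tr>
</table>
```

Each row only lists the cells that start in it; columns already occupied by a rowspan from an earlier row are simply left out.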
Lists
To create ordered or unordered lists, simply use the XHTML-style <ol>, <ul> and
<li> tags. Lists may only appear inside the <body> and <li> tags which means
that you can have lists inside lists. Don't forget that you are writing XML and
that you must close all tags including list items unlike in HTML.
Definition lists (<dl>) are also supported. Please note that the definition
term tag (<dt>) does not accept any other block level tag such as paragraphs or
admonitions. A definition list comprises:
<dl>
    A Definition List Tag containing
<dt>
    Definition Term Tags
<dd>
    and Definition Data Tags.
The following example copied from w3.org shows that lists may also be nested and different list types may be used together:
The ingredients:
    - 100 g flour
    - 10 g sugar
    - 1 cup water
    - 2 eggs
    - salt, pepper
The procedure:
    1. Mix dry ingredients thoroughly.
    2. Pour in wet ingredients.
    3. Mix for 10 minutes.
    4. Bake for one hour at 300 degrees.
Notes:
    The recipe may be improved by adding raisins.
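The recipe above could be marked up along these lines. This is a sketch rather than the original source: it assumes that lists are accepted inside <dd>, as the rendered output suggests, and that the procedure is an ordered list.

```xml
<dl>
  <dt>The ingredients:</dt>
  <dd>
    <ul>
      <li>100 g flour</li>
      <li>10 g sugar</li>
      <li>1 cup water</li>
      <li>2 eggs</li>
      <li>salt, pepper</li>
    </ul>
  </dd>
  <dt>The procedure:</dt>
  <dd>
    <ol>
      <li>Mix dry ingredients thoroughly.</li>
      <li>Pour in wet ingredients.</li>
      <li>Mix for 10 minutes.</li>
      <li>Bake for one hour at 300 degrees.</li>
    </ol>
  </dd>
  <dt>Notes:</dt>
  <dd>The recipe may be improved by adding raisins.</dd>
</dl>
```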
Inline elements
<c>, <b>, <e>, <sub> and <sup>
The <c> element is used to mark up a command or user input. Think of <c> as a
way to alert the reader to something that they can type in that will perform
some kind of action. For example, all the XML tags displayed in this document
are enclosed in a <c> element because they represent something that the user
could type in that is not a path. By using <c> elements, you'll help your
readers quickly identify commands that they need to type in. Also, because <c>
elements are already offset from regular text, it is rarely necessary to
surround user input with double-quotes. For example, don't refer to a "<c>"
element like I did in this sentence. Avoiding the use of unnecessary
double-quotes makes a document more readable, and adorable!
As you might have guessed, <b> is used to boldface some text.
<e> is used to apply emphasis to a word or phrase; for example: I really should
use semicolons more often. As you can see, this text is offset from the regular
paragraph type for emphasis. This helps to give your prose more punch!
The <sub> and <sup> elements are used to specify subscript and superscript.
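Here is a short sketch that uses all five inline elements in one paragraph. The sentence content, including the emerge command, is purely illustrative.

```xml
<p>
Run <c>emerge --sync</c> before installing anything. This step is
<e>not</e> optional. <b>Back up your data first.</b> Water is
H<sub>2</sub>O, and the area of a square with side x is x<sup>2</sup>.
</p>
```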
<uri>
The <uri> tag is used to point to files/locations on the Internet. It has two
forms. The first can be used when you want to have the actual URI displayed in
the body text, such as this link to https://www.gentoo.org/. To create this
link, I typed <uri>https://www.gentoo.org/</uri>. The alternate form is when
you want to associate a URI with some other text, for example, the Gentoo Linux
website. To create this link, I typed <uri link="https://www.gentoo.org/">the
Gentoo Linux website</uri>.
Please avoid the click here syndrome as recommended by the W3C.
Intra-document references
DevBook XML makes it really easy to reference other parts of the document using
hyperlinks. You can create a link pointing to another chapter, like Ebuild file
format, by typing <uri link="::ebuild-writing/file-format/">Ebuild file
format</uri>, i.e. two colons followed by the relative path from the root node.
To refer to a section in another chapter, like First ebuild, type
<uri link="::quickstart/#First ebuild">First ebuild</uri>.
If the link target's chapter (or section etc.) title is to be used as the link
text, an empty <uri> element can be used. As a matter of fact, I could have
written the two examples above in more compact form:
<uri link="::ebuild-writing/file-format/"/> and
<uri link="::quickstart/#First ebuild"/> render as Ebuild file format and First
ebuild, respectively.
Coding style
Since all Gentoo Documentation is a joint effort and several people will most likely change existing documentation, a coding style is needed. A coding style contains two sections. The first one is regarding internal coding — how the XML-tags are placed. The second one is regarding the content — how not to confuse the reader.
Both sections are described next.
Internal coding style
Newlines must be placed immediately after every DevBook XML tag (both opening
and closing), except for: <title>, <th>, <ti>, <li>, <dt>, <dd>, <b>, <c>, <e>,
<d/>, <uri>.
Blank lines must be placed immediately after every <body> (opening tag only)
and before every <section>, <p>, <pre>, <codesample>, <figure>, <table>, <ul>,
<ol>, <dl>, <note>, <important> and <warning> (opening tags only). An exception
to this rule applies to tags that are located within list items or table cells.
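The newline and blank-line rules are easier to see in an example. The following sketch uses made-up content: there is a blank line after the opening <body> and before the <section>, <p>, <note> and <ul> opening tags, while <title> and <li> keep their content on the same line.

```xml

<section>
<title>Example section</title>
<body>

<p>
A paragraph, preceded by a blank line.
</p>

<note>
A note, also preceded by a blank line.
</note>

<ul>
  <li>A short item</li>
  <li>Another short item</li>
</ul>
</body>
</section>
```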
Word-wrapping must be applied at 80 characters except inside <pre> and
<codesample>. You may only deviate from this rule when there is no other choice
(for instance when a URL exceeds the maximum amount of characters). The editor
must then wrap whenever the first whitespace occurs. You should try to keep the
rendered content of <pre> and <codesample> elements within 80 columns to help
console users.
Indentation may not be used, except with the XML-constructs of which the parent
XML-tags are <tr> (from <table>), <ul>, <ol> and <dl>. If indentation is used,
it must be two spaces for each indentation. That means no tabs and not more
spaces. Besides, tabs are not allowed in DevBook XML documents (again, except
for <pre> and <codesample>).
In case word-wrapping happens in <ti>, <th>, <li> or <dd> constructs,
indentation must be used for the content.
An example for indentation is:
<table>
<tr>
  <th>Foo</th>
  <th>Bar</th>
</tr>
<tr>
  <ti>This is an example for indentation</ti>
  <ti>
    In case text cannot be shown within an 80-character wide line, you
    must use indentation if the parent tag allows it
  </ti>
</tr>
</table>

<ul>
  <li>First option</li>
  <li>Second option</li>
</ul>
Opening tags with a single attribute may not be split between lines. For
example, don't put a newline between <uri and its link attribute. Break the
line before the <uri> tag instead.
Attributes may not have spaces in between the attribute, the "=" mark, and the attribute value. As an example:
Wrong : <uri link = "https://www.gentoo.org/">
Correct: <uri link="https://www.gentoo.org/">
Dashes used as in-sentence punctuation — like this — should be
written as a <d/> tag surrounded by spaces. It would be difficult
to distinguish a Unicode em-dash from a hyphen when editing the source using a
fixed-width font.
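In the source, such a sentence would be written like this (a small sketch with illustrative wording):

```xml
<p>
Dashes used as in-sentence punctuation <d/> like this <d/> are written
with the empty <c>d</c> element.
</p>
```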
External coding style
Inside tables (<table>) and listings (<ul>, <ol> and <dl>), periods (".")
should not be used unless multiple sentences are used. In that case, every
sentence should end with a period (or another punctuation mark).
Every sentence, including those inside tables and listings, should start with a capital letter.
<ul>
  <li>No period</li>
  <li>With period. Multiple sentences, remember?</li>
</ul>
Titles should use sentence case, i.e. their first word should start with a capital letter, and all other words (except proper nouns) should be in lower case.
Try to use <uri> with the link attribute as much as possible. In other words,
the Gentoo website is preferred over https://www.gentoo.org/.
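As a sketch, here are the two forms side by side. The surrounding sentence text is illustrative; both use the <uri> syntax described earlier.

```xml
<p>
Preferred: see the <uri link="https://www.gentoo.org/">Gentoo website</uri>
for details.
</p>

<p>
Discouraged: see <uri>https://www.gentoo.org/</uri> for details.
</p>
```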