Gentoo Metadoc XML Guide


1.  Introduction

Why is MetadocXML Needed?

MetadocXML is not strictly required; it is an additional resource that lets the Gentoo Documentation Project keep track of documents, even those located outside the normal [gentoo]/xml/htdocs/doc scope.

Thanks to MetadocXML, we can now:

  • track documents that are located inside project webspaces (/proj) instead of the usual documentation repository (/doc)
  • categorize documentation into various categories (or subcategories) with the additional benefit that we can now automatically generate the documentation index (and more)
  • track unofficial documentation team members (such as translators)
  • use parts of big documents (Handbooks) as individual guides on certain topics
  • assign bugs to particular documents for quick reference and with the possibility of masking out a document in case of a major showstopping bug
  • primitively check if a translated file is in sync with its English counterpart or not

Note that the last advantage is primitive and will probably not be extended. Some translation teams use scripts based on trads-doc to manage translations; others use online translation management tools. If you are starting up translations for Gentoo, drop by #gentoo-doc or ask on the mailing list for help.

Translation teams that do not use MetadocXML yet need not worry: they will not lose any current functionality, because MetadocXML only builds upon the existing infrastructure. No change to the GuideXML format requires MetadocXML.

How does MetadocXML Work?

There is one central file in which all meta information on the documentation is maintained. We call this file metadoc.xml. This file should be located inside your main repository (/doc/${LANGUAGE}) although this is not hard-coded.

Inside this file, all meta information is stored:

  • Members of the team
  • Categories in which documents participate
  • Files that are covered
  • Documents that are covered
  • Bugs that are part of a document

Alongside metadoc.xml, one can also have a dynamically generated index file (usually called index.xml), an overview listing of all documentation (usually called list.xml) and an overview listing of all members, files and bugs (usually called overview.xml).

2.  The metadoc.xml File

XML Structure

The metadoc.xml file is started with the usual XML initialisation code and Gentoo CVS header information:

Code Listing 2.1: XML Initialisation

<?xml version='1.0' encoding="UTF-8"?>
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/metadoc.xml,v 1.25 2004/12/23 09:51:30 swift Exp $ -->
<!DOCTYPE metadoc SYSTEM "/dtd/metadoc.dtd">

Then, one starts with the MetadocXML declaration.

Code Listing 2.2: English MetadocXML declaration

<metadoc lang="en">

Translators should reference the main /doc/en/metadoc.xml in the parent attribute. This lets metadoc identify untranslated files and find out whether the versions of translated files still match those of their originals.

Code Listing 2.3: Translated MetadocXML declaration

<metadoc lang="language code" parent="/doc/en/metadoc.xml">

Beneath the metadoc entity, the following entities should be declared (in the given order):

  • version to help keep track of changes
  • members which declares all members of the given language team
  • categories which declares the possible categories used
  • files which contains all files covered by the Metadoc file
  • docs which contains all documents covered by the Metadoc file
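
Put together, the overall layout could be sketched as follows. This is a minimal skeleton under stated assumptions: the version value and the lead nickname are placeholders chosen for illustration, and the sketch assumes that memberof carries a category id as its text content.

```xml
<?xml version='1.0' encoding="UTF-8"?>
<!DOCTYPE metadoc SYSTEM "/dtd/metadoc.dtd">
<metadoc lang="en">
  <!-- Placeholder value; the exact version format is an assumption -->
  <version>1.0</version>
  <members>
    <!-- Hypothetical nickname -->
    <lead>leadnick</lead>
  </members>
  <categories>
    <cat id="faq">Frequently Asked Questions</cat>
  </categories>
  <files>
    <file id="ati-faq">/doc/en/ati-faq.xml</file>
  </files>
  <docs>
    <!-- fileid points at the id of the file entity above -->
    <doc fileid="ati-faq">
      <memberof>faq</memberof>
    </doc>
  </docs>
</metadoc>
```

Each of the five child entities is described in its own section below.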

The Version Entity

The version number should be increased when a document or a file is added or removed, when a path is changed or on any update that might have an impact on translated versions.

The Members Entity

Inside the members entity, one can declare two 'types' of members: lead and member. A lead should be known to Gentoo Developer Relations, as the entity takes only the nickname of the lead developer and looks it up in the Gentoo Memberlist. A member can either be a Gentoo Developer (in which case only a nickname is given) or a contributor.

In case of a contributor, a member tag is given two attributes, mail and fullname, containing the contributor's e-mail address and full name.

Code Listing 2.4: Example use of the members entity

  <member mail="" fullname="John Doe">jdoe</member>
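
For illustration, a complete members block might combine the three cases described above. The nicknames, the full name and the e-mail address below are hypothetical placeholders, not real team members.

```xml
<members>
  <!-- Lead: nickname only, looked up in the Gentoo Memberlist -->
  <lead>leadnick</lead>
  <!-- Gentoo Developer: nickname only -->
  <member>devnick</member>
  <!-- Contributor: mail and fullname are given explicitly -->
  <member mail="jane.roe@example.org" fullname="Jane Roe">jroe</member>
</members>
```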

The Categories Entity

Inside the categories entity one only declares cat entities. Each cat entity covers one Category. It uses one mandatory parameter id which is used to reference the category. You can also define a parameter parent in case the category is a child of another category.

In this case, the parent attribute references the id attribute of the parent category.

Code Listing 2.5: Example use of the categories entity

  <cat id="faq">Frequently Asked Questions</cat>
  <cat id="install">Installation Related Resources</cat>
  <cat id="install_guides">Installation Guides</cat>
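
A child category could be declared as in the following sketch, which assumes that install_guides is meant to be a child of install. The parent value must match the id of another cat entity.

```xml
<categories>
  <cat id="install">Installation Related Resources</cat>
  <!-- parent refers to the id of the category declared above -->
  <cat id="install_guides" parent="install">Installation Guides</cat>
</categories>
```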

The Files Entity

The files entity contains only file entities.

Each file entity references a single XML file. It has a mandatory id attribute which should be seen as a primary key to look up the file. Metadoc compares the file name with the one defined under the same id attribute in the metadoc's parent file (declared in the root element) to find out whether the file is a translation or still untranslated. In the latter case, the file names are identical.

The metadoc file itself can be listed and will appear on the overview page.

Code Listing 2.6: Files entity examples

  <file id="metadoc">/doc/en/metadoc.xml</file>
  <file id="ati-faq">/doc/en/ati-faq.xml</file>
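
To illustrate the comparison, here is a sketch of the files entity in a hypothetical French metadoc file whose root element declares parent="/doc/en/metadoc.xml". The /doc/fr path is an assumption made for the example.

```xml
<files>
  <!-- Path differs from the parent's entry for the same id: a translation -->
  <file id="metadoc">/doc/fr/metadoc.xml</file>
  <!-- Path identical to the parent's entry for "ati-faq": untranslated -->
  <file id="ati-faq">/doc/en/ati-faq.xml</file>
</files>
```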

The Docs Entity

The docs entity should only contain doc entities.

Each doc entity has a mandatory fileid attribute, which refers to the id attribute of a file entity corresponding with the main file for the document.

In case of a handbook chapter, the doc entity must contain a bookref entity which references the main handbook page (the top handbook XML file). This entity then contains two attributes, called vpart and vchap which refer to the corresponding part and chapter of the document inside the handbook.

Inside the doc entity, two other entities are possible:

  • One or more memberof entities, referring to the category or categories in which the document is located (note that a document can be in several categories at once)
  • One bugs entity containing one or more bug entities. A bug entity holds the number of a bug reported against the document. In case of a major bug, one can add the attribute stopper="yes" to the bug entity so that the document does not appear on the generated index page.

Code Listing 2.7: Example Docs entity

<docs>
  <doc fileid="ldap-howto">
    <bugs><bug stopper="yes">1151330</bug></bugs>
  </doc>
  <doc fileid="uml"/>
</docs>
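
A handbook chapter entry might look like the sketch below. The file id, the handbook path, and the part and chapter numbers are hypothetical. The sketch also assumes that bookref carries the handbook's path as its text content and that memberof carries a category id.

```xml
<docs>
  <doc fileid="hb-install-about">
    <memberof>install_guides</memberof>
    <!-- Part 1, chapter 1 of the handbook whose top-level file is named here -->
    <bookref vpart="1" vchap="1">/doc/en/handbook/handbook.xml</bookref>
  </doc>
</docs>
```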

Example metadoc.xml file

The Gentoo site uses a metadoc.xml file to aggregate information about all its documentation. You can view the current version online.

3.  The Additional MetadocXML Files

Automatically Generated Index

When you want an automatically generated index, you should start the document with the following code:

Code Listing 3.1: Dynamically Generated Index

<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet href="/xsl/metadoc.xsl" type="text/xsl"?>
<?xml-stylesheet href="/xsl/guide.xsl"   type="text/xsl"?>

<!-- $Header$ -->

<!DOCTYPE dynamic SYSTEM "/dtd/metadoc.dtd">

<!-- Substitute "/doc/en/metadoc.xml" with the location of your metadoc file -->
<dynamic metadoc="/doc/en/metadoc.xml">
<title>Gentoo Documentation Resources</title>





In between the intro tags you should write one or more sections which will always appear on the top of the page. You will probably want to write an introduction and some additional information for the reader to know who to contact in case of translation mishaps or other issues.

Inside the intro tags you can use plain GuideXML starting from section.

The catid tags refer to the main categories used by the dynamic index. You should list each non-child category that is declared in your metadoc file. Do not list categories that are children of another category.
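
As a rough sketch, the body of such an index file could look as follows. The section text and the category ids (faq and install, borrowed from the categories example) are assumptions, as is the placement of the catid tags directly after the intro. The fragment repeats the opening dynamic tag so that it is well-formed on its own.

```xml
<dynamic metadoc="/doc/en/metadoc.xml">
<title>Gentoo Documentation Resources</title>
<intro>
<!-- Plain GuideXML, starting from section -->
<section>
<title>Introduction</title>
<body>
<p>Welcome to the documentation index.</p>
</body>
</section>
</intro>
<!-- One catid per top-level (non-child) category -->
<catid>faq</catid>
<catid>install</catid>
</dynamic>
```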

Dynamically Generated List Document

A dynamically generated list document starts identically to a dynamically generated index file:

Code Listing 3.2: Dynamically generated list document

<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet href="/xsl/metadoc.xsl" type="text/xsl"?>
<?xml-stylesheet href="/xsl/guide.xsl"   type="text/xsl"?>

<!-- $Header$ -->

<!DOCTYPE dynamic SYSTEM "/dtd/metadoc.dtd">

<!-- Substitute "/doc/en/metadoc.xml" with the location of your metadoc file -->
<dynamic metadoc="/doc/en/metadoc.xml">
<title>Gentoo Documentation Listing</title>

However, there is no intro tag. What needs to be added are all the top categories used by the listing. To differentiate this from the index (which will also display the abstract information on each document) this happens between list tags inside listing:

Code Listing 3.3: Listing of categories
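
A minimal sketch of such a block, assuming faq and install are the top-level categories declared in the metadoc file:

```xml
<listing>
  <!-- One list entry per top-level category id -->
  <list>faq</list>
  <list>install</list>
</listing>
```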


Dynamically Generated Overview Document

The overview document starts in the same way as the two documents described above:

Code Listing 3.4: Dynamically generated overview document

<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet href="/xsl/metadoc.xsl" type="text/xsl"?>
<?xml-stylesheet href="/xsl/guide.xsl"   type="text/xsl"?>

<!-- $Header$ -->

<!DOCTYPE dynamic SYSTEM "/dtd/metadoc.dtd">

<!-- Substitute "/doc/en/metadoc.xml" with the location of your metadoc file -->
<dynamic metadoc="/doc/en/metadoc.xml">
<title>Documentation Development Overview</title>

You can again write up a small introduction in GuideXML between the intro XML tags, starting from a section up. Once that is finished, a single tag <overview/> is sufficient.

Code Listing 3.5: Intro and overview tags
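
A minimal sketch of the complete body, assuming a one-section introduction whose title and text are placeholders. The fragment repeats the opening dynamic tag so that it is well-formed on its own.

```xml
<dynamic metadoc="/doc/en/metadoc.xml">
<title>Documentation Development Overview</title>
<intro>
<!-- Plain GuideXML, starting from section -->
<section>
<title>About This Page</title>
<body>
<p>This page lists all members, files and bugs.</p>
</body>
</section>
</intro>
<!-- A single empty overview tag is sufficient -->
<overview/>
</dynamic>
```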




Page updated September 4, 2011

Summary: This guide informs developers how to use the Metadoc XML format that allows the Gentoo Documentation Project to keep its documentation in a hierarchical manner and allow more information to be stored about each document.

Sven Vermeulen

Xavier Neys

José María Alonso

Copyright 2001-2015 Gentoo Foundation, Inc.