Indexes
An index is worth 1000 sentences. An index is a conceptual concentrate of a book. It presents, in alphabetic order, the concepts that are the subject of the book. The terms used in the index may be the same terms used within the book's text, or may be similar. They represent units of meaning, and a specific meaning can be expressed with a variety of words or phrases. To get an idea of what a book is about, looking at its index may be even more telling than browsing through its table of contents.
An index contains more than an alphabetic list of subjects. The subjects may be further subdivided into more detailed subjects. But we should not assume that a subentry indented below a term is a subcategory of that term. Indexers know that most indexes are "flippable", i.e. that the subentries very often are entries as well. For example, if an entry named "History" has a subentry "United States", this does not mean that "United States" is a subcategory of "History". It is quite likely that in the same index, there will be an entry "United States" having as one of its subentries "History". In other words, an index that presents itself as a hierarchical list of entries and subentries does not usually represent a semantic hierarchy.
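The "flippable" property can be illustrated with a small Python sketch. The sample entries and page numbers below are invented for the illustration, and the function name flip is hypothetical; the only point is that swapping entry and subentry yields another perfectly valid index, which would not be the case if the indentation expressed a true hierarchy.

```python
from collections import defaultdict

# Hypothetical sample data: (entry, subentry) -> page numbers.
index = {
    ("History", "United States"): [12, 48],
    ("History", "France"): [30],
    ("Economy", "United States"): [75],
}


def flip(index):
    """Swap entry and subentry in every pair, keeping the page numbers."""
    flipped = defaultdict(list)
    for (entry, subentry), pages in index.items():
        flipped[(subentry, entry)].extend(pages)
    return dict(flipped)


flipped = flip(index)

# Print the flipped index, grouped the way a printed index would be.
for (entry, subentry), pages in sorted(flipped.items()):
    print(f"{entry}, {subentry}: {pages}")
```

Flipping twice gives back the original pairs, which suggests that the entry/subentry pairing behaves like a symmetric relation between two subjects rather than a parent-child link.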
Each index entry is followed by one or several page numbers, pointing to the pages that contain the text referring to the subject described by the entry. A page is an arbitrary division of a book that is produced by the printing process.
Index entries can be related to other index entries. Two types of references co-exist. A "See" reference points to another entry that is considered synonymous; its target is usually the main index entry, the one displaying the page numbers. A "See also" reference links to another entry that is related to the entry of origin, and both entries display different page numbers.
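The difference between the two kinds of references can be sketched in Python as follows. The Entry class, the lookup function and the sample headings are all hypothetical, chosen only for illustration: a "See" entry carries no page numbers and is redirected to its main entry, while a "See also" is merely reported next to the entry's own pages.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    heading: str
    pages: list = field(default_factory=list)
    see: Optional[str] = None                     # synonym: redirect to main entry
    see_also: list = field(default_factory=list)  # related entries with their own pages


# Hypothetical sample index.
index = {
    "Automobiles": Entry("Automobiles", see="Cars"),
    "Cars": Entry("Cars", pages=[14, 52], see_also=["Trucks"]),
    "Trucks": Entry("Trucks", pages=[60]),
}


def lookup(index, heading):
    """Follow a 'See' redirect if there is one, then report the main entry."""
    entry = index[heading]
    if entry.see is not None:
        entry = index[entry.see]
    return entry.heading, entry.pages, entry.see_also


print(lookup(index, "Automobiles"))
```

In this sketch, looking up "Automobiles" lands on "Cars" and its page numbers, whereas "Trucks" stays a separate entry that the reader may or may not choose to follow.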
When we created the Topic Maps model, we had extensive discussions in the working group on how to interpret the information within an index. These discussions ended up splitting the working group into two projects.
One group considered an index purely from a typographical perspective: entries, subentries, page numbers and links. They described each of these components with a different value and created a schema for the elements of the index. This group ended up creating a model for technical documentation, called DocBook, which described each editorial element the way it appeared in a book.
The other group focused on the conceptual meaning of an index and expanded its interpretation to cover the classic navigational aids in general: tables of contents, glossaries, indexes, catalogs. The Topic Maps architecture defines a topic graph in which nodes are units of meaning. Each of these navigational aids is a filtered view of this topical space. The page numbers are links to document fragments, which could be defined in other ways than pages: they could be, for example, paragraphs or sections. In the topic map model, this specific kind of link is called an "occurrence" of a topic, linking the topic node to all locations where its meaning appears. The "see" and "see also" relations in an index are seen as specific kinds of relationships between topics. The connection between an entry and a subentry can also be described as a relation between two distinct topics.
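A minimal Python sketch may help picture this model. It is not the Topic Maps standard syntax or any existing library: the class names Topic and Association, the functions index_view and related, the topic identifiers and the fragment locators such as "ch2#para4" are all assumptions made for the illustration. Topics are nodes that may carry several names, occurrences are locators of document fragments, and relations between topics are stored separately from the topics themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: list                                      # one meaning, possibly several names
    occurrences: list = field(default_factory=list)  # locators of document fragments


@dataclass
class Association:
    kind: str       # e.g. "see-also" or "entry-subentry"
    members: tuple  # identifiers of the topics it relates


# Hypothetical sample topic map.
topics = {
    "us": Topic(["United States", "USA"], ["ch2#para4", "ch5#sec1"]),
    "history": Topic(["History"], ["ch1#sec1"]),
}
associations = [Association("entry-subentry", ("history", "us"))]


def index_view(topics):
    """One filtered view of the topic space: every name, in alphabetic order."""
    rows = []
    for topic in topics.values():
        for name in topic.names:
            rows.append((name, topic.occurrences))
    return sorted(rows, key=lambda row: row[0].lower())


def related(topic_id, associations, kind):
    """Topics linked to topic_id by an association of the given kind."""
    found = []
    for assoc in associations:
        if assoc.kind == kind and topic_id in assoc.members:
            found.extend(m for m in assoc.members if m != topic_id)
    return found


for name, occurrences in index_view(topics):
    print(name, occurrences)
```

Because the association is stored once and can be read from either member, the same data can produce "History, United States" as well as "United States, History". A glossary or a table of contents would be another function over the same topics, filtering and ordering them differently.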