10 CATALOGUE FILES

 10.1 Catalogue File Name and Format
 10.2 A Catalogue File Example
 10.3 How Catalogue Files are Searched
 10.4 Providing a Local Catalogue of Remote Documents

Catalogue files let you bring documents into the searches performed by the findme command that it would not otherwise be able to search. This is of particular benefit if some documents are not stored locally, or are not in hypertext format.

10.1 Catalogue File Name and Format

An HTX catalogue file is a text file with the name “htx.catalogue” which resides in a document library. Each line of the file should contain an entry consisting of three fields separated by white space, in the form:

  docname docfile title_text

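where:

  docname     is the name by which the document is known,
  docfile     is the name of the file that contains the document, and
  title_text  is the document's title, which may itself contain spaces.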

10.2 A Catalogue File Example

Suppose, for example, that you are converting an existing documentation set into hypertext form, but still have some documents available only in DVI and PostScript formats (with file extensions “.dvi” and “.ps”). The findme command will not be able to search these “old” documents because it does not know how to extract information such as their titles from the files provided. To help overcome this, you would describe these documents in a catalogue file, perhaps along the following lines:

  review doc1.ps A Review of Documentation Systems
  intro doc2.dvi Introduction to Hypertext
  writing writing.ps How to Communicate Effectively
  ...

Note that the document name and file name need not match. This file introduces the listed documents to the findme command, tells it where to find the corresponding document files and allows it to perform searching by document name and/or title (but not by page heading or lines of textual content, since it cannot know how to decode the document format to obtain these).
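As an illustration of the three-field format, the following Python sketch splits catalogue lines into their fields. It is not part of HTX: the names CatalogueEntry and parse_catalogue_line are invented for this example, and skipping lines with fewer than three fields is an assumption, since the treatment of malformed lines is not described here.

```python
from typing import NamedTuple, Optional


class CatalogueEntry(NamedTuple):
    docname: str  # name by which the document is known
    docfile: str  # file that contains the document
    title: str    # title text; may contain spaces


def parse_catalogue_line(line: str) -> Optional[CatalogueEntry]:
    """Split one catalogue line into its three fields.

    Returns None if the line has fewer than three fields (an
    assumption made for this sketch only).
    """
    # Split on runs of white space at most twice, so that the
    # title keeps its internal spaces.
    fields = line.split(None, 2)
    if len(fields) < 3:
        return None
    docname, docfile, title = fields
    return CatalogueEntry(docname, docfile, title.strip())


example = """\
review doc1.ps A Review of Documentation Systems
intro doc2.dvi Introduction to Hypertext
writing writing.ps How to Communicate Effectively
"""

entries = [e for e in map(parse_catalogue_line, example.splitlines()) if e]
for entry in entries:
    print(entry.docname, "->", entry.docfile, "|", entry.title)
```

Limiting the split to two separators is what allows the title to be an arbitrary run of words while the first two fields stay single tokens.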

10.3 How Catalogue Files are Searched

Documents listed in HTX catalogue files are added into the documentation set after all hypertext documents have first been found using the HTX_PATH search path (see §2.3). If a document is found in hypertext form, it occludes any subsequent occurrence of a document with the same name in a catalogue file. This means that if you convert an “old” document into hypertext form (with a “.htx” file extension), the new version will automatically be found in preference to the old one, and there is no need to remove the old entry from the catalogue file.

Catalogue files are also found by following the HTX_PATH search path after it has been used to find hypertext documents, and are recognised by the name “htx.catalogue”. If more than one catalogue file is found, their contents are simply concatenated in the order in which they are found.

Duplicate entries for a document are permitted in catalogue files (and can also arise when catalogue files are concatenated). They provide a mechanism for a document to have alternative titles. This can sometimes improve the usefulness of document searches if the original title lacks any useful keywords (you might think of this as combining both a title index and a subject index into the same file). If more than one title entry is matched for a particular document, then the one that occurs first in the catalogue file(s) is used.
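The search rules above can be summarised in a short Python sketch. It only models the behaviour as described in this section and is not the HTX implementation: the function names are invented here, and matching a keyword by case-insensitive substring is an assumption made purely for illustration.

```python
from typing import Iterable, List, Optional, Tuple

Entry = Tuple[str, str, str]  # (docname, docfile, title)


def build_search_set(hypertext_docs: Iterable[str],
                     catalogues: Iterable[List[Entry]]) -> List[Entry]:
    """Return the catalogue entries that remain visible to a search."""
    occluding = set(hypertext_docs)
    visible: List[Entry] = []
    for catalogue in catalogues:       # in the order found on the path
        for entry in catalogue:        # concatenation keeps line order
            if entry[0] not in occluding:
                visible.append(entry)  # no hypertext version hides it
    return visible


def first_matching_title(entries: List[Entry], docname: str,
                         keyword: str) -> Optional[str]:
    """Return the first title of docname that contains keyword."""
    for name, _docfile, title in entries:
        # Assumed matching rule: case-insensitive substring.
        if name == docname and keyword.lower() in title.lower():
            return title
    return None


first_catalogue = [
    ("review", "doc1.ps", "A Review of Documentation Systems"),
    ("intro", "doc2.dvi", "Introduction to Hypertext"),
]
second_catalogue = [
    # A duplicate entry that gives "review" an alternative title.
    ("review", "doc1.ps", "Comparison of Hypertext Documentation Tools"),
]

# "intro" has already been found in hypertext form, so its
# catalogue entry is occluded.
visible = build_search_set(["intro"], [first_catalogue, second_catalogue])
print(first_matching_title(visible, "review", "documentation"))
print(first_matching_title(visible, "review", "hypertext"))
```

In this example both titles of “review” contain the word “documentation”, so the one from the first catalogue is used; only the alternative title matches “hypertext”.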

10.4 Providing a Local Catalogue of Remote Documents

The documents listed in HTX catalogue files need not necessarily exist on the local file system. HTX will check to see if they do, and will generate hyper-links to them if they appear to be readable. For files that are not accessible, however, it will generate a reference to the remote document server (see §6). This reference will take the standard form (see §6.1), using the document name, not the file name used in the catalogue file.

Catalogue files can therefore be used as a searchable catalogue of documents that are available remotely. In fact, a document library containing only a catalogue file could be searched by the findme command and any matches would then refer to the remote version of the document, in whatever form it happens to be stored.
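The choice between a local link and a remote reference might be sketched as follows. This is an illustration only: resolve_link is an invented name, and REMOTE_TEMPLATE is a hypothetical placeholder, since the actual form of a remote reference is the one defined in §6.1 and is not reproduced here.

```python
import os
from pathlib import Path

# Hypothetical placeholder for the remote reference form (see §6.1).
REMOTE_TEMPLATE = "http://docserver.example/{docname}"


def resolve_link(library_dir: str, docname: str, docfile: str) -> str:
    """Return a link for one catalogue entry.

    A readable local file yields a link to that file; otherwise the
    result refers to the remote server by document name.
    """
    path = Path(library_dir) / docfile
    if path.is_file() and os.access(path, os.R_OK):
        return path.resolve().as_uri()
    # Note that the document name is used here, not the file name.
    return REMOTE_TEMPLATE.format(docname=docname)
```

Called for an entry whose file is missing, for instance resolve_link("/docs", "intro", "doc2.dvi") on a system with no such file, the function falls back to the remote form built from the name “intro”.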