### 8 SEARCHING FOR INFORMATION IN DOCUMENTS

In addition to letting you access documents directly by name (normally the fastest method if you know where to find the information you want), HTX lets you search for information by keyword.

#### 8.1 Performing Keyword Searches

Keyword searches are performed with the findme command, for example:

findme HTX

This command will search your documentation set for the string “HTX” and will then display a list of the documents found using your WWW browser, with each entry in the list being a hyper-link to the document in question. It is then a simple matter to follow the link to the document you want to read (in this example there will probably only be one document to choose from – this one).

The way in which the findme command performs its search is explained in the next section, but in essence it attempts to find information about major topics quickly by searching only the main titles of documents. It then goes on to consider more detailed (and time consuming) searches only for more obscure topics that can’t be found readily. The progress of the search is displayed on your terminal, so you can interrupt it if you fail to find what you want quickly and don’t want to wait.

The more levels of detail findme needs to consider, the more detailed the list of results it generates will be, with individual HTML pages being listed if appropriate. This strategy of performing progressively deeper searches can be observed if you ask for information on something a little more obscure, like:

findme findme

which makes findme search for information on itself, or even something very obscure, like:

findme HTX_PATH

which will (probably) only be found in the body of the text of this document.

#### 8.2 Controlling the Depth of Search

The findme command allows you to find information at the level of detail you want by searching any of four categories of information associated with hypertext documents:

(1) The document name (-n switch)
This is given by the name of the directory that contains the document (see §2.1). For example, the name of the document you are reading now is “sun188” because its hypertext version resides in a directory called “sun188.htx”. The name of a document may indicate what category of information it contains and is very quick and easy to search.
(2) The document title (-t switch)
This is extracted from the “top” HTML page of a document (see §3.3) and consists of the text that appears between the `<TITLE>` and `</TITLE>` tags in the HTML header section for that page. The title of a document is a good place to search for important topics and this can be done quite quickly.
(3) The document’s page headings (-h switch)
These are extracted in the same way as the document title (above), but from all the other HTML pages in the document, excluding the “top” page. If you have converted your document from a format such as LaTeX (see §3.5 and SUN/199), then these will be the section headings that appear in the printed form of the document. This is normally a fruitful place to search for slightly more specialised topics and can be done without a serious time penalty because HTX caches this information in a document’s index file (see §4.1).
(4) The lines of text in the document (-l switch)
These consist of the contents of all the HTML files in the document (including all their HTML tags, URLs, etc). This is the most thorough place to search for information, but doing so can take quite a while if the documentation set is large.
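
As an illustration of how a title or page heading can be pulled out of an HTML page, here is a minimal Python sketch. It is not the code HTX itself uses; the function name `extract_title` and the sample page are invented for this example, and it assumes a page contains at most one pair of TITLE tags.

```python
import re

# Match the text between <TITLE> and </TITLE>, ignoring the case of the
# tags; DOTALL lets a title wrap over more than one line.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text between the TITLE tags, or None if there are none."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    # Collapse runs of whitespace so a wrapped title becomes a single line.
    return " ".join(match.group(1).split())


# A made-up page, used only to exercise the function.
page = (
    "<HTML><HEAD>\n"
    "<TITLE>An Example\n   Document Title</TITLE>\n"
    "</HEAD><BODY>Some text.</BODY></HTML>"
)
title = extract_title(page)
print(title)
```

Applying the same function to every page other than the “top” page would give the list of page headings described in item (3).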

If none of the switches shown above is used when findme is invoked, its default action is to:

(1) Search the document titles
(2) If that fails to find a match, search the page headings
(3) If that fails to find a match, search the lines of text

However, if one or more of the -n, -t, -h or -l switches is used, then only the specified categories of information will be searched, and this will be done in a single pass through all the documents. For instance:

findme -n sun

will cause only the document names to be searched (for the string “sun”), while:

findme -t sun

would search only the document titles, and:

findme -t -h sun

would search both the titles and page headings in a single pass.
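
The difference between the default strategy and an explicit choice of categories can be sketched in Python as follows. This is only an illustration under stated assumptions, not findme's actual implementation: the `search` function, the in-memory `docs` dictionary and its sample titles, headings and lines are all hypothetical, and matching is a simple case-insensitive substring test.

```python
def texts_for(name, doc, category):
    """Return the strings to be searched for one category of one document."""
    if category == "name":
        return [name]
    if category == "title":
        return [doc["title"]]
    return doc[category]  # "headings" or "lines": already lists of strings


def search(docs, keyword, categories=None):
    """Return (document, category, text) tuples for every match found."""
    key = keyword.lower()

    def one_pass(wanted):
        # A single pass through all documents, checking only the wanted
        # categories of information in each one.
        found = []
        for name, doc in docs.items():
            for category in wanted:
                for text in texts_for(name, doc, category):
                    if key in text.lower():
                        found.append((name, category, text))
        return found

    if categories:
        # Explicit categories: search exactly those, in one pass.
        return one_pass(categories)

    # Default: try progressively deeper searches and stop at the first
    # level that produces a match.
    for category in ("title", "headings", "lines"):
        found = one_pass([category])
        if found:
            return found
    return []


# Made-up documentation set, for illustration only.
docs = {
    "sun188": {
        "title": "HTX hypertext utilities",
        "headings": ["Searching for information", "The findme command"],
        "lines": ["Set HTX_PATH to list your document directories."],
    },
    "sun199": {
        "title": "Converting LaTeX to hypertext",
        "headings": ["Introduction"],
        "lines": ["Documents can be converted automatically."],
    },
}

print(search(docs, "HTX"))                  # found at the title level
print(search(docs, "findme"))               # falls through to page headings
print(search(docs, "HTX_PATH"))             # falls through to lines of text
print(search(docs, "sun", ["name"]))        # document names only
```

Note that in this sketch the document name is never searched by default; it is only considered when asked for explicitly, which mirrors the default action described above.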

#### 8.3 Searching Specific Documents

By default, the findme command will search your entire documentation set, consisting of all the documents found on the HTX_PATH search path (see §2.3) [5].

However, you can restrict the search to specific documents by listing them after the keyword you are searching for, thus:

findme targets sun188

would find information about “targets” in this document. This is often a useful way of finding reference information once you are reasonably familiar with a document’s contents. Restricting the search to a specific document will also make it far more rapid.

You may specify as many documents to search as you want. If you do not give explicit directory information, the HTX_PATH search path will be used to locate them.
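
To illustrate how a document name might be resolved, here is a minimal Python sketch. It assumes, purely for the purpose of the example, that HTX_PATH behaves like a conventional search path (a list of directories joined by the platform's path separator, searched in order, first match wins) and that each document lives in a directory called “name.htx”. The function `find_document` and the directory names are hypothetical.

```python
import os
import tempfile


def find_document(name, htx_path):
    """Return the directory holding document `name`, or None if not found."""
    # Explicit directory information bypasses the search path.
    if os.path.dirname(name):
        candidate = name if name.endswith(".htx") else name + ".htx"
        return candidate if os.path.isdir(candidate) else None

    # Otherwise try each directory on the search path in turn.
    for directory in htx_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, name + ".htx")
        if os.path.isdir(candidate):
            return candidate
    return None


# Demonstration using a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, "local")
    second = os.path.join(root, "system")
    os.makedirs(first)
    os.makedirs(os.path.join(second, "sun188.htx"))

    path = os.pathsep.join([first, second])
    found = find_document("sun188", path)
    missing = find_document("sun999", path)
    print(found)
    print(missing)
```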

#### 8.4 Other Search Options Available

The findme command has a number of options that allow you to fine-tune the search that it performs. These include:

• Performing case-sensitive searches (by default, keyword matching is case insensitive)
• Searching for whole words (by default the keyword you give will match any string, even if it is only part of a word)
• Matching patterns in text by including regular expressions in the keyword string
• Abbreviating the output list by suppressing information about individual HTML pages and only displaying document names and titles
• Sorting the output list into order according to the significance or number of matches found in each document
• Including information about the number of matches found

See the findme command description in §A for full details.
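
The first three matching options can be pictured with the following Python sketch. It is not a description of how findme is implemented, and the parameter names (`case_sensitive`, `whole_word`, `regex`) are invented for this example and do not correspond to actual findme switches; only the defaults (case-insensitive, partial-word matching) are taken from the list above.

```python
import re


def build_matcher(keyword, case_sensitive=False, whole_word=False, regex=False):
    """Return a function that tests whether a line of text matches."""
    # Treat the keyword literally unless regular expressions were requested.
    pattern = keyword if regex else re.escape(keyword)
    if whole_word:
        # Require a word boundary on both sides of the keyword.
        pattern = r"\b(?:" + pattern + r")\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)
    return lambda text: compiled.search(text) is not None


default_match = build_matcher("sun")
exact_case = build_matcher("sun", case_sensitive=True)
whole = build_matcher("sun", whole_word=True)
pattern = build_matcher("sun[0-9]+", regex=True)

print(default_match("SUN/188 describes HTX"))  # case ignored by default
print(exact_case("SUN/188 describes HTX"))     # upper case no longer matches
print(whole("see sun188 for details"))         # "sun" is only part of a word
print(pattern("see sun188 for details"))       # regular expression match
```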

[5] It will also include the names and titles of documents found in catalogue files (see §10).