13 Creating new structures
The methods presented in this section are intended to ensure that the rules given in Section 2 are
adhered to. In particular, they ensure that standard structures are used wherever possible, and
they encourage the building of new structures (when needed) by gathering together existing
structures, so ensuring the maximum commonality.
New structures must not be constructed simply by adding new components into standard ones
(which is illegal), but instead by adding a new layer. The HDS hierarchy then provides a natural
barrier between separate structures, and ensures that further components can later be added at any
level without risking naming conflicts.
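As a minimal sketch of the difference between the two approaches, the Python fragment below models a structure as a TYPE plus named components. It does not use the HDS library. The Structure class, the STD_IMAGE and MY_SPECTRUM TYPEs and all component names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Structure:
    """Toy model of one HDS structure: a TYPE and its named components."""
    type: str
    components: Dict[str, Any] = field(default_factory=dict)


def make_standard_image() -> Structure:
    # Stand-in for an approved standard structure (TYPE and names invented).
    return Structure("STD_IMAGE", {"DATA": [1.0, 2.0, 3.0], "TITLE": "example"})


def wrap_in_new_layer(image: Structure, slit_width: float) -> Structure:
    # Legal route: the standard structure becomes one component of a new,
    # separately typed structure; the extra item lives one level up.
    return Structure("MY_SPECTRUM", {"IMAGE": image, "SLIT_WIDTH": slit_width})


image = make_standard_image()
spectrum = wrap_in_new_layer(image, 1.5)

# The illegal alternative would be: image.components["SLIT_WIDTH"] = 1.5
print(sorted(image.components))     # the standard structure is left untouched
print(sorted(spectrum.components))  # the new item sits beside it, not inside it
```

Because the extra item lives in the outer layer, software written for the standard structure can still be handed the inner component unchanged.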
13.1 Definitions
In the description of the design process the total hierarchy of structures is called a dataset to
distinguish it from a single component structure. It is equivalent to the contents of an HDS container
file. The term structure refers to a set of related data items; it corresponds to a single level of a dataset.
Often a dataset will consist simply of a single structure.
13.2 Algorithm
A summary of the algorithm to be used when creating a new dataset is given below. The numbers
following some steps refer to the expanded notes in the next subsection.
Define what the software is going to do and not going to do (1)
Identify, in concept, the datasets required
Determine their interrelations, perhaps via a tree diagram (2)
Start at the most deeply nested level of the hierarchy
| Identify the data components (3)
| if an existing standard structure can be used then
| | Place remaining associated items in a new structure or in an extension
| Assign a unique HDS TYPE to the new structure (4)
| Assign a NAME to each component (5)
| Determine the rules and restrictions governing the way the data will be stored in the structure
| Assign a TYPE to each component (6)
| Identify the sorts of operation to be performed on the structure and ensure they are meaningfully defined (7)
| if the processing of a component cannot be defined in some cases then
| | remove the component from the structure
| Implement and document the software needed to process the new structure (8)
| if the new structure is to become a standard type then
| | Submit it and its software to Starlink for approval (9)
13.3 Explanatory Notes
(1)
- It is important to define the scope of the software initially and not to let it expand
arbitrarily during implementation. If the software design does subsequently need to
be revised, then the dataset may also need re-designing.
- During the following stages of the design process, the original outline of the dataset
may prove to be incorrect or inadequate, especially in more-complex hierarchies. In
such cases you have to start again.
(2)
- The interrelations between structures specify how they should be organised
hierarchically. Drawing a dendrogram should help.
- The design process is bottom-up. Multiple-level datasets are built up from the
lowest (most deeply nested) level of the hierarchy. Design each and every structure
at the current HDS level before going to the next higher level.
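The bottom-up ordering described in this note can be sketched as follows. This is only an illustration, assuming the dendrogram is held as a mapping from each structure to its substructures. The function name design_order and the structure names are hypothetical.

```python
from typing import Dict, List


def design_order(tree: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Return the structures grouped by level, most deeply nested level first."""
    levels = []
    current = [root]
    while current:
        levels.append(current)
        # Collect every substructure one level further down.
        current = [child for name in current for child in tree.get(name, [])]
    # Reverse so that the deepest level is designed first.
    return levels[::-1]


# Hypothetical dendrogram: OBSERVATION holds IMAGE and HISTORY; IMAGE holds AXIS.
dendrogram = {"OBSERVATION": ["IMAGE", "HISTORY"], "IMAGE": ["AXIS"]}

for level in design_order(dendrogram, "OBSERVATION"):
    print(level)
```

Each inner list holds the structures that share one level, so every structure at a level is designed before moving up to the next.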
(3)
- Check to see whether any of the required data components are standard structures
or are components of standard structures. If suitable standard structures already
exist, then use them. If not, and you have to design new structures, try to make them
general so that they might later become standard structures themselves.
- The original dendrogram design of the dataset may grow some extra branches if
standard structures can be used, because there may be a net increase in the number
of structures.
- Certain standard structures include provision for extension structures, and may thus
be used even if there is no appropriate place in the standard structure itself for some
of the items to be stored.
- Using existing standard structures gives the obvious advantage of being able to
use existing software. (Starlink will maintain a list of standard structures and their
conventions.) The standards and conventions associated with standard structures
must be observed by all new software which uses them.
(4)
- The TYPE should reflect the sort of data to be held in the structure, but must not
conflict with the TYPE of any other standard data structure. Starlink management
should be consulted when defining TYPEs.
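The requirement that a new TYPE must not clash with an existing one could be checked mechanically against whatever list of standard TYPEs is maintained. The sketch below is an assumption-laden illustration: the TypeRegistry class and the TYPE names are hypothetical, and folding names to upper case before comparison is an assumption made here, not a rule stated in this section.

```python
from typing import Iterable


class TypeRegistry:
    """Toy registry of structure TYPEs that refuses duplicates."""

    def __init__(self, standard_types: Iterable[str]) -> None:
        # Assumption: TYPEs are compared without regard to case.
        self._types = {t.upper() for t in standard_types}

    def register(self, new_type: str) -> str:
        key = new_type.upper()
        if key in self._types:
            raise ValueError(f"TYPE {key} is already in use")
        self._types.add(key)
        return key


# Hypothetical list of existing standard TYPEs.
registry = TypeRegistry(["STD_IMAGE", "STD_AXIS"])
print(registry.register("my_spectrum"))

try:
    registry.register("std_image")
except ValueError as err:
    print(err)
```

A check of this kind complements, rather than replaces, consulting Starlink management when defining TYPEs.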
(5)
- The names should preferably identify the rôle which each component plays.
Although the name will have no global significance outside the structure, it may still
be sensible to have a naming convention for certain common types of structure to
avoid confusion.
- There may be any number of rules and conventions governing use of the structure.
For instance, some components may be optional, and the presence of some
components may depend on others (as with the VARIANT concept). These rules
must be explicitly stated and obeyed by all software which uses the structure. If
this software is likely to be written by many different people, then the rules should
obviously be kept simple.
(6)
- Only primitives or structures of a TYPE already defined may be used.
- Since only defined TYPEs (which include primitives) may be used, any
substructures must already have been defined along with the rules for processing
them. It might also occasionally be appropriate to define structures “recursively” by
including components of the same TYPE as that being defined.
- Often, standard subroutines will already exist for processing the data components
from which the new structure is being built, and these can therefore be used to
process the components of the new structure.
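The rule that components may only be primitives or structures of an already defined TYPE, with recursion permitted, might be checked as in the sketch below. The function undefined_types and the structure TYPEs are hypothetical, and the set of primitive type names is an illustrative subset rather than a complete list.

```python
from typing import Dict, Iterable, List

# Illustrative subset of primitive type names (not a complete list).
PRIMITIVES = {"_INTEGER", "_REAL", "_DOUBLE", "_LOGICAL", "_CHAR"}


def undefined_types(new_type: str,
                    component_types: Dict[str, str],
                    defined: Iterable[str]) -> List[str]:
    """Return the component TYPEs that are neither primitive nor defined.

    The TYPE being defined is itself allowed, which permits the
    "recursive" definitions mentioned in the note above.
    """
    allowed = PRIMITIVES | set(defined) | {new_type}
    return sorted(t for t in set(component_types.values()) if t not in allowed)


# Hypothetical design: a TREE_NODE may contain another TREE_NODE.
design = {"VALUE": "_REAL", "CHILD": "TREE_NODE", "AXIS": "STD_AXIS"}
print(undefined_types("TREE_NODE", design, defined=["STD_AXIS"]))

# A component of an unknown TYPE is reported so it can be defined first.
design["EXTRA"] = "MYSTERY"
print(undefined_types("TREE_NODE", design, defined=["STD_AXIS"]))
```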
(7)
- Ensure that all the operations are meaningfully defined in terms of what will happen
to each component when the structure is processed. Consider all valid combinations
of structure components.
- Many packages “grow” indefinitely, so it may not be possible to enumerate all possible
operations. However, if the initial (global) stage of the design was obeyed, it should
be possible to identify them as broad classes, such as [image display, arithmetic,
spatial smoothing] or [create history, append history, search for history record].
- It may be necessary to reject some components if you cannot meaningfully define
what will happen to them in all circumstances.
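One way to apply this test is to tabulate, for every operation, which components have a defined behaviour, and then reject any component with a gap. The following is a sketch only: the function components_to_reject and the component and operation names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def components_to_reject(components: Iterable[str],
                         operations: Iterable[str],
                         defined: Set[Tuple[str, str]]) -> List[str]:
    """Return components lacking a defined behaviour for some operation.

    `defined` holds (operation, component) pairs for which the designer
    has stated what will happen to the component.
    """
    operations = list(operations)
    return [c for c in components
            if any((op, c) not in defined for op in operations)]


# Hypothetical design table: COMMENTS has no defined behaviour under "smooth".
defined_pairs = {
    ("add", "DATA"), ("smooth", "DATA"),
    ("add", "QUALITY"), ("smooth", "QUALITY"),
    ("add", "COMMENTS"),
}
print(components_to_reject(["DATA", "QUALITY", "COMMENTS"],
                           ["add", "smooth"], defined_pairs))
```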
(8)
- The software should obey all the conventions appropriate to the new structure
(and any other structures it uses). When accessing a structure, software should first
check its TYPE, since this specifies how the structure contents are to be interpreted. Any
component not covered by the structure definition should be completely ignored.
- From time to time, ignorance and independence of spirit will no doubt lead
implementors and users into inserting extra components into a structure, but these
are illegal and will be ignored. This is not a valid way of defining a new structure.
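The access convention in this note (check the TYPE first, then ignore anything the definition does not cover) can be illustrated with a small sketch. It does not call the HDS library. The read_structure function, the DEFINITIONS table and all names in it are hypothetical.

```python
from typing import Any, Dict, Set

# Hypothetical structure definitions: TYPE -> set of component names covered.
DEFINITIONS: Dict[str, Set[str]] = {
    "MY_SPECTRUM": {"IMAGE", "SLIT_WIDTH"},
}


def read_structure(type_: str,
                   components: Dict[str, Any],
                   definitions: Dict[str, Set[str]]) -> Dict[str, Any]:
    """Return only the components that the structure definition covers."""
    # Check the TYPE first: it says how the contents are to be interpreted.
    if type_ not in definitions:
        raise TypeError(f"no definition known for TYPE {type_}")
    known = definitions[type_]
    # Components outside the definition are silently ignored.
    return {name: value for name, value in components.items() if name in known}


found = {"IMAGE": [1.0, 2.0], "SLIT_WIDTH": 1.5, "MY_SCRATCH": "illegal extra"}
print(read_structure("MY_SPECTRUM", found, DEFINITIONS))
```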
(9)
- If the new structure is to become a standard type, submit the design (providing
details of the NAME, TYPE, meaning and processing rules for each data object) to
the Starlink Head of Applications for approval and registration. If appropriate, a
subroutine interface should be written for handling the structure; this would ensure
that the conventions governing its use are enforced. Any associated software should
also be submitted to the Head of Applications.
- Once a new standard structure has been accepted, anyone is free to use the structure
and to incorporate it in any new structures he or she may create. Once this point
is reached, it may be difficult to change the structure definition without upsetting
somebody; changes in the form of additions to the structure are the least likely to
cause trouble.
13.4 Extensions
Once a “core” of fairly simple standard structures exists, the process of designing more specialised
structures will be devolved to the various SIGs, who can use the simpler structures as building blocks.
This avoids the problems of the ‘all or nothing’ monolithic approach. When a more complex (and
therefore highly specialised) structure is built out of simpler ones, software will then automatically
exist for processing all its substructures in a more general way. This should give a high degree of
flexibility.
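The idea that software for the simpler building blocks remains usable inside a specialised structure can be sketched as a dispatch on TYPE. This is an illustration under stated assumptions: structures are modelled as plain dictionaries with "TYPE" and "components" keys, and the handler table, the function names and the TYPE names are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple


def summarise_axis(components: Dict[str, Any]) -> str:
    # Stand-in for existing software that understands the simpler structure.
    return f"axis with {len(components['VALUES'])} points"


# Hypothetical table of software already written for standard TYPEs.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "STD_AXIS": summarise_axis,
}


def process_known_parts(structure: Dict[str, Any],
                        handlers: Dict[str, Callable[[Dict[str, Any]], str]]
                        ) -> List[Tuple[str, str]]:
    """Apply existing handlers to every substructure whose TYPE they know."""
    results = []
    for name, value in structure["components"].items():
        if isinstance(value, dict) and "TYPE" in value:
            handler = handlers.get(value["TYPE"])
            if handler is not None:
                results.append((name, handler(value["components"])))
            else:
                # Unknown TYPE: look inside it for known building blocks.
                results.extend(process_known_parts(value, handlers))
    return results


# A specialised structure built from a simpler one plus a primitive.
cube = {
    "TYPE": "MY_CUBE",
    "components": {
        "XAXIS": {"TYPE": "STD_AXIS", "components": {"VALUES": [1, 2, 3]}},
        "NOTE": "primitive components are skipped here",
    },
}
print(process_known_parts(cube, HANDLERS))
```

No handler for MY_CUBE itself is needed for the axis software to be reused, which is the flexibility described above.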
There will be independent extensions, each having a uniquely defined TYPE together with rules for its
interpretation. Though many extensions will be independent and self-contained, some will form
hierarchies. The design of each extension should be kept straightforward and appropriate to the kind
of software which will use it. Simple and specialised, simple and general, and complex and specialised
are all acceptable, but implementors should beware of attempting to design extensions which are both
complex and general. By introducing a strict criterion to decide whether a given component is
acceptable (“do we know how to process it?”), the problem is broken into manageable pieces whose
complexity does not exceed our software-writing abilities.