Key fields and data fields
When a new data dictionary is created, it is necessary to specify which fields are to be the key fields for the new file.
It is important that the construction of a file’s key be properly considered so as to maximise the efficiency and speed of access to the data in the file.
The key for a record is composed of the values it contains in each key field, concatenated in the order in which they are listed in the data dictionary. Records are stored in ascending order of this concatenated key. The order in which key fields are defined is therefore crucial. The first key field is the most significant, as records are sorted primarily by the value they contain in this field. Records with identical values for the first key are sorted by the value of the second key, and so on.
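The ordering rule above can be illustrated with a short sketch. It is not the product's storage code: the field names, the widths, and the space-padded ASCII encoding are all assumptions made for the example.

```python
# Hypothetical key layout: (field name, width in bytes), in dictionary order.
KEY_FIELDS = [("region", 2), ("customer", 6)]


def build_key(record):
    """Concatenate the key field values in the order they are defined."""
    parts = []
    for name, width in KEY_FIELDS:
        # Assumption: values are ASCII text, space-padded to a fixed width.
        parts.append(record[name].encode("ascii").ljust(width))
    return b"".join(parts)


records = [
    {"region": "UK", "customer": "C00002"},
    {"region": "FR", "customer": "C00009"},
    {"region": "UK", "customer": "C00001"},
]

# Python compares bytes objects byte by byte, which mimics ascending key order.
ordered = sorted(records, key=build_key)
keys = [build_key(r) for r in ordered]
print(keys)  # [b'FRC00009', b'UKC00001', b'UKC00002']
```

The FR record comes first because the first key field dominates. The two UK records are then ordered by the second key field.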
It is not possible to store two records with identical keys, so the key must be chosen so that its value is unique for each record entered. A file’s key may consist of any number of key fields, ranging from a single field up to every field in the file.
The minimum total key size is 1 byte and the maximum is 195 bytes.
The minimum total record size is 4 bytes; the maximum safe record size is 32767 bytes. The effects of a larger record size are undefined and platform-dependent.
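A proposed layout can be checked against these limits before the data dictionary is created. The helper below is a hypothetical sketch, not part of the product. It assumes that the record size is the sum of all key and data field widths.

```python
# Limits taken from the text above.
MIN_KEY_BYTES, MAX_KEY_BYTES = 1, 195
MIN_RECORD_BYTES, MAX_SAFE_RECORD_BYTES = 4, 32767


def check_sizes(key_widths, data_widths):
    """Return a list of problems for a proposed layout (empty if none)."""
    key_size = sum(key_widths)
    # Assumption: the record size counts key fields as well as data fields.
    record_size = key_size + sum(data_widths)
    problems = []
    if not MIN_KEY_BYTES <= key_size <= MAX_KEY_BYTES:
        problems.append(f"key is {key_size} bytes; allowed range is 1 to 195")
    if record_size < MIN_RECORD_BYTES:
        problems.append(f"record is {record_size} bytes; minimum is 4")
    elif record_size > MAX_SAFE_RECORD_BYTES:
        problems.append(f"record is {record_size} bytes; safe maximum is 32767")
    return problems


ok = check_sizes(key_widths=[2, 6], data_widths=[30, 8])
too_long = check_sizes(key_widths=[100, 100], data_widths=[10])
print(ok)        # []
print(too_long)  # one problem: the key is 200 bytes
```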
Keys are sorted by direct byte comparison. It is therefore recommended that, unless sequential access order is unimportant, floating point types (r4, r8 and n8.d) are not used for key fields; nor should numeric fields be used if they may contain negative values. Subscripted (array) fields may not be used as key fields. See Field types and sizes.
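The following sketch shows why byte comparison and numeric order can disagree. It assumes big-endian two's-complement integers and big-endian IEEE 754 doubles. The actual storage layout is not stated here and may differ by platform, so the code illustrates the general problem and does not describe the file format.

```python
import struct


def byte_order(values, fmt):
    """Sort values by their packed bytes, as a byte-comparing index would."""
    return sorted(values, key=lambda v: struct.pack(fmt, v))


ints = [-2, -1, 1, 2]
floats = [-2.0, -1.0, 1.0, 2.0]

int_order = byte_order(ints, ">i")      # 4-byte signed integer, big-endian
float_order = byte_order(floats, ">d")  # 8-byte IEEE 754 double, big-endian

print(int_order)    # [1, 2, -2, -1]
print(float_order)  # [1.0, 2.0, -1.0, -2.0]
```

In both cases the sign bit makes negative values compare greater than positive ones. For floating point values, the negatives also appear in reverse numeric order.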
The remaining fields in the file are known as data fields. These do not form part of the key, and access to records using these fields as search data is not possible unless secondary indexes are created.
All the key fields must appear at the beginning of the data dictionary, before the data fields.
In summary, the following points should be considered when constructing a key:
1. The first key field is of the greatest importance, determining the main order in which the file will be sorted. The field chosen should not normally be one in which it is expected that records may contain a null value.
2. If each record is absolutely guaranteed to have a unique value in the first key field, no further key fields are necessary.
3. The second key field determines the order in which records with an identical value for the first key field are sorted. Searching by this key field will be slower than by the first key field.
4. Again, if each record can now be guaranteed a unique value for the concatenated key fields, no further key fields are necessary. Otherwise select a third key field, and so on.
5. Using unnecessary fields in the key may hamper efficiency and waste disk space. However, if every field is included in the key, disk space can be saved, since there is no need to store a data file in addition to the index file.
6. Although the main file index can only be searched via key fields, the facility to use a field as a search field is not a good reason to include it in the key. Searches by any key other than the first key are inefficient. Instead, create a secondary index keyed by the field required.
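One way to picture why a search on the first key field is cheaper than a search on a later one is sketched below. The sorted list stands in for the main index. The key layout and function names are hypothetical, and a real index may use a different search strategy.

```python
from bisect import bisect_left

# Hypothetical concatenated keys: 2-byte region followed by 6-byte customer.
index = sorted([b"FRC00009", b"UKC00001", b"UKC00002", b"USC00001"])


def find_by_first_key(prefix):
    """Binary-search to the first matching key, then read forward."""
    start = bisect_left(index, prefix)
    matches = []
    for key in index[start:]:
        if not key.startswith(prefix):
            break  # sorted order guarantees no later key can match
        matches.append(key)
    return matches


def find_by_second_key(value, offset=2):
    """Matches are scattered through the index, so every key is examined."""
    return [key for key in index if key[offset:offset + len(value)] == value]


print(find_by_first_key(b"UK"))       # [b'UKC00001', b'UKC00002']
print(find_by_second_key(b"C00001"))  # [b'UKC00001', b'USC00001']
```

Matches on the first key field are adjacent in the sorted index, so the search can stop early. Matches on the second key field are not adjacent, which is the situation a secondary index keyed on that field is meant to avoid.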