data.management Introductory/Access: Modifiers

Command data.management Introductory/Access: Modifiers
Applicable release versions:
Category Access: Modifiers (8)
Description overview of the design philosophy behind Pick

Pick is specifically designed for data management. All data in Pick is stored in items within files. Items within these files are divided into sets called attributes which contain one, multiple, or no values.

Mapped onto the traditional tabulating model (punch cards, batch processing), items would be called records, attributes would be called fields, and item-ids would be called record keys. Items, attributes, and values can be of variable length. Items and attributes can contain multiple values or sets of values, which are separated by specific delimiters. Items are delimited by segment marks (hex FF). Attributes are delimited by attribute marks (hex FE). Multiple values are delimited by value marks (hex FD).
Because delimiters are used, a value within an attribute need occupy no more than the actual physical length of its data. This gives Pick a distinct advantage over traditional systems in that items are of variable rather than fixed length. Not only does efficiency improve, since smaller items reduce disk access time, but the amount of disk space required to support the data base is also substantially reduced.
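
The delimiter scheme described above can be illustrated with a short Python sketch. It only shows how the three marks partition a byte string. The function name and the sample data are hypothetical, and the sketch ignores the item-id and any other header information the system stores with an item.

```python
# Delimiter bytes as given in the text above.
SEGMENT_MARK = b"\xff"    # ends an item
ATTRIBUTE_MARK = b"\xfe"  # separates attributes
VALUE_MARK = b"\xfd"      # separates multiple values


def parse_item(raw: bytes) -> list[list[bytes]]:
    """Split one item body into attributes, each a list of values."""
    body = raw.split(SEGMENT_MARK, 1)[0]  # ignore anything after the segment mark
    return [attr.split(VALUE_MARK) for attr in body.split(ATTRIBUTE_MARK)]


# Hypothetical item: a name, two phone numbers, and one empty attribute.
raw = b"SMITH\xfe555-1212\xfd555-3434\xfe\xff"
item = parse_item(raw)
```

Each attribute takes only as many bytes as its data, and the empty third attribute costs nothing beyond its delimiter.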

Traditional
EDP Terminology           Pick Terminology

Disk Physical Level
  Partition               Partition
  Cluster                 Frame
  Block                   Block
File System
  Directory               Dictionary
  File                    File
  Record (Primary key)    Item (Item-id)
  Field (Alternate key)   Attribute (Index-key)
  Values                  Values

Items that are stored in a file may be accessed in several ways, including:
Directly, using the item-id as the key,
Sequentially in the hashing sequence,
Sequentially in a sorted sequence,
By index key.

The direct file access technique, using the item-id to locate the item within the file, is an efficient method of locating data and lends itself to the on-line nature of the Pick System. The system overhead required to access an item using direct file access is independent of the actual file size. Items may also be indexed by attributes and accessed by index key.

Defining items in a Pick dictionary allow both simple and complex conversions and correlatives to be specified using the a-type processing code, among others. The instructions for forming index keys are specified using these same a-type processing codes. As many indexes as necessary may be created for a file. The root fid (frame id) and a processing code for each index are stored as a single value in attribute 8 (correlative) of the file-defining item.

Indexes are created by the TCL verb, create-index, and stored on the disk in ASCII sequence within a B-tree (balanced-tree) structure. Whenever the file is changed, the indexes, if affected, are also changed automatically by the system.
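
The idea of an index that the system keeps in step with the file can be sketched in Python. This is an illustration only: a sorted list with binary search stands in for the B-tree, the index key is assumed to be the raw value of one attribute rather than the result of a processing code, and the class and method names are hypothetical.

```python
import bisect


class IndexedFile:
    """Toy file with one index kept in ASCII order of its keys."""

    def __init__(self, attr_no: int):
        self.attr_no = attr_no                   # attribute that supplies the index key
        self.items: dict[str, list[str]] = {}    # item-id -> attributes
        self.index: list[tuple[str, str]] = []   # sorted (key, item-id) pairs

    def _key(self, attrs: list[str]) -> str:
        return attrs[self.attr_no - 1]           # attributes are numbered from 1

    def write(self, item_id: str, attrs: list[str]) -> None:
        self.delete(item_id)                     # drop any stale index entry first
        self.items[item_id] = attrs
        bisect.insort(self.index, (self._key(attrs), item_id))

    def delete(self, item_id: str) -> None:
        attrs = self.items.pop(item_id, None)
        if attrs is not None:
            self.index.remove((self._key(attrs), item_id))

    def lookup(self, key: str) -> list[str]:
        """Return the item-ids whose index key equals key, in index order."""
        start = bisect.bisect_left(self.index, (key, ""))
        found = []
        for k, item_id in self.index[start:]:
            if k != key:
                break
            found.append(item_id)
        return found
```

Every write or delete updates the index as a side effect, so a caller never maintains it explicitly.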

For example, if an item is deleted, all index entries that point to that item are also deleted and changed to point to the next item. Indexes can be used in Pick/BASIC programs through the key and root statements. Indexes are also used by verbs such as list, sselect, and sort, if there is an index corresponding to the sort keys.

Files can be updated through the bridge correlative processing codes, by Pick/BASIC programs, and through the Update processor (UP).

Files are stored on disk in blocks called frames. The frames are uniform in size (1024 or 2048 bytes). The primary file space physically consists of a contiguous set of disk frames. The beginning frame is the base frame and the number of contiguous frames or groups (including the base frame) is the modulo of the file. The modulo is defined at the time the file is created or resized. The system automatically adds or removes frames to or from groups as the amount of data (number and/or size of items) within the group expands or contracts. The frames added automatically by the system are added to what is called the secondary file space. This means you do not need to redimension a file as the amount of data in that file increases or decreases.

Consider this example file, which was initially allocated seven contiguous frames (modulo 7) in Primary Space. As items were added to the file, they were allocated to groups through the hashing algorithm. Some groups required additional frames. These frames were allocated from Secondary File Space. Frames within a group are explicitly linked together.

           Primary        Secondary
           File Space     File Space
           ----------     -----------------------------------

Group 1    -------        --------
           |     |--------|      |
           -------        --------
Group 2    -------
           |     |
           -------
Group 3    -------        --------       -------       -------
           |     |--------|      |-------|     |-------|     |
           -------        --------       -------       -------
Group 4    -------
           |     |
           -------
Group 5    -------        --------
           |     |--------|      |
           -------        --------
Group 6    -------
           |     |
           -------
Group 7    -------        --------
           |     |--------|      |
           -------        --------

Items are distributed to the various groups within a file based on a hashing algorithm that calculates the frame identification number (fid) of the first frame in the group. Items are distributed quasi-randomly between groups and sequentially within a group. This quasi-randomness is achieved by using the item-id directly in the hashing algorithm. Because of the nature of the mathematical relationship defining the hashing algorithm, modulo numbers that are multiples of 2 or 5 should not be assigned.
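
The mapping from item-id to group can be sketched in Python. The hash below is a stand-in chosen for illustration and is not the actual Pick algorithm, which this entry does not specify. The function name, the base fid of 5000, and the sample item-ids are all hypothetical.

```python
from collections import Counter


def group_fid(item_id: str, base_fid: int, modulo: int) -> int:
    """Return the fid of the first frame of the group an item-id hashes to."""
    h = 0
    for byte in item_id.encode("ascii"):
        h = h * 10 + byte          # stand-in hash, NOT the actual Pick algorithm
    return base_fid + h % modulo   # groups occupy fids base_fid .. base_fid + modulo - 1


# Spread 100 hypothetical item-ids over a file with base fid 5000 and modulo 7.
ids = [str(n) for n in range(1000, 1100)]
spread = Counter(group_fid(i, 5000, 7) for i in ids)
```

With this stand-in hash, a modulo of 10 would make the group depend on the last character of the item-id alone, which is the kind of clustering that a poorly chosen modulo can cause.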

For data transfer to and from disk to occur at optimum efficiency, set the modulo of the file to the nearest prime number above the value needed to keep the average number of frames per group below one, based on the amount of data storage anticipated. This trades 50 to 75 percent utilization for single-disk-access speed. It also makes the system more efficient in that the distribution of data between groups reduces the probability of two or more users accessing the same group at the same time. The number of frames required is estimated as:

Number-of-Items X Average-size-of-Items (Bytes) / Frame-Size (Bytes)

The result should be rounded up to the next larger prime number.
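
The calculation above can be written as a short Python sketch. The function names and sample figures are hypothetical. The sketch assumes that a result which is already prime is kept as is, and that the frame size used is the number of data bytes available in a frame.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division, adequate for modulo-sized numbers."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def suggested_modulo(number_of_items: int, average_item_size: int, frame_size: int) -> int:
    """Frames needed for the data, rounded up to a prime."""
    frames = -(-number_of_items * average_item_size // frame_size)  # ceiling division
    candidate = max(frames, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


# Hypothetical file: 10,000 items averaging 100 bytes, 1000 data bytes per frame.
modulo = suggested_modulo(10_000, 100, 1000)
```

For these figures the raw result is 1000 frames, and the next prime is 1009.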

The frame size for a particular version of Advanced Pick can be determined by executing the "what" TCL command. The number of available bytes within a frame is listed on the first line of the report under "dfsize". The actual frame size is determined by rounding the "dfsize" up to the next power of 2. The difference between "dfsize" and actual frame size is used to hold the frame linkages (forward and backward pointers).
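
The rounding rule can be expressed in a few lines of Python. The function name and the "dfsize" values used here are hypothetical examples, not figures taken from any particular release.

```python
def frame_size_from_dfsize(dfsize: int) -> tuple[int, int]:
    """Return (actual frame size, bytes left over for frame linkages)."""
    frame_size = 1
    while frame_size < dfsize:
        frame_size *= 2          # round up to the next power of 2
    return frame_size, frame_size - dfsize


# Hypothetical "dfsize" values as they might appear in the "what" report.
small = frame_size_from_dfsize(1000)
large = frame_size_from_dfsize(2000)
```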

When more than 50 percent of the groups have more than one frame, or the utilization gets below 50 percent, reallocate the files.
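
This rule of thumb can be sketched as a Python check. The function name is hypothetical, and utilization is assumed here to mean bytes of data divided by bytes allocated across all frames of the file.

```python
def needs_reallocation(frames_per_group: list[int], bytes_used: int, frame_size: int) -> bool:
    """Apply the 50 percent rule of thumb to one file."""
    overflowed = sum(1 for n in frames_per_group if n > 1)   # groups with secondary frames
    allocated = sum(frames_per_group) * frame_size
    utilization = bytes_used / allocated                     # assumed definition
    return overflowed > len(frames_per_group) / 2 or utilization < 0.5


# The seven-group example file shown earlier: groups 1, 3, 5, and 7 have overflowed.
example = needs_reallocation([2, 1, 4, 1, 2, 1, 2], bytes_used=10_000, frame_size=1000)
```

In the example file, four of the seven groups have more than one frame, so the check recommends reallocation.
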
One brute-force method of resizing the file is:

1. Creating a new file with the desired modulo.
2. Copying all items from the old file into the new file.
3. Deleting the old file.
4. Renaming the new file to the old file name.

When resizing in this manner, you must explicitly copy the index definitions, subroutine calls, and other data from the old file's file-defining item to the new file's file-defining item before renaming.

Files may also be reallocated using the system’s save and restore commands. When the save and restore commands are used, the indexes are handled automatically. Prior to saving the system on magnetic media, attribute 13 of the file dictionary file-defining item may be set to the new modulo for that file. When the system is restored, all files will be reallocated according to the new modulo specified in attribute 13. When attribute 13 is not specified, the file is restored exactly as it was saved. The save and restore process allows reallocation of many files at one time.

Note that items within files are saved to tape in group sequence. Because restoring to a new modulo redistributes items within the file space, the number of disk reads is considerably increased during a restore to a new modulo, slowing this process noticeably. F-resize is a program provided with the system to automatically calculate new modulos and mark attribute 13 of the files appropriately using the current statistics in the file-of-files as modified by the last file-save.

Specific TCL verbs (system-level commands) exist to manage files as listed below. Refer to the section "Terminal Control Language" for an overview of the terminal control language. Refer to the separate entries for each verb in the body of the Advanced Pick Reference Manual for more detailed information.

clear-file     create-file     rename-file
copy           delete-file     steal-file

Dictionaries

Dictionary items are used by the Pick System to describe, define, locate, and, in general, operate on data within the files to which they point. Many operations are preprogrammed functions that process the data in associated files at the system level instead of at the program level, enhancing overall system performance. Some of these operations include the definition of relationships between and within files.

Relationships between files are expressed using bridge, index, or translate processing codes. Features such as these processing codes provide the ability to cruise and zoom through the data base. Refer to the section "Update processor" for more information about cruising, zooming, and using UP.

Relationships within files are expressed through the structure controlling attribute (attribute 4) of an attribute-defining item.

Each data file in the system has one dictionary. A dictionary may have several files associated with it. Dictionaries associated with data files contain items such as attribute-defining items, file-defining items (for the associated data files) and synonym-defining items (additional views of attributes after different processing), as well as compiled Pick/BASIC programs.

Attribute 1, the dictionary code attribute (referred to as the d/code) identifies the item type. If the attribute contains an "a", it is an attribute-defining item. If the attribute contains a "d", it is a file-defining item. If the attribute contains a "q", it is a synonym-defining item.
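
The d/code rule can be sketched as a small Python lookup. The function name is hypothetical. The sketch assumes that the type is given by the leading character of attribute 1, since the attribute may carry other options after the code, and it treats upper and lower case alike.

```python
D_CODES = {
    "a": "attribute-defining item",
    "d": "file-defining item",
    "q": "synonym-defining item",
}


def item_type(attribute_1: str) -> str:
    """Classify a dictionary item by the leading character of its d/code."""
    code = attribute_1[:1].lower()   # assumption: the type letter comes first
    return D_CODES.get(code, "other")
```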

There are three types of dictionaries: system, master, and file.

System Dictionary - There is only one system dictionary per system. Items within the system dictionary (mds) point to account master dictionaries. The "mds" file can only be accessed from the dm account.

Master Dictionary - There is one master dictionary (md) for each account. When a new account is created, a standard set of vocabulary items is copied into the new account's md. Items within Master Dictionaries point to file dictionaries.

File Dictionaries - Multiple file dictionaries are distributed among the various accounts. Items within file dictionaries point to data files. File-defining items, synonym-defining items and attribute-defining items are also present in file dictionaries. Pointers to compiled Pick/BASIC programs are only present in file dictionaries.

Dictionaries as Operators

The Pick dictionary is a file consisting of items that contain 18 attributes. Each dictionary item can be considered as a vector operator of 18 elements, some of which contain operations to be performed on the specified attribute in the associated file. One element identifies the attribute within the associated file that is the operand.

File-defining items operate specifically on attribute 0 of the associated file. File-defining items also contain system information specific to the associated file. Attribute-defining items may operate on any other attribute within the associated data file or dictionary.

Functions (or programs) defined at the system level (or by the user) can be assigned to the appropriate attributes of the attribute-defining item allowing an operation to be defined in minutes at the system level. In many cases the generation of complex programs in a high level language can be avoided, and this vector provides a shorthand language for generating programs.

Data can be entered through UP or a Pick/BASIC program which passes data through the specified attribute-defining item within the data file dictionary for modification and stores the data in the specified attribute of the data file. When the data is viewed using UP or retrieved using Access or Pick/BASIC, it passes from the attribute in the data file through the data file dictionary for modification prior to output. Secondary files may be involved if the operation specified in the dictionary item is a translation.

Pick Dictionary Structure:

                       System Dictionary (mds, one per system)
                               |
                    +----------------------+
                    |                      |
            -----------------      -----------------
            Master Dictionary      Master Dictionary   (one per account)
                    |
         +---------------------+
         |                     |
 -----------------     -----------------
 File Dictionary       File Dictionary     (one per file)
         |                     |
 -----------------     -----------------
 Data File             Data File
                               |
                       -----------------
                       Data File

Master Dictionary

There is one master dictionary (md) file for each account. When a new account is created, a standard set of vocabulary items is copied into the new account's master dictionary. The following types of items are contained in master dictionaries:

Attribute-defining items
File-defining items
Synonym-defining items
Macros
Menus
Verbs
Connectives
Cataloged Pick/BASIC program pointers
PROCs

File-defining items (d-pointers) and synonym-defining items (q-pointers) are two types of file pointers found in an md. File-defining items point to files within the current account. Synonym-defining items can point to files either within the current account or within other accounts.

There is generally only one file-defining item within a dictionary. Attributes 2 and 3 of file-defining items in an account md contain the base fid and modulo respectively of the file dictionary to which the item points. If the dictionary references more than one data file, attributes 2 and 3 contain base fid and modulo data respectively for the associated data file. The usage and definition of the attributes in the master dictionary file-defining item are:

Attribute 0 (Item-id)   Name of the file being defined.
Attribute 1             Dictionary code: d and other options.
Attribute 2             Base frame number of the associated file.
Attribute 3             Number of contiguous frames in the primary file space of the file.
Attribute 4             Reserved and unavailable.
Attribute 5             Retrieval locks.
Attribute 6             Update locks.
Attribute 7             Password required to access the md.
Attribute 8             Reserved and unavailable.
Attribute 9             Attribute "type" or justification.
Attribute 10            Number of character spaces to be allocated for displaying the data within the attribute on Access reports.
Attribute 11            Reserved and unavailable.
Attribute 12            Reserved and unavailable.
Attribute 13            Reallocation; used in the save and restore process to redefine the value of the modulo of the associated file.
Attribute 17            Description of the file.
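
A program reading this layout might pick out the fields of interest as in the following Python sketch. The function name and sample values are hypothetical. The sketch assumes the item arrives as a list of attribute strings numbered from 1, and that attribute 13, when present, holds the new modulo in parentheses.

```python
def describe_file_pointer(item_id: str, attrs: list[str]) -> dict:
    """Extract the file geometry from a file-defining item."""

    def attr(n: int) -> str:
        # Attributes are numbered from 1; missing trailing ones read as empty.
        return attrs[n - 1] if n <= len(attrs) else ""

    new_modulo = attr(13).strip("()")   # assumed form: "(11)"
    return {
        "file": item_id,
        "d_code": attr(1),
        "base_fid": int(attr(2)),
        "modulo": int(attr(3)),
        "new_modulo": int(new_modulo) if new_modulo else None,
    }


# Hypothetical item: base fid 5000, modulo 7, to be restored with modulo 11.
attrs = ["d", "5000", "7"] + [""] * 9 + ["(11)"]
info = describe_file_pointer("customers", attrs)
```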

To change the information in the "mds" file, use the Update processor. Items can be added, deleted, or modified by executing one of the following command sequences from the dm account:

u mds account.name
create-account account.name
account-maint account.name

UP displays the contents of attributes 1 through 10 (with default values in place for new accounts). To set attribute 13, move the cursor to the end of attribute 10, press <return> three times, and enter the new modulo surrounded by parentheses. This value is used to resize the master dictionary.

Attribute-Defining Items

Attribute-defining items (ADIs) define views of attributes and processing to be applied to them when used in Update, Access, and BASIC. For a complete explanation, see the entry "attribute-defining items".

File-Defining Items

File-defining items (FDIs) define files, default views of them, indexes to be maintained for the files, and audit trail information, as well as some of the processing that automatically takes place when items in files are updated. For a complete explanation, see the entry "file-defining items".

Synonym-Defining Items

Synonym-defining items, also known as q-pointers, are used in account master dictionaries to point to other files. The files may be dictionaries or data files within the current account or in other accounts. In general, only the first four attributes are used (attributes 0, 1, 2, and 3). For a complete explanation, see the entry "synonym-defining items".

System Security

The Pick System maintains several levels and forms of security. There are user and master dictionary passwords, system privilege levels, file retrieval and update lock codes, and restricted access to the system level. For a complete explanation, see the entry "security".