basic.perform Statement/BASIC Program, basic.performance Definition/BASIC Program, perf Definition/General, unix.performance Definition/Unix, perf Definition/General

basic.perform

Command basic.perform Statement/BASIC Program
Applicable release versions: AP 6.2
Category BASIC Program (486)
Description
Syntax
Options
Example
Purpose
Related basic.execute

basic.performance

Command basic.performance Definition/BASIC Program
Applicable release versions: AP 6.0
Category BASIC Program (486)
Description discusses three methods of increasing performance: replacing dynamic arrays with dimensioned arrays, replacing common with named common, and saving screen images in a variable.

Replacing dynamic arrays with dimensioned arrays is especially effective when an item is read from a file, and manipulated extensively afterwards. If the number of attributes in the item is known, the "matread" statement may be used to read the dynamic string on disk into a dimensioned array. If the size is not known, or subject to change, read the item into a dynamic array, then use "dcount" to get the number of attributes, redimension a dimensioned array to the desired size, and assign the dynamic array to the dimensioned one. For example:

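* am is the attribute mark that separates attributes in a dynamic array
equ am to char(254)
read xx from "big.item"
* count the attributes, then size the dimensioned array to match
size = dcount(xx,am)
dim stat(size)
* assign the dynamic array to the dimensioned one
stat = xx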
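
The reason this helps can be illustrated outside of Pick. The following Python sketch is only a simplified model, not Pick/BASIC and not a description of the actual runtime internals; the names AM, extract_attribute and to_dimensioned are hypothetical. It treats a dynamic array as one delimited string, where reaching attribute n means scanning past the preceding delimiters on every access, and contrasts that with splitting the string once so that later accesses are direct index lookups.

```python
AM = chr(254)  # attribute mark, the delimiter between attributes


def extract_attribute(dynamic: str, n: int) -> str:
    """Return attribute n (1-based) by scanning the delimited string."""
    start = 0
    for _ in range(n - 1):
        pos = dynamic.find(AM, start)
        if pos < 0:
            return ""  # the string holds fewer than n attributes
        start = pos + 1
    end = dynamic.find(AM, start)
    return dynamic[start:] if end < 0 else dynamic[start:end]


def to_dimensioned(dynamic: str) -> list:
    """Split once; every later access is a plain index lookup."""
    return dynamic.split(AM)


item = AM.join(["smith", "42", "irvine"])
stat = to_dimensioned(item)
```

In this model, extract_attribute(item, 3) walks the string from the start each time it is called, while stat[2] does not. The benefit grows with the number of attributes and the number of accesses.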

Another significant speed increase can be realized by using named commons to store data that must be shared between applications. See "common" for more information on this topic. If named commons are used for this purpose, all applications involved must be compiled the same way: either all with the interpreter or all with FlashBASIC.

If an application does large amounts of screen output, it is suggested that screens be built as large strings at the beginning of the program. This way, a screen refresh is simply a "print" of a string variable.

x=@(-1):@(1,1):"Name:":@(1,2):"Number:":@(1,3):"Address:"
print x
Syntax
Options
Example
Purpose
Related tcl.compile
tcl.term
tcl.run
basic.common
basic.debugger.down
basic.debugger.up
basic.debugger.
basic.debugger.#
flash.basic.error.logging
tcl.config.core

perf

Command perf Definition/General
Applicable release versions: AP/Unix
Category General (155)
Description describes various tips, utilities and performance monitoring tools which allow identifying possible bottlenecks in a given configuration.

Introduction

When performance problems are experienced on a system, it is necessary to distinguish between problems due to the Unix environment and problems due to a configuration not adapted to the application.

The reader is assumed to have a fairly good understanding of a Pick environment and some knowledge of Unix.

Overview

Unix-related performance problems are usually isolated incidents: at a given time, system performance degrades noticeably, but overall performance remains satisfactory. These problems are usually fairly easy to track and to fix.

Configuration problems are more insidious, in that they recur under certain circumstances. The basic principle is to monitor the activity of the system over a long period of time during normal system activity. A series of statistics is taken and stored in a log file for later analysis.

The command to monitor the activity is buffers. The command to display the log file is buffers.g.

Unix Related Bottlenecks

The first things to examine are the results provided by sar, in order to rule out problems caused by unexpected Unix activity running alongside the Pick activity. Device-related problems may also have very visible effects on overall performance.

SAR Results

See the section 'System Activity Reporting' in the chapter 'System Administration' in the Installation or User's Guide for more details about sar.

CPU usage:

A well-balanced system should have a high percentage (above 80-90%) of user CPU usage. High system-mode usage indicates too many process switches or too many system calls. A nonzero 'waiting for I/O' CPU usage indicates a disk bottleneck. If system CPU usage becomes very high without high I/O activity, this may indicate a device problem (see the next section).

Paging activity:

The golden rule is to avoid swapping (paging) during normal operations. To avoid swapping, either the physical memory must be increased or the amount of memory allocated to Pick decreased. Surprisingly, if the system swaps, Pick performance may improve when the amount of memory allocated to Pick in the configuration file is reduced. Obviously, there are lower limits which should not be crossed. Monitoring the Pick activity should make it possible to determine how far the allocation can be reduced.

If possible, avoid using costly Unix commands during peak hours (compiling is expensive, X-window requires a lot of memory, etc.).

If significant swapping is taking place, check that the memory allocated to Pick (see the verb what) is not bigger than the total amount of physical memory minus the minimum amount of memory required for the Unix kernel (from 2 megabytes for SCO Unix to 6 megabytes for AIX, depending on the implementation).
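
The check described above is simple arithmetic. As a minimal sketch in Python (used here only for illustration; the function names and the 32 megabyte machine are hypothetical, and the Pick allocation must still be read manually from the output of what):

```python
def pick_memory_headroom(physical_mb: float, pick_mb: float, kernel_min_mb: float) -> float:
    """Memory (MB) left after Pick and the kernel minimum; negative means over-allocated."""
    return physical_mb - kernel_min_mb - pick_mb


def pick_allocation_ok(physical_mb: float, pick_mb: float, kernel_min_mb: float) -> bool:
    """True when Pick's allocation fits in physical memory minus the kernel minimum."""
    return pick_memory_headroom(physical_mb, pick_mb, kernel_min_mb) >= 0


# Hypothetical 32 MB AIX machine, using the 6 MB kernel minimum quoted above.
fits = pick_allocation_ok(32, 24, 6)       # 24 MB for Pick leaves 2 MB spare
too_large = pick_allocation_ok(32, 28, 6)  # 28 MB for Pick exceeds the limit by 2 MB
```

A result that only just fits leaves no room for other Unix processes, so in practice some margin beyond the kernel minimum is likely to be needed.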

To identify which processes are running, do the following (as 'root'):

ps -edalf | grep R

S UID PID PPID STIME TTY TIME CMD
R root 4719 1 ... 07:08:53 24/0 0:05 ap - 24 tty24
R root 8999 10534 ... 07:58:33 89/0 0:00 ps -edalf
S root 10534 4133 ... 08:58:33 89/0 0:00 grep R
R demo 26242 25467 ... 07:10:03 75/0 0:16 demo

The above example shows an extract of the result. Process 4719 runs Pick on PIB 24. Process 26242 is a non-Pick process which has used about three times as much CPU as the Pick process. If the command is run several times and some processes show up repeatedly, it becomes possible to identify processes that perhaps should not be running during peak hours.
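
Comparing several runs by eye is tedious, so the comparison can be scripted. The Python sketch below is an illustration under stated assumptions: it parses the abbreviated column layout shown in the extract above (state, UID and PID as the first three fields), which is not the full ps output, and the function names are hypothetical. It filters on the state column rather than on any 'R' in the line, which is why the sleeping "grep R" line in the extract is not counted.

```python
from collections import Counter


def running_processes(ps_lines):
    """Return {pid: uid} for lines whose state column (first field) is 'R'."""
    found = {}
    for line in ps_lines:
        fields = line.split()
        # Assumed layout: state, UID, PID, ... as in the extract above.
        if len(fields) < 3 or fields[0] != "R" or not fields[2].isdigit():
            continue
        found[int(fields[2])] = fields[1]
    return found


def persistent_runners(samples, min_hits=2):
    """PIDs seen in the running state in at least min_hits separate samples."""
    hits = Counter()
    for ps_lines in samples:
        hits.update(running_processes(ps_lines).keys())
    return sorted(pid for pid, count in hits.items() if count >= min_hits)


first_run = [
    "S UID PID PPID STIME TTY TIME CMD",
    "R root 4719 1 ... 07:08:53 24/0 0:05 ap - 24 tty24",
    "R root 8999 10534 ... 07:58:33 89/0 0:00 ps -edalf",
    "S root 10534 4133 ... 08:58:33 89/0 0:00 grep R",
    "R demo 26242 25467 ... 07:10:03 75/0 0:16 demo",
]
# Hypothetical second run: the Pick process is now sleeping, demo is still running.
second_run = [
    "S root 4719 1 ... 07:08:53 24/0 0:05 ap - 24 tty24",
    "R demo 26242 25467 ... 07:10:03 75/0 0:31 demo",
]
repeat_offenders = persistent_runners([first_run, second_run])
```

With these two samples only process 26242 is reported, since it is the only one found running both times.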

Device Problems

The most common problems with TTYs are due to incorrect cabling. When Unix tries to spawn a process (Pick or Unix) attached to a terminal, the device must be ready. If not, Unix 'waits' a bit and tries again. Worse, a port with DCD in an unstable state can generate many interrupts which, in turn, generate 'hang up' signals, creating a very heavy system load. To identify such a problem, do the following (as 'root'):

ps -edalf | grep '?'

S root 4184 9047 ... 09:06:26 89/0 0:00 grep ?
S root 25185 1 ... 07:08:52 ? 0:00 ap - 9 tty9
R root 30571 1 ... 07:08:52 ? 23:45 ap - 19 tty19 printer

This command shows the processes attached to terminals the system could not open. In the above example, the second line shows a Pick process (pid=25185) in a sleeping state (S): this process does not consume any CPU. The system could not open the terminal /dev/tty9 and abandoned trying to open it. The third line shows a Pick process (pid=30571) in a running state (R): this process does use CPU, as the CPU usage '23:45' shows. The system tried to open the device /dev/tty19 and failed, as in the first case, but the cable is probably incorrect or hanging loose at the other end, and is generating constant signals.
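
The reading of the listing given above can be expressed as a small filter: keep the lines whose terminal is '?', whose state is R, and whose accumulated CPU time is not zero. The Python sketch below is an illustration only. It assumes the abbreviated layout of the extract, with the elided columns collapsed into a single "..." token so that TTY is the seventh field and TIME the eighth; the column positions and function names are hypothetical and would need adjusting for real ps output.

```python
def cpu_seconds(time_field: str) -> int:
    """Convert a ps TIME value such as '23:45' or '1:02:03' to seconds."""
    total = 0
    for part in time_field.split(":"):
        total = total * 60 + int(part)
    return total


def suspect_ports(ps_lines, tty_col=6, time_col=7):
    """Return (pid, command) for running processes on an unopened terminal ('?')."""
    suspects = []
    for line in ps_lines:
        fields = line.split()
        if len(fields) <= time_col or fields[tty_col] != "?":
            continue  # terminal was opened, or the line is too short to parse
        if fields[0] == "R" and cpu_seconds(fields[time_col]) > 0:
            suspects.append((int(fields[2]), " ".join(fields[time_col + 1:])))
    return suspects


listing = [
    "S root 4184 9047 ... 09:06:26 89/0 0:00 grep ?",
    "S root 25185 1 ... 07:08:52 ? 0:00 ap - 9 tty9",
    "R root 30571 1 ... 07:08:52 ? 23:45 ap - 19 tty19 printer",
]
suspects = suspect_ports(listing)
```

For the extract above this reports only process 30571 on tty19, the one that is consuming CPU.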

To fix this situation, the terminal must be connected properly, or the associated entry in /etc/inittab must be set to off instead of respawn. Unfortunately, it is sometimes very difficult to identify which device is in trouble when the above command does not show it explicitly. Only careful checking of the cables, or finding which ports did not start as expected, will identify the faulty port by elimination.

Identifying Configuration Problems

Statistics

The following elements are monitored by the buffers command:


Name Description

Activ Number of process activations. Each disk read, keystroke, or process wake-up after a sleep increments this counter. Subtracting the number of frame faults from this counter gives an idea of the volume of data entry.

Idle Idle time. Not supported on Unix Implementations

Fflt Frame faults. This counts the number of disk reads.

Writes Disk Writes. All writes are normally done by the background flush process to update disk from dirty frames in memory. A high number indicates either a lot of updates or insufficient memory allocated to the Pick virtual machine.

Bfail Buffer Search Failures. This counter counts the number of failures to allocate a buffer in memory for a new frame. A non-zero value indicates that the memory is insufficient. This counter should always be zero.

RqFull Disk Read Queue Full. Not supported on Unix Implementations

WqFull Disk Write Queue Full. This counter counts the number of instances where the flusher cannot keep up with the dirtying of frames. This indicates either that the write queue is too small for the given configuration (see the section 'Flusher Adjustment' later in this appendix) or that the memory is too small.

DskErr Disk Errors.

Elapsd Elapsed time. This is the time in seconds between two samples. For internal use only.

DblSrc Double Search. This counts the number of collisions between two or more processes frame faulting on the same frame at the same instant. A non zero counter should be exceptional.

Breuse Buffer Re-Use. This counts the number of instances where a memory buffer has been allocated by one process to read one FID and another process allocated the same buffer to contain another FID. A non zero counter should be exceptional.

Bcolls Batch Contentions/Collisions. This counts the number of collisions between a 'batch' process (i.e., a process which is disk intensive) and an 'interactive' process (i.e., a process which is keyboard input intensive). By default, Pick ensures that interactive processes are given priority over batch processes in accessing certain resources. See the section 'Interactive - Batch Processes' in this appendix for more details.

Sem Semaphores Collisions. This counts the number of collisions between two processes trying to access a systemwide internal table.

Vlocks Virtual Locks Failures. This counts the number of cases when a Pick process tried to assert a virtual lock and failed to acquire it because another process had it.

Blocks FlashBASIC or Pick/BASIC Locks Failures. This counts the number of cases when a Pick process tried to assert a FlashBASIC or Pick/BASIC lock and failed to acquire it because another process had it.

B0reg Buffers with no Virtual Registers attached. These are the buffers not currently attached for immediate reference. At any given time, very few buffers are actually attached. It is therefore normal that this number be almost equal to the total buffers in memory.

B1reg Buffers used by more than one process, but not used by their owner any more. There should be very few of these.

B2reg Buffers used exclusively by their owner. On RISC implementations, this situation allows better performance, because there is no conflict on these buffers. Normally, these buffers contain private workspace, data which is not shared, etc...

B>3reg Buffers used both by their owner and other processes. This number represents the number of pages actually shared among processes (data files) at any given time.

ww Write Required. This counts the number of buffers currently modified and not yet written to disk.

IObusy Buffers being read from disk. This counts the number of pending disk reads. This counter is usually zero, since reads are too fast to be picked up.

Mlock Number of memory-locked buffers. If the ABS section is locked, this number is at least equal to the ABS size. The tape buffers are also included when the tape is attached.

Ref Referenced Buffers. This counts the number of buffers which have been recently used.

WQ Write Queued. Number of buffers currently enqueued for write.

Tophsh Top of Hash. This number measures the quality of the hashing algorithm used to find a frame in memory. This number must be high (above 60% of the total buffers).

avail Available buffers. Number of buffers candidate for replacement. These are the buffers that nobody has been using recently. When this number drops below 10% of the total buffers, performance decreases significantly.

batch Batch Buffers. This is the number of buffers used by batch processes. A high level (something approaching 50% of disk buffers) indicates that disk-intensive activity is taking place in batch processes.


Activity Log File

The activity log is stored in the file buffers.log with one data level per weekday (buffers.log,Monday, buffers.log,Tuesday, etc.). The file is created automatically when the buffers (H) command is used for the first time. Each data level is cleared when the day changes, so that the file records a whole week of activity automatically. The item-id is the internal time (seconds since midnight), expressed as five digits.

The buffers command also automatically creates the dictionary attributes corresponding to the various counters, as shown in the table above. The attribute TIME displays the sampling time.

The attribute DESCRIPTION in the D pointers Monday, Tuesday etc... contains the date.

The file is created with a DX attribute.
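
Because the item-id is an internal time, it can be converted to and from a clock time when looking up a particular sample. The Python sketch below is an illustration only; it assumes the usual Pick convention that internal time is the number of seconds since midnight and that the item-id is zero-padded to five digits, and the function names are hypothetical.

```python
def item_id_to_clock(item_id: str) -> str:
    """Convert a five-digit internal-time item-id to HH:MM:SS."""
    hours, rest = divmod(int(item_id), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def clock_to_item_id(clock: str) -> str:
    """Convert HH:MM:SS to the five-digit internal-time item-id."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return f"{hours * 3600 + minutes * 60 + seconds:05d}"


sample_id = clock_to_item_id("11:14:00")  # 11 * 3600 + 14 * 60 = 40440
```

Under these assumptions, a sample taken five minutes after midnight would be stored under item-id 00300.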

Monitoring Activity

Log on to the dm account. Type:

buffers {(options}

options


C Clear today's log data level, when used with the (H) option. This option must be used the very first time. To restart the monitoring after having stopped it for a while, do not use the (C) option.

H{n} Record statistics in the log file. If followed by a number n, the process sleeps n seconds between each sample. The default value is 5 seconds. When sampling over long periods, 5 minutes (300 seconds) is a good compromise between accuracy and volume of data.

L{n} Loop sampling and displaying statistics. If followed by a number n, the process sleeps n seconds between each sample. The default value is 5 seconds.

S Display system counters. Without this option, a simplified set of counters is displayed. All counters are always recorded, even without this option.
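
The trade-off between accuracy and volume of data mentioned for the (H) option can be estimated with simple arithmetic. The Python sketch below is an illustration only; it assumes that one log item is written per sample, that sampling runs for the whole day, and that the time taken by the sampling itself is negligible. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def samples_per_day(period_seconds: int) -> int:
    """Approximate number of log items written per day for a given sleep period."""
    if period_seconds <= 0:
        raise ValueError("period must be a positive number of seconds")
    return SECONDS_PER_DAY // period_seconds


default_volume = samples_per_day(5)      # default 5-second period
suggested_volume = samples_per_day(300)  # suggested 5-minute period
```

Under these assumptions the default period produces 17280 items per day, while the 300-second period produces 288.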


Examples:

buffers

Take one sample of the non-system statistics.

buffers (sh300c

Loop displaying all counters, recording history and sampling every 300 seconds (5 minutes). The log file data level corresponding to today is cleared, thus starting a new session.

When looping, buffers polls the keyboard to detect the key "x" to stop or "r" to redraw the screen if it has been disturbed by a message, for instance. Any other key forces buffers to take another sample.

Displaying Log File

Raw display

The history file can be displayed by any access sentence. For example:

sort buffers.log,friday with time >= "11:14:00"

Histograms

The buffers.g command lists the log file as a series of histograms. The syntax is:

buffers.g cntr [day{-{day}}|*] {step {strt.time-{end.time}}} {(option}

cntr Statistic counter name (e.g., fflt for the third counter). Must be one of the names shown in the table above. If the counter specified relates to the buffers, percentages of the total buffers are displayed, rather than raw figures.

day Day(s) to list. A day can be given either explicitly (monday, tuesday, etc.) or as a number from 1 (Sunday) to 7 (Saturday). A range of days is given as two days separated by a dash (-). If the second day is omitted, Saturday is assumed. The whole week can be listed by using an asterisk (*).

step Specifies the display time step as HH:MM{:SS}. All samples taken within the step are accumulated and averaged. If step is not specified or if the step is 0, or if the step is smaller than the sampling period in the log file, all samples are displayed.

strt.time Starting time. If no starting time is specified, 00:00:00 is assumed.

end.time Ending time. If no ending time is specified, 23:59:59 is assumed.
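
The rules given above for the day argument can be sketched as follows. This Python sketch is an illustration of the rules as described, not the actual parser used by buffers.g; the names DAYS, day_number and expand_days are hypothetical, and the behavior for a range whose first day falls after its second day is not specified in this document (the sketch simply returns an empty list).

```python
DAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"]


def day_number(day: str) -> int:
    """Map a day name or a digit from 1 (Sunday) to 7 (Saturday) to its number."""
    day = day.strip().lower()
    if day.isdigit():
        number = int(day)
        if not 1 <= number <= 7:
            raise ValueError(f"day number out of range: {day}")
        return number
    return DAYS.index(day) + 1  # raises ValueError for an unknown name


def expand_days(spec: str) -> list:
    """Expand 'monday', 'monday-friday', '4-' or '*' into a list of day names."""
    if spec == "*":
        return list(DAYS)
    first, dash, last = spec.partition("-")
    start = day_number(first)
    if not dash:
        end = start             # a single day
    elif last:
        end = day_number(last)  # an explicit range
    else:
        end = 7                 # second day omitted: Saturday is assumed
    return DAYS[start - 1:end]


business_days = expand_days("monday-friday")
```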


Options


P Direct output to printer.

Examples:

buffers.g fflt * 01:00:00

List the number of frame faults (disk reads) for the whole week, in steps of one hour. In the example below, no history was recorded before Wednesday.

No log for Sunday

No log for Monday

No log for Tuesday

20Feb1991; Wednesday; Ctr=fflt, Step=01:00:00, Range=00:00:00-23:59:59

0 8848 17696 26544 35392 44240 53088 61936
+------+------+------+------+------+------+------+------+----
10:59:28 *************************
11:59:54 ***********************************************************
13:00:25 **********************************************************
14:00:52 ************************************
15:01:18 ***************************
16:01:49 ********************************************************
17:02:22 ***************************************
18:02:55 ******
19:03:32 ***********************************************
20:04:08 *************************************************
21:04:43
22:05:21 ***************************************************
23:05:55 *************

Number of samples : 155
Total : 622070
Average per period : 7.1999 / sec.
Max value : 88481
Peak time : 13:00:25

buffers.g ww monday-friday 00:30 08:00-17:30 (p

List the percentage of write-required buffers, for weekdays only, during business hours, in steps of 30 minutes.
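
The step argument causes all samples taken within one step to be accumulated and averaged before being drawn. The Python sketch below illustrates that idea only. It assumes samples are (seconds since midnight, value) pairs and that steps are aligned to midnight, which may differ from how buffers.g actually aligns and labels its steps; the function names and the bar width are hypothetical.

```python
from collections import defaultdict


def average_by_step(samples, step_seconds):
    """Average (time, value) samples per step; a step of 0 keeps every sample."""
    if step_seconds <= 0:
        return sorted(samples)
    buckets = defaultdict(list)
    for when, value in samples:
        buckets[when // step_seconds].append(value)
    return [
        (index * step_seconds, sum(values) / len(values))
        for index, values in sorted(buckets.items())
    ]


def bar(value, max_value, width=60):
    """Render one histogram bar scaled so that max_value fills the width."""
    if max_value <= 0:
        return ""
    return "*" * round(width * value / max_value)


# Hypothetical samples: two in the first hour, one in the second.
hourly = average_by_step([(0, 10), (1800, 30), (3600, 50)], 3600)
```

With these values the first hour averages to 20 and the second to 50, and each average would then be drawn with bar().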


Interpreting Results

After taking a significant sample, list the results with the buffers.g command. The most useful parameters to survey are:


Fflt This measures the number of frame faults. If this number approaches the disk bandwidth as determined by the manufacturer, the system becomes disk bound. Solutions range from increasing the memory allocated to Pick, to changing disks, or reorganizing the Pick data base on separate disks to increase parallelism.

Writes This number should stay at about one third to one half of the number of frame faults. Under normal operation, a system should not write more than it reads. If it does, see the section 'Flusher Adjustment' in this article.

Bfail This number should always be zero. If it is not, the memory allocated to Pick is definitely too small.

WqFull This number should not be non-zero 'too often'. If it is, and if the number of writes is also too high, there is an abnormal rate of writes. See the section 'Flusher Adjustment' in this article.

Bcolls If this number becomes too high, this indicates that a lot of batch jobs (like selects of big files) are done while other processes are doing data entry. It is also an indicator that indeed interactive jobs are receiving higher priority than batch processes. See the section 'Interactive - Batch Processes' below.

ww This number should never go above 50% of the whole buffer pool. If it does, the flusher is probably not activated often enough. See the section 'Flusher Adjustment' below.

avail This number should never go below 10% of the whole buffer pool. If it does, memory must be increased or the flusher must be adjusted.
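
The numeric rules of thumb in this list can be collected into one check. The Python sketch below is an illustration only: the function name and the lowercase dictionary keys are hypothetical, the counter values must be taken from the log by hand, and only the rules with an explicit threshold are covered (WqFull 'too often' and Bcolls 'too high' have no stated figure). For Writes, the sketch flags anything above one half of the frame faults, which is one reading of the guideline above.

```python
def review_counters(counters, total_buffers):
    """Return a list of findings for one sample, using the thresholds quoted above."""
    findings = []
    if counters["bfail"] != 0:
        findings.append("Bfail is non-zero: memory allocated to Pick is too small")
    if counters["fflt"] > 0 and counters["writes"] > counters["fflt"] / 2:
        findings.append("Writes exceed half of Fflt: see Flusher Adjustment")
    if counters["ww"] > 0.5 * total_buffers:
        findings.append("ww above 50% of buffers: flusher not activated often enough")
    if counters["avail"] < 0.1 * total_buffers:
        findings.append("avail below 10% of buffers: add memory or adjust the flusher")
    return findings


# Hypothetical samples for a pool of 1000 buffers.
healthy = {"bfail": 0, "fflt": 900, "writes": 300, "ww": 100, "avail": 400}
stressed = {"bfail": 3, "fflt": 100, "writes": 200, "ww": 600, "avail": 50}
healthy_findings = review_counters(healthy, 1000)
stressed_findings = review_counters(stressed, 1000)
```

The healthy sample produces no findings, while the stressed sample triggers all four checks.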


Flusher Adjustment

The flusher is a background process, started automatically at boot time, which scans the Pick memory and writes back to disk frames which have been modified. It is an important task, not only to ensure that data gets back on disk, but also to make room for new data. Usually, a process reads data, modifies it, but may not need it for a 'long' time. The flusher takes care of writing the data back on disk so that the memory can be reused to read in other data.

This 'cleaning' of the memory is done:


- Periodically, when the disk is not active. If the disk becomes inactive 'for some time', the flusher wakes up and scans the memory, writing back all it can unless another process requires a disk access. This period is defined by the flush statement in the configuration file.

- On demand. When the memory gets 'full', i.e., when a lot of pages in memory have to be written back to disk, the flusher wakes up immediately.


The more often the flusher gets awakened, the more often memory is written back to disk. But this creates disk activity, thus decreasing the disk channel bandwidth available for 'useful' work, and CPU activity, therefore adding system load. Another catch to a high frequency flush is that data which is being modified (workspace, select lists, etc...) may be written several times on disk when only the last time would have been necessary.

The verb set-flush allows changing the flush period (see the section 'TCL commands' in this document). Increase this period while checking with buffers that the number of 'write queue full' events remains low and that the number of available buffers does not drop too low. Normally, the system is self-regulating, increasing the flush frequency when memory usage is high, so there is no need for a low flush period. 30 seconds should be an upper limit.

The configuration file also contains the statement dwqnum, which defines the length of the internal write queue. Increasing this queue reduces the probability that the flusher is awakened on critical demand, thus reducing the number of flushes. The downside of increasing the write queue size is that the flusher works in 'bursts', which may overload the disk channel when they occur. This parameter cannot be changed dynamically, which makes its effect a bit more difficult to monitor.

Interactive - Batch Processes

Pick user processes are divided into two classes, depending on the type of activity they have: an interactive process is one which typically does keyboard input 'frequently'; a batch process is one which has little keyboard activity, requires a lot of disk I/O, and/or is CPU intensive. The system automatically discerns which type of process is running based on internal statistics.

The System Administrator can bias and/or override the default parameters used by the prioritization mechanism. Though not recommended, one can even force any process, regardless of its activity, to be seen by the system as "interactive", for example. This can be changed dynamically on a per-process basis via the set-batch command. Also, the TCL command set-batchdly allows displaying and setting the global values used in the queueing of certain types of process activity.
Syntax
Options
Example
Purpose
Related tcl.what
tcl.set-batchdly
tcl.set-batch
tcl.syschk
tcl.buffers
unix.performance

unix.performance

Command unix.performance Definition/Unix
Applicable release versions:
Category Unix (24)
Description
Syntax
Options
Example
Purpose
Related perf

perf

Command perf Definition/General
Applicable release versions: AP/Unix
Category General (155)
Description describes various tips, utilities and performance monitoring tools which allow identifying possible bottlenecks in a given configuration.

Introduction

When performance problems are experienced on a system, it is necessary to distinguish problems due to the Unix environment and problems due to a configuration not adapted to the application.

The reader is assumed to have a fairly good understanding of a Pick environment and some knowledge of Unix.

Overview

Unix related performance problems are usually punctual: at one given time, the system performances degrade noticeably, but overall performance should remain satisfactory. These problems are usually fairly easy to track and to fix.

Configuration problems are more insidious, in that they appear repetitively under some circumstances. The basic principle is to monitor the activity of the system over a long period of time during normal system activity. A series of statistics are taken and stored in a log file for later analysis.

The command to monitor the activity is buffers. The command to display the log file is buffers.g.

Unix Related Bottlenecks

The first elements to look at are the results provided by sar to eliminate configuration problems due to an unexpected Unix activity alongside with the Pick activity. Device related problems may also have very visible effects on the overall performance.

SAR Results

See the section 'System Activity Reporting' in the chapter 'System Administration' in the Installation or User's Guide for more details about sar.

CPU usage:

A well balanced system should have a high percentage (above 80-90%) of user cpu usage. High system mode usage indicates too many process switches, or too many system calls. A non null waiting for IO cpu usage indicates disk bottleneck. If the system cpu usage becomes very high, without high IO activity, this may indicate a device problem (see next section).

Paging activity:

The absolute golden rule is to avoid swapping (paging) during normal operations. To avoid swapping, the physical memory must be increased, or the amount of memory allocated to Pick decreased. Surprisingly, if the system swaps, Pick performances may improve by reducing the amount of memory allocated to Pick in the configuration file. Obviously, there are some lower limits which should not be crossed. The Pick activity monitoring should allow determining how far it is possible to go on that path.

If possible, avoid using costly Unix commands during peak hours (compiling is painful, X-window requires a lot of memory, etc...).

If some significant swapping is taking place, control that the memory allocated to Pick (see the verb what) is not bigger than the total amount of physical memory minus the minimum size of memory required for the Unix Kernel (from 2 megabytes for SCO Unix to 6 megabytes for AIX, depending on the Implementation).

To identify which processes are running, do the following (as 'root'):

ps -edalf | grep R

S UID PID PPID STIME TTY TIME CMD
R root 4719 1 ... 07:08:53 24/0 0:05 ap - 24 tty24
R root 8999 10534 ... 07:58:33 89/0 0:00 ps -edalf
S root 10534 4133 ... 08:58:33 89/0 0:00 grep R
R demo 26242 25467 ... 07:10:03 75/0 0:16 demo

The above example shows an extract of the result. This shows that the process 4719 runs Pick on the PIB 24. The process 26242 is a non Pick process which has used three times as much CPU as the Pick process did. By running this command several times, if some processes show several times, it will be possible to identify processes that may be should not be running during peak hours.

Device Problems

The most common problems with TTYs are due to incorrect cabling. When Unix tries to spawn a process (Pick or Unix) attached to a terminal, the device must be ready. If not, Unix 'waits' a bit and tries again. Worse, a port with a DCD in an unstable state can generate many interrupts, which, in turn, generate 'hang up' signals, creating a very important system load. To identify such problem, do the following (as 'root'):

ps -edalf | grep '?'

S root 4184 9047 ... 09:06:26 89/0 0:00 grep ?
S root 25185 1 ... 07:08:52 ? 0:00 ap - 9 tty9
R root 30571 1 ... 07:08:52 ? 23:45 ap - 19 tty19 printer

This command shows the process attached to terminals the system could not open. In the above example,the second line shows a Pick process (pid=25185), in a sleeping state (S): this process does not consume any CPU. The system could not open the terminal /dev/tty9, but the system abandoned tyring to open it. The third line shows a Pick process (pid=30571), in a running state (R): this terminal does use CPU, as the CPU usage '23:45' shows. The system tried to open the device /dev/tty19, failed, as in the first case, but, probably, the cable is incorrect or hanging loose at the other end, and is generating constant signals.

To fix this situation, the terminal must be connected properly or the associated entry in /etc/inittab turned to off instead of respawn. Unfortunately, it is sometimes very difficult to identify which device is in trouble when the above command does not show it explicitly. Only careful checking of the cables or trying to find which ports which did not start as expected, will allow, by elimination, to find the faulty port.

Identifying Configuration Problems

Statistics

The following elements are monitored by the buffers command:


Name Description

Activ Number of Process activations. Each disk read, keystroke, process wake up after a sleep increments this counter. When the number of frame faults is subtracted from this counter, this gives an idea of the volume of data entry.

Idle Idle time. Not supported on Unix Implementations

Fflt Frame faults. This counts the number of disk reads.

Writes Disk Writes. All writes are normally done by the background flush process to update disk from dirty frames in memory. A high number indicates either a lot of updates, but also may be an insufficient memory allocated for the Pick virtual machine.

Bfail Buffer Search Failures. This counters counts the number of failures to allocate a buffer in memory for a new frame. When non zero, this indicates that the memory is insufficient. This counter should never be non zero.

RqFull Disk Read Queue Full. Not supported on Unix Implementations

WqFull Disk Write Queue Full. This counter counts the number of instances where the flusher cannot keep up with the dirtying of frames. This is an indication that either the write queue is too small for the given configuration (see the section 'Flusher Adjustments' later in this appendix) or that the memory is too small.

DskErr Disk Errors.

Elapsd Elapsed time. This is the time in seconds between two sampling. For internal use only.

DblSrc Double Search. This counts the number of collisions between two or more processes frame faulting on the same frame at the same instant. A non zero counter should be exceptional.

Breuse Buffer Re-Use. This counts the number of instances where a memory buffer has been allocated by one process to read one FID and another process allocated the same buffer to contain another FID. A non zero counter should be exceptional.

Bcolls Batch Contentions/Collisions. This counts the number of collisions between a 'batch' process (i.e., a process which is disk intensive) and an 'interactive' process (i.e., a process which is keyboard input intensive). By default, Pick insures that interactive processes are given priority over batch processes in accessing certain resources. See the section 'Batch Processes' in this appendix for more details.

Sem Semaphores Collisions. This counts the number of collisions between two processes trying to access a systemwide internal table.

Vlocks Virtual Locks Failures. This counts the number of cases when a Pick process tried to assert a virtual lock and failed to acquire it because another process had it.

Blocks FlashBASIC or Pick/BASIC Locks Failures. This counts the number of cases when a Pick process tried to assert a FlashBASIC or Pick/BASIC lock and failed to acquire it because another process had it.

B0reg Buffers with no Virtual Registers attached. These are the buffers not currently attached for immediate reference. At any given time, very few buffers are actually attached. It is therefore normal that this number be almost equal to the total buffers in memory.

B1reg Buffers used by more than one process, but not used by its owner any more. These should be in very small number.

B2reg Buffers used exclusively by their owner. On RISC implementations, this situation allows better performance, because there is no conflict on these buffers. Normally, these buffers contain private workspace, data which is not shared, etc...

B>3reg Buffers used both by their owner and other processes. This number represent the number of pages actually shared among processes (data files) at any given time.

ww Write Required. This counts the number of buffers currently modified and not yet written to disk.

IObusy Buffers being read from disk. This counts the number of pending disk reads. This counters is usually null, since the reads are too fast to be picked up.

Mlock Number of buffers memory locked. If the ABS section is locked, this number is at least equal to the ABS size. Also included, are the tape buffers when the tape is attached.

Ref Referenced Buffers. This counts the number of buffers which have been recently used.

WQ Write Queued. Number of buffers currently enqueued for write.

Tophsh Top of Hash. This number measures the quality of the hashing algorithm used to find a frame in memory. This number must be high (above 60% of the total buffers).

avail Available buffers. Number of buffers that are candidates for replacement. These are the buffers that no process has used recently. When this number drops below 10% of the total buffers, performance decreases significantly.

batch Batch Buffers. This is the number of buffers used by batch processes. A high level (something approaching 50% of disk buffers) indicates that batch processes are performing disk-intensive activity.


Activity Log File

The activity log is stored in the file buffers.log with one data level per weekday (buffers.log,Monday, buffers.log,Tuesday, etc.). The file is created automatically when the buffers (H) command is used for the first time. Each data level is cleared when the day changes, so that the file automatically records a whole week of activity. The item-id is the internal time, written as five digits.
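As an illustration of the item-id format, the following Python sketch converts between a clock time and a five-digit item-id. It assumes that the internal time is the number of seconds past midnight and that the item-id is zero-padded on the left; the function names are hypothetical and are not part of the system.

```python
def time_to_itemid(hhmmss: str) -> str:
    """Convert HH:MM:SS to seconds past midnight, zero-padded to five digits.

    Assumption: the log item-id is the Pick internal time (seconds past
    midnight), padded on the left with zeros.
    """
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return f"{hours * 3600 + minutes * 60 + seconds:05d}"


def itemid_to_time(itemid: str) -> str:
    """Convert a five-digit item-id back to HH:MM:SS."""
    total = int(itemid)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


print(time_to_itemid("11:14:00"))  # 40440
print(itemid_to_time("40440"))     # 11:14:00
```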

The buffers command also automatically creates the dictionary attributes corresponding to the various counters, as shown in the table above. The attribute TIME displays the sampling time.

The attribute DESCRIPTION in the D pointers Monday, Tuesday, etc. contains the date.

The file is created with a DX attribute.

Monitoring Activity

Log on to the dm account. Type:

buffers {(options}

options


C Clear today's log data level, when used with the (H) option. This option must be used the very first time. To restart the monitoring after having stopped it for a while, do not use the (C) option.

H{n} Record statistics in the log file. If followed by a number n, the process sleeps n seconds between each sample. The default value is 5 seconds. When sampling over long periods, 5 minutes (300 seconds) is a good compromise between accuracy and volume of data.

L{n} Loop sampling and displaying statistics. If followed by a number n, the process sleeps n seconds between each sample. The default value is 5 seconds.

S Display system counters. Without this option, a simplified set of counters is displayed. All counters are always recorded, even without this option.


Examples:

buffers

Take one sample of the non-system statistics.

buffers (sh300c

Loop displaying all counters, recording history and sampling every 300 seconds (5 minutes). The log file data level corresponding to today is cleared, thus starting a new session.

When looping, buffers polls the keyboard to detect the key "x" to stop or "r" to redraw the screen if it has been disturbed by a message, for instance. Any other key forces buffers to take another sample.

Displaying Log File

Raw display

The history file can be displayed by any access sentence. For example:

sort buffers.log,friday with time >= "11:14:00"

Histograms

The buffers.g command lists the log file as a series of histograms. The syntax is:

buffers.g cntr [day{-{day}}|*] {step {strt.time-{end.time}}} {(option}

cntr Statistic counter name (e.g. fflt for the 3rd counter). Must be one of the counters listed in the table above. If the specified counter relates to the buffers, percentages of the total buffers are displayed, rather than raw figures.

day Day{s} to list. The day can be a single day, given either by name (monday, tuesday, etc.) or as a number from 1 (Sunday) to 7 (Saturday). A range of days is specified as two days separated by a dash (-). If the second day is omitted, Saturday is assumed. The whole week can be listed by using an asterisk (*).

step Specifies the display time step as HH:MM{:SS}. All samples taken within the step are accumulated and averaged. If step is not specified or if the step is 0, or if the step is smaller than the sampling period in the log file, all samples are displayed.

strt.time Starting time. If no starting time is specified, 00:00:00 is assumed.

end.time Ending time. If no ending time is specified, 23:59:59 is assumed.


Options


P Direct output to printer.
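The rules given for the day argument can be modelled with a short Python sketch. It is only an illustration of the rules as described above, not the buffers.g implementation; the function name resolve_days is hypothetical, and a range whose second day precedes the first is simply returned empty, since the text does not say how that case is handled.

```python
DAYS = ["sunday", "monday", "tuesday", "wednesday",
        "thursday", "friday", "saturday"]


def resolve_days(spec: str) -> list:
    """Expand a day argument into the list of weekday names it covers.

    Accepts a name, a number from 1 (Sunday) to 7 (Saturday), a range
    'day-day', an open range 'day-' (ending on Saturday), or '*'.
    """
    spec = spec.strip().lower()
    if spec == "*":
        return list(DAYS)

    def index(token: str) -> int:
        if token.isdigit():
            number = int(token)
            if not 1 <= number <= 7:
                raise ValueError(f"day number out of range: {token}")
            return number - 1
        return DAYS.index(token)  # raises ValueError for an unknown name

    first, dash, last = spec.partition("-")
    start = index(first)
    if last:
        end = index(last)
    elif dash:
        end = 6          # second day omitted: Saturday is assumed
    else:
        end = start      # single day
    return DAYS[start:end + 1]


print(resolve_days("monday-friday"))
print(resolve_days("4-"))
```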

Examples:

buffers.g fflt * 01:00:00

List the number of frame faults (disk reads) for the whole week, in steps of one hour. In the example below, no history was recorded before Wednesday.

No log for Sunday

No log for Monday

No log for Tuesday

20Feb1991; Wednesday; Ctr=fflt, Step=01:00:00, Range=00:00:00-23:59:59

0 8848 17696 26544 35392 44240 53088 61936
+------+------+------+------+------+------+------+------+----
10:59:28 *************************
11:59:54 ***********************************************************
13:00:25 **********************************************************
14:00:52 ************************************
15:01:18 ***************************
16:01:49 ********************************************************
17:02:22 ***************************************
18:02:55 ******
19:03:32 ***********************************************
20:04:08 *************************************************
21:04:43
22:05:21 ***************************************************
23:05:55 *************

Number of samples : 155
Total : 622070
Average per period : 7.1999 / sec.
Max value : 88481
Peak time : 13:00:25
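The 'Average per period' figure in this listing is consistent with the total divided by the number of seconds in the displayed range (a full day here). This relationship is inferred from the figures above, not stated by the command's description, so the following Python check should be read as a plausibility test only.

```python
# Figures copied from the sample listing above.
total = 622070
# Assumption: the range 00:00:00-23:59:59 is treated as a full day.
seconds_in_range = 24 * 60 * 60

average_per_second = total / seconds_in_range
print(f"{average_per_second:.4f} / sec.")  # 7.1999 / sec.
```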

buffers.g ww monday-friday 00:30 08:00-17:30 (p

List the percentage of write-required buffers, for weekdays only, during business hours, in steps of 30 minutes. The output is directed to the printer.
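The effect of the step argument can be sketched in Python as follows. This is a hedged model, not the actual algorithm: it assumes that a new period starts with the first sample taken at least one step after the start of the previous period (which matches the time labels in the sample listing above), and it averages the values of each period. Event counters such as fflt may well be summed rather than averaged by the real command. The function name and the (time, value) sample layout are hypothetical.

```python
def average_by_step(samples, step_seconds):
    """Group samples into periods of step_seconds and average each period.

    samples: list of (seconds_past_midnight, value), sorted by time.
    Returns a list of (period_start_time, average_value).
    A step of 0 returns every sample unchanged, as does a step shorter
    than the sampling period (each sample then forms its own period).
    """
    if step_seconds <= 0:
        return list(samples)

    periods = []
    start = None
    values = []
    for when, value in samples:
        # Close the current period once a sample falls a full step later.
        if start is not None and when - start >= step_seconds:
            periods.append((start, sum(values) / len(values)))
            start, values = None, []
        if start is None:
            start = when
        values.append(value)
    if values:
        periods.append((start, sum(values) / len(values)))
    return periods


# Five samples taken every 300 seconds or so, displayed by steps of one hour.
log = [(0, 10), (300, 20), (600, 30), (3600, 40), (3900, 60)]
print(average_by_step(log, 3600))  # [(0, 20.0), (3600, 50.0)]
```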


Interpreting Results

After taking a significant sample, list the results with the buffers.g command. The most useful parameters to survey are:


Fflt This measures the number of frame faults. If this number approaches the disk bandwidth as specified by the manufacturer, the system becomes disk bound. Solutions range from increasing the memory allocated to Pick, to changing disks, or reorganizing the Pick database on separate disks to increase parallelism.

Writes This number should stay at about one third to one half of the number of frame faults. It is not 'normal' for a system to do more writes than reads under normal operation. If the number of writes is higher than this, see the section 'Flusher Adjustment' in this article.

Bfail This number should always be zero. If it is not, the memory allocated to Pick is definitely too small.

WqFull This number should not be non-zero 'too often'. If it is, and if the number of writes is also too high, there is an abnormal rate of writes. See the section 'Flusher Adjustment' in this article.

Bcolls If this number becomes too high, it indicates that a lot of batch jobs (like selects of big files) are running while other processes are doing data entry. It also indicates that interactive jobs are indeed receiving higher priority than batch processes. See the section 'Interactive - Batch Processes' below.

ww This number should never go above 50% of the whole buffer pool. If it does, the flusher is probably not activated often enough. See the section 'Flusher Adjustment' below.

avail This number should never go below 10% of the whole buffer pool. If it does, memory must be increased or the flusher must be adjusted.
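The numeric guidelines given in this article can be gathered into a small Python check applied to one sample. This is a sketch only: the dictionary keys are hypothetical lower-case spellings of the counter names, total_buffers is assumed to be known from the system configuration, and the thresholds are the ones quoted in this article (including the 60% figure given for Tophsh in the counter table).

```python
def check_counters(counters, total_buffers):
    """Return a list of warnings for one sample of buffer statistics.

    counters: dict with the keys 'fflt', 'writes', 'bfail', 'ww',
    'avail' and 'tophsh' (hypothetical names for this sketch).
    """
    warnings = []
    if counters["bfail"] != 0:
        warnings.append("bfail is non-zero: memory allocated to Pick is too small")
    if counters["ww"] > 0.5 * total_buffers:
        warnings.append("ww above 50% of the pool: flusher not activated often enough")
    if counters["avail"] < 0.1 * total_buffers:
        warnings.append("avail below 10% of the pool: add memory or adjust the flusher")
    if counters["tophsh"] <= 0.6 * total_buffers:
        warnings.append("tophsh not above 60% of the pool: poor hashing quality")
    if counters["writes"] > counters["fflt"] / 2:
        warnings.append("writes exceed half of the frame faults: see Flusher Adjustment")
    return warnings


sample = {"fflt": 1000, "writes": 800, "bfail": 0,
          "ww": 600, "avail": 300, "tophsh": 700}
for line in check_counters(sample, total_buffers=1000):
    print(line)
```

With the sample shown, two warnings are printed: one for ww and one for writes.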


Flusher Adjustment

The flusher is a background process, started automatically at boot time, which scans the Pick memory and writes back to disk frames which have been modified. It is an important task, not only to ensure that data gets back on disk, but also to make room for new data. Usually, a process reads data, modifies it, but may not need it for a 'long' time. The flusher takes care of writing the data back on disk so that the memory can be reused to read in other data.

This 'cleaning' of the memory is done:


- Periodically, when the disk is not active. If the disk becomes inactive 'for some time', the flusher wakes up and scans the memory, writing back all it can unless another process requires a disk access. This period is defined by the flush statement in the configuration file.

- On demand. When the memory gets 'full', i.e., when a lot of pages in memory have to be written back to disk, the flusher wakes up immediately.


The more often the flusher gets awakened, the more often memory is written back to disk. But this creates disk activity, thus decreasing the disk channel bandwidth available for 'useful' work, and CPU activity, therefore adding system load. Another catch to a high frequency flush is that data which is being modified (workspace, select lists, etc...) may be written several times on disk when only the last time would have been necessary.

The verb set-flush allows changing the flush period (see the section 'TCL commands' in this document). Increase this period while checking with buffers that the number of 'write queue full' events remains low and that the number of available buffers does not drop too low. Normally, the system is self-regulating, increasing the flush frequency in case of high memory usage, so there is no need for a low flush period. 30 seconds should be the upper limit.

The configuration file also contains the statement dwqnum, which defines the length of the internal write queue. Increasing this queue reduces the probability that the flusher is awakened on critical demand, thus reducing the number of flushes. The downside of increasing the write queue size is that the flusher works in 'bursts', which may overload the disk channel when they occur. This parameter cannot be changed dynamically, which makes its effect a bit more difficult to monitor.

Interactive - Batch Processes

Pick user processes are divided into two classes, depending on the type of activity they have: an interactive process is one which typically does keyboard input 'frequently'; a batch process is one which has little keyboard activity, requires a lot of disk I/O, and/or is CPU intensive. The system automatically discerns which type of process is running based on internal statistics.

The System Administrator can bias and/or override the default parameters used by the prioritization mechanism. Though it is not recommended, any process, regardless of its activity, can even be forced to be seen by the system as "interactive", for example. This can be changed dynamically on a per-process basis via the set-batch command. Also, the TCL command set-batchdly allows displaying and setting the global values used in the queueing of certain types of process activity.
Syntax
Options
Example
Purpose
Related tcl.what
tcl.set-batchdly
tcl.set-batch
tcl.syschk
tcl.buffers
unix.performance