Chapter 6


Application

This chapter describes the NQS application procedure.

6.1 ACTIVATING NQS

NQS is activated by executing the NQS daemon (/usr/lib/nqs/nqsdaemon), as shown in the following example. You can register the NQS start-up procedure in the /etc/rc.local file so that it is executed at system start-up, or the NQS manager can start the NQS daemon directly. If the standard output of the NQS daemon is not redirected to a file, its output is displayed on the console or terminal. The NQS daemon can be invoked only by the superuser.

Example:

When NQS starts normally, it retains all previous queue states. All queues can be stopped at NQS start-up.

Example:

6.2 TERMINATING NQS

NQS is automatically terminated by the UNIX shutdown process. However, it can also be terminated without shutting down the system by using the shutdown subcommand of the qmgr(1M) command, as shown next.

Example:

6.3 MANAGING QUEUE APPLICATIONS

Requests are not automatically registered or executed by creating an NQS queue. The status of a queue determines whether a request can be registered or executed. Queue status consists of two properties. The first property indicates whether the queue accepts requests. The second property determines if requests are executed.

The following describes the Property 1 states.

The following are the Property 2 states.

6.3.1 Starting/Terminating Queue Applications

A queue application starts when the NQS daemon starts (that is, when NQS begins to operate), provided that the queue status allows requests to be executed. Otherwise, the queue application starts when the status is changed after NQS has started. The application terminates when NQS stops or when the status of the queue is changed. When NQS stops, executing requests are forcibly terminated. When the queue status changes, however, no new requests are executed, but requests that are already executing run to completion.

This section explains how to start and terminate all queue applications; the next section explains how to change the status of individual queues. As explained earlier, all queue applications can be terminated by stopping NQS. To stop all queue applications without terminating NQS, use the stop all queue subcommand of qmgr(1M). To start all queue applications, use the start all queue subcommand. These subcommands affect only the second property, which determines whether requests are executed; whether requests are accepted depends on the status of each queue at that time.
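
The interaction between the two status properties can be sketched in Python. This is a toy model, not NQS code: the class Queue, its fields accepts_requests and executes_requests, and the helper functions are hypothetical names chosen for illustration, and real NQS scheduling involves limits that are ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Toy model of an NQS queue with its two status properties."""
    name: str
    accepts_requests: bool = True    # property 1: may requests be registered?
    executes_requests: bool = True   # property 2: may requests be executed?
    waiting: list = field(default_factory=list)
    running: list = field(default_factory=list)

    def submit(self, request_id: str) -> bool:
        # Registration depends only on property 1.
        if not self.accepts_requests:
            return False
        self.waiting.append(request_id)
        return True

    def schedule(self) -> None:
        # Starting new executions depends only on property 2.
        if self.executes_requests:
            self.running.extend(self.waiting)
            self.waiting.clear()


def stop_all_queues(queues):
    # Models "stop all queue": only property 2 changes, and requests
    # that are already running are left to finish.
    for q in queues:
        q.executes_requests = False


def start_all_queues(queues):
    # Models "start all queue": property 1 is left untouched.
    for q in queues:
        q.executes_requests = True


batch1 = Queue("batch1")
batch1.submit("71.host1")
batch1.schedule()                      # 71.host1 starts running
stop_all_queues([batch1])
accepted = batch1.submit("72.host1")   # still accepted: property 1 is unchanged
batch1.schedule()                      # no effect while the queue is stopped
```

In this sketch the stopped queue keeps accepting 72.host1 but does not run it until start_all_queues is called and the queue is scheduled again.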

6.3.2 Changing the Queue Status

This section explains how to alter the queue status by changing the first and second property.

6.3.3 Aborting a Queue

Aborting a queue forcibly terminates all requests executing in the queue. Use the abort queue subcommand of the qmgr(1M) command to abort a queue. All aborted requests are removed from the queue. In the following, all requests executing in the batch1 queue are aborted and deleted.

Example:

6.3.4 Purging a Queue

Purging a queue deletes all requests currently in the queuing state in the queue (as opposed to aborting a queue, which deletes the requests that are executing). Use the purge queue subcommand of the qmgr(1M) command to purge a queue. In the following, all requests currently in the queuing state in the batch1 queue are deleted.

Example:
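
As a conceptual aid only, the difference between aborting and purging can be sketched in Python. The function names and the two-list representation of a queue are assumptions made for illustration; this is not qmgr(1M) syntax.

```python
def abort_queue(running, queued):
    """Model of aborting: executing requests are terminated and removed."""
    return [], list(queued)


def purge_queue(running, queued):
    """Model of purging: only requests still waiting in the queue are deleted."""
    return list(running), []


running = ["70.host1"]
queued = ["71.host1", "72.host1"]

# Each call returns the (running, queued) lists that remain afterwards.
after_abort = abort_queue(running, queued)
after_purge = purge_queue(running, queued)
```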

6.4 MANAGING DEVICE APPLICATIONS

Device functions are not automatically activated by creating the device. The status of a device determines whether the device may or may not be used. The status of a device consists of two properties. The first property indicates whether the device accepts requests to process. The second property determines if the device is currently busy.

The following are the Property 1 states.

The following are the Property 2 states.

6.4.1 Changing the Device Status

This section explains how to change the device status. Only the first property may be changed.

Use the enable device subcommand of the qmgr(1M) command to change the status of the device such that it can accept requests. In the following, the dev1 device status is changed so that it can accept requests. This means that it is in the ENABLED state.

Example:

Use the disable device subcommand of the qmgr(1M) command to change the status of the device so that it cannot accept requests. In the following, the dev1 device is disabled so that it cannot accept requests. This means that it is in either the DISABLED or the CLOSED state, depending on the device status (idle or busy).

Example:

6.5 MANAGING REQUEST APPLICATIONS

Ordinary users cannot manage the requests of others. The NQS manager can forcibly manage all requests on NQS, and must manage requests on the system to run the NQS application efficiently.

6.5.1 Deleting a Request

Aborting or purging a queue has been discussed previously as a way to delete requests. That method deletes requests in units of queues. You can also use the qdel(1) command to delete requests individually. The NQS manager uses this command in the same way as ordinary users do.

6.5.2 Holding/Releasing a Request

Holding a request temporarily prevents it from being scheduled for execution. A held request is linked to the hold set of the queue (it is in the hold status). Use the hold request subcommand of the qmgr(1M) command to hold a request, and the release request subcommand to release it. A request that is not in the hold status cannot be released.

In the following, the request 72.host1 is placed in the hold state.

Example:

In the following, the request 72.host1 is released from the hold state.

Example:

6.5.3 Suspending/Resuming Request Execution

Suspending a request temporarily interrupts its execution. Resuming a request continues execution from the suspended state. When a request is suspended, it is linked to the suspend set of the queue and is put into the suspended state. Use the suspend request subcommand of qmgr(1M) to suspend a request, and the resume request subcommand to continue execution from the suspended state.
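
The bookkeeping described above can be sketched in Python as a minimal model. The class QueueModel and its attribute names are hypothetical; the sketch assumes that only an executing request can be suspended and only a suspended request can be resumed.

```python
class QueueModel:
    """Toy queue that tracks executing requests and the suspend set."""

    def __init__(self, running):
        self.running = set(running)
        self.suspend_set = set()

    def suspend(self, request_id):
        # Assumption: only a request that is executing can be suspended.
        if request_id not in self.running:
            raise ValueError(f"{request_id} is not executing")
        self.running.remove(request_id)
        self.suspend_set.add(request_id)

    def resume(self, request_id):
        # Assumption: only a request in the suspend set can be resumed.
        if request_id not in self.suspend_set:
            raise ValueError(f"{request_id} is not suspended")
        self.suspend_set.remove(request_id)
        self.running.add(request_id)


queue = QueueModel(running=["72.host1"])
queue.suspend("72.host1")
suspended_now = "72.host1" in queue.suspend_set
queue.resume("72.host1")
```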

The following procedure temporarily interrupts the execution of the request 72.host1 and puts it in a suspended state.

Example:

The following procedure resumes the execution of the request 72.host1.

Example:

6.5.4 Getting the Checkpoint for Restarting Requests

You can get the checkpoint of requests using the qmgr(1M) subcommand get checkpoint request. Execution of requests is not terminated when the checkpoint is retrieved. A request is restarted using the qmgr(1M) subcommand restart checkpoint request.

The following procedure gets the checkpoint of the request 72.host1.

Example:

If the NQS manager permits restart file names to be specified, you can specify one with the keyword restart_file.

Example:

If the specification is omitted, a restart file is created in the current directory with the name request-name.csequence-number. If specifying a name is not permitted, a restart file is automatically created under the directory /usr/spool/nqs/restart.

The following procedure restarts the request 72.host1 from its checkpoint.

Example:

6.5.5 Changing Request Attributes

The request attributes are specified when the user submits a request. An NQS manager can change the attributes of all requests. Attributes of a request can be forcibly altered in accordance with the NQS application specifications.

The qalter(1) command or the modify subcommand of the qmgr(1M) command can be used to change the request attributes. See the qalter(1) command in Chapter 3, User Commands, for more information about this command. The qmgr(1M) command has numerous modify subcommands that change different attributes. Use the subcommand that corresponds to the attribute you want to change.

In the following, the per-process file size limit for the request 72.host1 becomes 50 Kbytes.

Example:

In the following, the per-process CPU time limit for the request 72.host1 becomes 1234 seconds.

Example:

    # qmgr
    Mgr: modify request ppcpu_limit=(1234) 72.host1

6.5.6 Moving a Request

Moving a request transfers it from the queue where it is currently registered to another queue. NQS provides two ways to move requests. In the first, you specify a queue, and all requests registered in that queue are moved. In the second, requests are moved individually. Running requests, and requests that do not satisfy the limits defined for the destination queue, cannot be moved. These requests remain in the original queue.
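
The rule for which requests can be moved can be sketched in Python. This is a simplified illustration: the Request class, the single cpu_seconds requirement, and the move_all function are hypothetical, and a real queue defines several kinds of limits.

```python
from dataclasses import dataclass


@dataclass
class Request:
    request_id: str
    running: bool
    cpu_seconds: int   # hypothetical single resource requirement


def move_all(requests, dest_cpu_limit, dest_accepting=True):
    """Split requests into (moved, remaining) for a queue-wide move."""
    moved, remaining = [], []
    for r in requests:
        # A request moves only if the destination accepts requests, the
        # request is not running, and it fits within the destination limit.
        ok = dest_accepting and not r.running and r.cpu_seconds <= dest_cpu_limit
        (moved if ok else remaining).append(r)
    return moved, remaining


batch1 = [
    Request("70.host1", running=True, cpu_seconds=10),      # running: stays
    Request("71.host1", running=False, cpu_seconds=5000),   # over the limit: stays
    Request("72.host1", running=False, cpu_seconds=100),    # eligible: moves
]
moved, remaining = move_all(batch1, dest_cpu_limit=1234)
```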

In the following, all requests registered in the queue batch1 are moved to the queue called batch2. Requests that exceed the resource limits of batch2 and those that are being executed are not moved. No requests are transferred if batch2 is not accepting requests.

Example:

In the following, the request called 72.host1 is moved to the queue batch2.

Example:

6.6 CONFIRMING THE NQS STATUS

This section explains how to confirm the following.

6.6.1 Confirming the Queue Status

A queue status can be confirmed with the qstat(1) or the qstatq(1) command. You can also use the show queue or show long queue subcommand of qmgr(1M). The following illustrates use of the qmgr(1M) subcommand.

Example:

The following shows how to obtain more detailed information.

Example:

The execution result depends on the system on which the preceding is executed.

6.7 BATCH JOB ACCOUNT

NQS has functions for directing NQS batch job account information to a file. The following sections explain the functions related to the NQS batch job account.

6.7.1 Starting the NQS Batch Job Account Information Output

The output of NQS batch job account information does not begin with the start-up of NQS. It begins when the qmgr(1M) subcommand account on is used to initiate the output of information.

The following procedure outputs NQS batch job account information.

Example:

6.7.2 Terminating the NQS Batch Job Account Information Output

The qmgr(1M) subcommand account off is used to terminate the output of NQS batch job account information.

The following procedure terminates the output of NQS batch job account information.

Example:

6.7.3 Specifying the Output File of NQS Batch Job Account Information

NQS batch job account information is output to a specific file. The default file is /usr/adm/jobacct, but it can be changed by the set job_acct_file subcommand of qmgr(1M).

The following procedure sets the output file of NQS batch job account information to /usr/acct/nqsjacct.

Example:

6.7.4 Specifying the Output File of NQS Batch Job State Transition Log

The NQS batch job state transition log is output to a specific file. The default file is /usr/adm/jacct, but it can be changed by the set account_file subcommand of qmgr(1M).

The following procedure sets the output file of the NQS batch job state transition log to /usr/acct/nqsacct.

Example:

6.8 DATABASE REBUILDING

To maintain compatibility under SUPER-UX between databases created with the upgraded version of the NQS and databases created with previous versions, the NQS manager must save the old NQS data and rebuild the NQS database.

The following process saves data about the NQS environmental parameters, managers, users and groups with restricted NQS access, queues, devices, and NQS complexes. The process also creates the script file remakenqs, to which the qmgr(1M) subcommands are written.

Example:

After saving the old data, delete /usr/spool/nqs and construct a new NQS database using nqsmkdbs. Next, start nqsdaemon and issue the following command.

Example:

This process rebuilds the old environment on the new NQS database. Messages from qmgr may appear in the remake_log file; if the rebuild fails, check this file. Note that svnqsdbs cannot save data about NQS requests.

6.9 SNAP FUNCTION

You can use the snap function to save the contents of the NQS database, convert them to the subcommands of qmgr(1M), and write to the specified file. By using this function, you can restore your NQS database easily. For this function, use the snap file subcommand of qmgr(1M).

Example:

With this procedure, the contents of your NQS database are saved to the file /usr/nec/snapfile.

You can restore your NQS database by issuing qmgr(1M) with this file.

Example:

Note that you cannot save the data about NQS requests and restart files.

6.10 SAVING THE RESULT FILE

NQS outputs the results of a user request to a file specified by the request. If the file cannot be output, NQS tries to create the result file in the user's home directory on the machine that executed the request. If that does not work, the result file is saved under the /usr/spool/nqs/private/root/outfai directory. The names of files saved this way have the following format:

6.11 COMMAND EXECUTION IN QMGR(1M)

In the subcommand mode of qmgr(1M), general commands can be executed by adding "!" in front of them.

Example:

6.12 NQS INITIAL ENVIRONMENT CONFIGURATION FILE

You can create the /usr/lib/nqs/nqs.conf file as the NQS initial environment configuration file. This file is not provided by default. Create an nqs.conf file when your site needs to specify an NQS external process, the user level that can see all queues, the user level that can see all user information, or an extra operation of NQS. The owner of the file must be root and its mode must be 0644. NQS reads the nqs.conf file only once, when NQS starts. If you create or change the nqs.conf file while nqsdaemon is running, you must restart nqsdaemon.
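
The ownership and mode requirement can be checked with a short Python sketch before NQS is restarted. The function name conf_permissions_ok is hypothetical, and the demonstration below runs against a temporary file owned by the current user rather than the real /usr/lib/nqs/nqs.conf, so that it can be run without superuser privilege on a UNIX system.

```python
import os
import stat
import tempfile


def conf_permissions_ok(path, required_uid=0, required_mode=0o644):
    """Return True if path is owned by required_uid and has required_mode."""
    st = os.stat(path)
    # S_IMODE keeps only the permission bits of st_mode.
    return st.st_uid == required_uid and stat.S_IMODE(st.st_mode) == required_mode


# Demonstration on a temporary file owned by the current user.
with tempfile.NamedTemporaryFile() as tmp:
    os.chmod(tmp.name, 0o644)
    good = conf_permissions_ok(tmp.name, required_uid=os.getuid())
    os.chmod(tmp.name, 0o600)
    bad = conf_permissions_ok(tmp.name, required_uid=os.getuid())
```

For the real file, the call would be conf_permissions_ok("/usr/lib/nqs/nqs.conf") with the default arguments, which test for owner root (UID 0) and mode 0644.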

The following sections explain these specifications.

6.12.1 NQS External Process

An NQS external process is a process that is executed at the following points in the execution cycle of an NQS request.

You can create an NQS external process freely according to your own site environment. In the nqs.conf file, you can specify "when" you execute "which" process.

Example:

This specification executes /usr/local/bin/aaa every time a request begins.

6.12.2 User Level that Can See All NQS Requests

In the nqs.conf file, you can set a user level that can see all queues by using qstat(1), qstatq(1), and qmgr(1M), even if access to a queue is restricted. You can choose the user level from user, operator, and manager.

Example:

This specification allows all users to see all queue information by qstatq(1).

6.12.3 User Level that Can See All User Information

In the nqs.conf file, you can set a user level that can see all user information by using qstat(1), qstatr(1), and qstata(1). You can choose the user level from user, operator, and manager.

Example:

This specification allows all users to see all queue requests by qstatr(1).

6.12.4 Extra Operation

You can set an extra operation of NQS in the nqs.conf file. The following keywords can be used.

Example:
    # cat /usr/lib/nqs/nqs.conf
    extra_operation {
        double-word
        inform-begintime
        named-restartfile
    }
    #

6.13 JOB TRACKING

6.13.1 Tracking File

Information about each request submitted to a pipe queue on a machine is recorded in a file on that machine. This file is called a tracking file; it stores information that identifies where the request was sent, executed, and terminated.

When a command is issued for a request from the machine to which that request was submitted, the command automatically searches through the tracking file of that machine to find the machine in which the specified request is resident. This eliminates the need for the user to know where requests are sent.

Applicable commands: qalter, qcat, qdel, qhold, qmove, qrerun, qrls, qwait, qstatck, and qstatr

Specify the -t level option together with qstatck and qstatr.
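
The idea behind the tracking file can be sketched in Python as a simple lookup table. This is a conceptual model only: the actual format of the tracking file is not described here, and the names tracking, record_location, locate, and the machine name host2 are hypothetical.

```python
# Toy tracking table: request ID -> machine on which the request currently resides.
tracking = {}


def record_location(request_id, machine):
    """Record (or update) where a request is after it has been sent on."""
    tracking[request_id] = machine


def locate(request_id):
    """Return the machine holding the request, or None if it is unknown."""
    return tracking.get(request_id)


record_location("72.host1", "host1")   # submitted on host1
record_location("72.host1", "host2")   # transferred by a pipe queue to host2
```

A command issued on the submitting machine can then consult the table, here with locate("72.host1"), instead of asking the user where the request went.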

6.13.2 Information Retention Time of the Tracking File

The tracking file is also used by processes that wait for request execution and termination, such as qwait or a job network.

Information about a terminated request is retained for a specified time on the machine to which that request was submitted. This allows the user to identify the termination code of the request for some time after termination. Work that takes a long time to execute, such as a job network, requires a longer information retention time. The retention time must be longer than the time required by the waiting process.

The default is three days (259,200 seconds). Note that a longer retention time means a larger file size.
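
The default value can be reproduced with a short Python calculation, which may help when choosing a different retention time. The helper name retention_seconds is hypothetical, and the sketch assumes that the retention time is expressed in seconds, as the default value above suggests.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_seconds(days):
    """Convert a retention time in days to seconds."""
    return days * SECONDS_PER_DAY


default_retention = retention_seconds(3)   # the three-day default
one_week = retention_seconds(7)
```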

The information retention time can be specified with the following subcommand of qmgr.

Example:

6.13.3 Restoration of Machine Information

As a request is transferred from machine to machine, information may be lost if NQS is not operating on the machine to which that request was originally submitted. This can result in that machine having incorrect information about where the request is located.

To restore machine information, execute qstatr -t 3 after restarting the machine to which the request was originally submitted. This command finds the machine in which the request is resident and corrects the machine information.

6.14 REQUEST REVIVAL FUNCTION

6.14.1 Outline of Request Revival Function

Requests that encounter a hardware failure are sometimes deleted, even if a checkpoint has been obtained. Requests that encounter a critical error at restart time may also be deleted. The request revival function enables NQS to store the request information of a deleted request. The stored request is called a garbage request. When the cause of the hardware failure or the restart error has been removed, the garbage request can be restarted.

NQS manager or NQS operator privilege is required to operate on garbage requests.

6.14.2 How to Use the Request Revival Function

For compatibility with the conventional NQS function, the request revival function is disabled by default. To use this function, add the following keyword to the NQS initial environment configuration file nqs.conf(4) and restart NQS.

6.14.3 Conditions for Creating a Garbage Request

NQS stores a request that has a restart file as a garbage request when the request encounters one of the following problems.

CAUTION
  1. Before R10.1, when the request revival function is enabled, every request killed by a signal is saved as a garbage request. From R10.1 onward, NQS is notified when a hardware failure occurs. Therefore, only requests that encountered a hardware failure or failed to restart are saved as garbage requests.

  2. When a request that encountered a hardware failure is canceled by one of the following commands, it is not saved as a garbage request.

  3. If NQS cannot get the shutdown-checkpoint of a non-restartable request, the request is not saved as a garbage request. A shutdown-checkpoint cannot be obtained for either of the following reasons.

    • The request has an attribute set that prevents a checkpoint from being taken.
    • A fatal error occurs while getting the checkpoint.

6.14.4 Referencing Garbage Request Information

The show garbage subcommand of qmgr(1M) is used to refer to information on the garbage request.

The meaning of each column is as follows.

"HW CHECK(HW CHECK flag)" is displayed in EXIT STATUS when the request encounters a hardware failure, and "RESTART FAIL(detail error no.)" is displayed when restarting the request fails. The HW CHECK flag values are as follows.

HW check     Value
CPU check    0x0001
IXS check    0x0002

When both CPU and IXS failures occur, the logical sum (bitwise OR) of the values is displayed. When a request encounters both hardware and restart failures, the cause of the last failure is displayed.
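
Decoding the HW CHECK flag can be sketched in Python as follows. The flag values are those listed above; the dictionary HW_CHECK_FLAGS and the function decode_hw_check are hypothetical helper names, not part of NQS.

```python
# Flag bits as listed for the HW CHECK flag.
HW_CHECK_FLAGS = {
    0x0001: "CPU check",
    0x0002: "IXS check",
}


def decode_hw_check(flag):
    """Return the names of all failures whose bit is set in flag."""
    return [name for bit, name in HW_CHECK_FLAGS.items() if flag & bit]


# When both failures occur, the displayed value is the logical sum of the bits.
combined = 0x0001 | 0x0002
```

Here combined evaluates to 0x0003, and decode_hw_check(combined) reports both the CPU check and the IXS check.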

Refer to Section 6.5.5, Error Codes, of the System Administrator's Guide for the detailed error numbers.

The elapsed time is displayed in the form HH:MM:SS (HH = hours, MM = minutes, SS = seconds).
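
The HH:MM:SS form can be produced and parsed with a small Python sketch. The function names are hypothetical, and the sketch assumes that the hours field simply keeps growing past 24 rather than rolling over into days, which is not stated here.

```python
def format_elapsed(seconds):
    """Format a number of seconds as HH:MM:SS."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def parse_elapsed(text):
    """Convert an HH:MM:SS string back to a number of seconds."""
    hours, minutes, secs = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + secs
```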

By specifying a queue name as an argument of the show garbage subcommand, you can display information on only the garbage requests of the specified queue.

Only the NQS manager and NQS operator may execute the garbage request operation. Therefore, if a general user executes the show garbage subcommand, an operation error occurs.

6.14.5 Reviving a Garbage Request

The pick_up garbage request subcommand of qmgr(1M) is used to revive a garbage request. Specify the request ID of the garbage request to be revived as the argument. Only one request can be revived at a time. An error occurs if no request ID is specified or if the specified request ID does not exist as a garbage request.

The revived request is treated as an ordinary request. If reviving the request fails, it is saved as a garbage request again.

Only the NQS manager and NQS operator may execute the garbage request operation. Therefore, if a general user executes the pick_up garbage subcommand, an operation error occurs.

6.14.6 Deleting a Garbage Request

The clean_up garbage subcommand of qmgr(1M) is used to delete an unnecessary garbage request.

An unnecessary garbage request can be deleted in the following three ways.

The subcommand of qmgr(1M) used to delete the garbage request depends on the deletion unit. The correspondence of the deletion unit and the subcommand is as follows.

Deletion unit        Subcommand
Each request         clean_up garbage request
Each queue           clean_up garbage queue
The entire system    clean_up garbage system

To delete an individual garbage request, use the clean_up garbage request subcommand and specify the request ID of the garbage request to be deleted as the argument. Only one request can be specified at a time. For example, to delete the garbage request whose request ID is "4913.host1", execute the subcommand as follows.

An error occurs if no request ID is specified or if the specified request ID does not exist as a garbage request.

The clean_up garbage queue subcommand is used to delete all the garbage requests in a queue. Specify the name of the queue that contains the garbage requests to be deleted as the argument. Only one queue name can be specified at a time. For example, to delete the garbage requests of the queue named "batch1", execute the subcommand as follows.

An error occurs if no queue name is specified or if the specified queue does not exist.

You can also delete only the garbage requests for which a certain time has passed since they were saved, by specifying the elapsed time as the second argument. The unit is hours.

To delete all the garbage requests in the system, use the clean_up garbage system subcommand.

You can also delete only the garbage requests for which a certain time has passed since they were saved, by specifying the elapsed time as the second argument. The unit is hours.
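
The selection made by the queue-wide and system-wide deletions, with and without an elapsed time, can be sketched in Python. The function select_garbage and the tuple layout are hypothetical, the sample request IDs other than 4913.host1 are invented for illustration, and the sketch assumes that a request is selected when its elapsed time is equal to or greater than the specified number of hours.

```python
def select_garbage(garbage, queue=None, min_hours=None):
    """Return the request IDs that a clean-up would delete.

    garbage   -- iterable of (request_id, queue_name, elapsed_hours) tuples
    queue     -- restrict to one queue, or None for the entire system
    min_hours -- select only requests saved at least this many hours ago
    """
    selected = []
    for request_id, queue_name, elapsed_hours in garbage:
        if queue is not None and queue_name != queue:
            continue
        if min_hours is not None and elapsed_hours < min_hours:
            continue
        selected.append(request_id)
    return selected


garbage = [
    ("4913.host1", "batch1", 2),
    ("4920.host1", "batch1", 30),
    ("4931.host1", "batch2", 50),
]

old_in_batch1 = select_garbage(garbage, queue="batch1", min_hours=24)
old_in_system = select_garbage(garbage, min_hours=24)
```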

Only the NQS manager and NQS operator may execute the garbage request operation. Therefore, if a general user executes the clean_up garbage subcommand, an operation error occurs.

6.14.7 Notes on Using Request Revival Function
