Manpage of QSTAT
QSTAT
Section: Sun Grid Engine User Commands (1)
Updated: 2003/06/04 12:13:23
NAME
qstat - show the status of Sun Grid Engine jobs and queues
SYNTAX
qstat [ -alarm ] [ -ext ] [ -f ] [ -F [resource_name,...] ] [ -g d ] [ -help ]
[ -j [job_list] ] [ -l resource=val,... ] [ -ne ] [ -pe pe_name,... ]
[ -q queue,... ] [ -r ] [ -s {r|p|s|z|hu|ho|hs|hj|ha|h}[+] ] [ -t ]
[ -U user,... ] [ -u user,... ]
DESCRIPTION
qstat
shows the current status of the available Sun Grid Engine queues and the
jobs associated with the queues. Selection options allow you
to get information about specific jobs, queues or users.
Without any option
qstat
will display only a list of jobs with no queue status
information.
OPTIONS
- -alarm
-
Displays the reason(s) for queue alarm states. Outputs one line per reason containing
the resource value and threshold. For details about the resource value please
refer to the description of the Full Format in section OUTPUT FORMATS below.
- -ext
-
This option is only supported in case of a Sun Grid Engine, Enterprise Edition system. It is not available
for Sun Grid Engine systems.
Displays additional Sun Grid Engine, Enterprise Edition relevant information for each job (see OUTPUT
FORMATS below).
- -f
-
Specifies a "full" format display of information.
The -f option causes summary
information on all queues to be displayed along with the
queued job list.
- -F [ resource_name,... ]
-
As with -f, information is displayed on all jobs as well as
queues. In addition,
qstat
will present a detailed listing of the current
resource availability per queue with respect to all resources (if the option
argument is omitted) or with respect to those resources contained in the
resource_name list. Please refer to the description of the
Full Format in
section OUTPUT FORMATS below for further detail.
- -g d
-
Displays array jobs verbosely in a one line per job task fashion. By
default, array jobs are grouped and all tasks with the same status (for
pending tasks only) are displayed in a single line. The array job task
id range field in the output (see section
OUTPUT FORMATS) specifies the
corresponding set of tasks.
The -g switch currently has only the single option argument
d. Other option arguments are reserved for future extensions.
- -help
-
Prints a listing of all options.
- -j [job_list]
-
Prints the reason for not being scheduled, either for all pending jobs or
for the jobs contained in job_list.
- -l resource[=value],...
-
Defines the resources required by the jobs or granted
by the queues on which information is requested.
Matching is performed on queues. The pending jobs are
restricted to jobs that might run in one of the above
queues.
- -ne
-
In combination with -f, this option suppresses the display of empty
queues. That is, queues in which no jobs are running are not
displayed.
- -pe pe_name,...
-
Displays status information with respect to queues which are attached to
at least one of the parallel environments listed in the comma-separated
option argument. Status information for jobs is displayed either for those
which execute in one of the selected queues or which are pending and
might get scheduled to those queues in principle.
- -q queue,...
-
Specifies the queues for which job
information is to be displayed.
- -r
-
Prints extended information about the resource requirements
of the displayed jobs. Please refer to the OUTPUT FORMATS
sub-section Expanded Format below for detailed information.
- -s {p|r|s|z|hu|ho|hs|hj|ha|h}[+]
-
Prints only jobs in the specified state, any combination of states is
possible. -s prs corresponds to the regular
qstat
output without -s
at all. To show recently finished jobs, use -s z.
To display jobs in user/operator/system hold,
use the -s hu/ho/hs
option. The
-s ha option shows jobs which were
submitted with the
qsub
-a command.
qstat
-s hj
displays all jobs which are not eligible for execution unless the job
has entries in the job dependency list (see the -a
and -hold_jid options to qsub).
- -t
-
Prints extended information about the controlled sub-tasks
of the displayed parallel jobs. Please refer to the OUTPUT FORMATS
sub-section Expanded Format below for detailed information. Sub-tasks
of parallel jobs should not be confused with array job tasks (see -g
option above and -t option to
- -U user,...
-
Displays status information with respect to queues to which the specified
users have access. Status information for jobs is displayed either for those
which execute in one of the selected queues or which are pending and
might get scheduled to those queues in principle.
- -u user,...
-
Display information only on those jobs and queues
being associated with the users from the given user list.
Queue status information is displayed if the -f or -F
options are specified additionally and if the user runs
jobs in those queues.
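The options above can be combined on one command line. The following is a minimal sketch, assuming a hypothetical helper named build_qstat_args that only assembles an argument list and does not run qstat. It uses the -f, -ne, -q, -u and -s options as described above. Joining state tokens by plain concatenation follows the "-s prs" example, and rejecting -ne without -f is an assumption of this sketch, not documented qstat behavior.

```python
# States accepted by -s, as listed in the OPTIONS section.
VALID_STATES = {"p", "r", "s", "z", "hu", "ho", "hs", "hj", "ha", "h"}


def build_qstat_args(full=False, no_empty=False, queues=(), users=(), states=()):
    """Return an argument list for qstat (hypothetical helper, not part of SGE)."""
    args = ["qstat"]
    if full:
        args.append("-f")
    if no_empty:
        # -ne is only described in combination with -f; this sketch enforces that.
        if not full:
            raise ValueError("-ne is only described in combination with -f")
        args.append("-ne")
    if queues:
        args += ["-q", ",".join(queues)]
    if users:
        args += ["-u", ",".join(users)]
    if states:
        unknown = [s for s in states if s not in VALID_STATES]
        if unknown:
            raise ValueError("unknown job state(s): " + ", ".join(unknown))
        # State tokens are concatenated, as in the "-s prs" example.
        args += ["-s", "".join(states)]
    return args


if __name__ == "__main__":
    # "alice" is a placeholder user name.
    print(build_qstat_args(full=True, users=["alice"], states=["p", "r", "s"]))
```

The resulting list can be passed to a process-spawning call such as subprocess.run on a host where qstat is installed.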
OUTPUT FORMATS
Depending on the presence or absence of the -alarm, -f or -F,
-r and -t options, three output formats need to be distinguished.
In case of a Sun Grid Engine, Enterprise Edition system, the -ext option may be used to display
additional information for each job.
Reduced Format (without -f and -F)
Following the header line a line is printed for each job
consisting of
- *
-
the job ID.
- *
-
the priority of the jobs as assigned to them via the -p
option to
or
determining the order of the pending jobs list.
- *
-
the name of the job.
- *
-
the user name of the job owner.
- *
-
the status of the job - one of d(eletion), t(ransferring),
r(unning), R(estarted), s(uspended), S(uspended), T(hreshold), w(aiting) or
h(old).
The state d(eletion) indicates that a
has been used to initiate job deletion.
The states t(ransferring) and r(unning) indicate that a job is about to
be executed or is already executing, whereas the states s(uspended),
S(uspended) and T(hreshold) show that an already running job has been
suspended. The s(uspended) state is caused by suspending the job via the
command, the S(uspended) state indicates that the queue containing the job
is suspended and therefore the job is also suspended and the T(hreshold)
state shows that at least one suspend threshold of the corresponding queue
was exceeded (see
and that the job has been suspended as a consequence. The state R(estarted)
indicates that the job was restarted. This can be caused by a job migration or
because of one of the reasons described in the -r section of the
command.
The states w(aiting) and h(old) only appear for pending jobs. The h(old)
state indicates that a job currently is not eligible for execution due to
a hold state assigned to it via
or the
-h option or that the job is waiting for completion of the jobs
to which job dependencies have been assigned to the job via the
-hold_jid option of
or
- *
-
the submission or start time and date of the job.
- *
-
the queue the job is assigned to (for running or suspended
jobs only).
- *
-
the function of the running jobs (MASTER or SLAVE - the latter
for parallel jobs only).
- *
-
the array job task id. Will be empty for non-array jobs. See the
-t option to
and the -g above for additional information.
If the -t option is supplied, each job status line also contains
- *
-
the parallel task ID (do not confuse parallel tasks with array job tasks),
- *
-
the status of the parallel task - one of
r(unning), R(estarted), s(uspended), S(uspended), T(hreshold), w(aiting),
h(old), or x(exited).
- *
-
the cpu, memory, and I/O usage (Sun Grid Engine, Enterprise Edition only),
- *
-
the exit status of the parallel task,
- *
-
and the failure code and message for the parallel task.
Full Format (with -f and -F)
Following the header line a section for each queue separated
by a horizontal line is provided. For each queue the information
printed consists of
- *
-
the queue name,
- *
-
the queue type - one of B(atch), I(nteractive), C(heckpointing),
P(arallel), T(ransfer) or combinations thereof,
- *
-
the number of used and available job slots,
- *
-
the load average of the queue host,
- *
-
the architecture of the queue host and
- *
-
the state of the queue - one of
u(nknown) if the corresponding
cannot be contacted, a(larm), A(larm), C(alendar suspended), s(uspended),
S(ubordinate), d(isabled), D(isabled), E(rror) or
combinations thereof.
If the state is a(larm) at least one of the load thresholds defined in the
load_thresholds list of the queue configuration (see
is
currently exceeded, which prevents further jobs from being scheduled to that
queue.
As opposed to this, the state A(larm) indicates that at least one of the
suspend thresholds of the queue (see
is currently exceeded. This will result in jobs running in that queue being
successively suspended until no threshold is violated.
The states s(uspended) and d(isabled) can be assigned to queues and
released via the
command. Suspending a queue will cause all jobs executing in that queue to
be suspended.
The states D(isabled) and C(alendar suspended) indicate that the queue
has been disabled or suspended automatically via the calendar facility of
Sun Grid Engine (see
while the S(ubordinate) state
indicates that the queue has been suspended via subordination to another
queue (see
for details). When suspending a queue
(regardless of the cause) all jobs executing in that queue are suspended
too.
If an E(rror) state is displayed for a queue,
on that host was unable to locate the
executable
on that host in order to start a job. Please check the
error logfile of that
for leads on how to resolve the problem. Please enable the
queue afterwards via the -c option of the
command manually.
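The queue state letters described above can be expanded mechanically. The following is a minimal sketch, assuming that combined states appear as a plain concatenation of the single letters (for example "au"). The function name decode_queue_state and the wording of the descriptions are illustrative only.

```python
# Queue state letters as described in the Full Format section.
QUEUE_STATES = {
    "u": "unknown",
    "a": "alarm (load threshold exceeded)",
    "A": "alarm (suspend threshold exceeded)",
    "C": "calendar suspended",
    "s": "suspended",
    "S": "subordinate (suspended via subordination)",
    "d": "disabled",
    "D": "disabled (via calendar)",
    "E": "error",
}


def decode_queue_state(state):
    """Expand a queue state string into one description per letter."""
    descriptions = []
    for letter in state:
        if letter not in QUEUE_STATES:
            raise ValueError("unknown queue state letter: %r" % letter)
        descriptions.append(QUEUE_STATES[letter])
    return descriptions


if __name__ == "__main__":
    for text in decode_queue_state("au"):
        print(text)
```

Note that the letters are case sensitive: "a" and "A", "s" and "S", and "d" and "D" denote different states.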
If the -F option was used, resource availability information is printed
following the queue status line. For each resource (as selected in an option
argument to -F or for all resources if the option argument was
omitted) a single line is displayed with the following format:
- *
-
a one letter specifier indicating whether the current resource availability
value was dominated by either
`g' - a cluster global,
`h' - a host total or
`q' - a queue related resource consumption.
- *
-
a second one letter specifier indicating the source for the current resource
availability value, being one of
`l' - a load value reported for the
resource,
`L' - a load value for the resource after administrator
defined load scaling has been applied,
`c' - availability derived from
the consumable resources facility (see
`v' - a default complexes configuration value
never overwritten by a load report or a consumable update or
`f' - a fixed
availability definition derived from a non-consumable complex attribute or
a fixed resource limit.
- *
-
after a colon the name of the resource on which information is displayed.
- *
-
after an equal sign the current resource availability value.
The displayed availability values and the sources from which they derive are
always the minimum values of all possible combinations. Hence, for example,
a line of the form "qf:h_vmem=4G" indicates that a queue currently has a
maximum availability in virtual memory of 4 Gigabyte, where this value is a
fixed value (e.g. a resource limit in the queue configuration) and it is queue
dominated, i.e. the host in total may have more virtual memory available than
this, but the queue doesn't allow for more. In contrast, a line "hl:h_vmem=4G"
would also indicate an upper bound of 4 Gigabyte virtual memory
availability, but the limit would be derived from a load value currently
reported for the host. So while the queue might allow for jobs with higher
virtual memory requirements, the host on which this particular queue resides
currently only has 4 Gigabyte available.
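A resource availability line such as "qf:h_vmem=4G" can be split into its four parts as follows. This is a minimal sketch based only on the format described above. The names parse_availability and Availability are hypothetical, surrounding whitespace is assumed to be insignificant, and the value is kept as an uninterpreted string.

```python
from typing import NamedTuple

# First letter: what dominates the availability value.
DOMINANCE = {"g": "cluster global", "h": "host total", "q": "queue"}
# Second letter: where the availability value comes from.
SOURCE = {
    "l": "load value",
    "L": "scaled load value",
    "c": "consumable resource",
    "v": "default complex value",
    "f": "fixed value",
}


class Availability(NamedTuple):
    dominance: str
    source: str
    name: str
    value: str


def parse_availability(line):
    """Parse a line of the form '<dominance><source>:<name>=<value>'."""
    spec, colon, rest = line.strip().partition(":")
    name, equals, value = rest.partition("=")
    if len(spec) != 2 or not colon or not equals or not name:
        raise ValueError("not a resource availability line: %r" % line)
    dominance, source = spec
    if dominance not in DOMINANCE or source not in SOURCE:
        raise ValueError("unknown specifier %r in %r" % (spec, line))
    return Availability(DOMINANCE[dominance], SOURCE[source], name, value)


if __name__ == "__main__":
    print(parse_availability("qf:h_vmem=4G"))
    print(parse_availability("hl:h_vmem=4G"))
```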
If the -alarm option was used, information is displayed about resources that
violate load or suspend thresholds.
The same format as with the -F option is used, with the following extensions:
- *
-
the line starts with the keyword `alarm'
- *
-
appended to the resource value is the type and value of the appropriate threshold
After the queue status line (in case of -f) or the resource
availability information (in case of -F) a single line is printed
for each job running currently in this queue. Each job status
line contains
- *
-
the job ID,
- *
-
the job name,
- *
-
the job owner name,
- *
-
the status of the job - one of t(ransferring),
r(unning), R(estarted), s(uspended), S(uspended) or T(hreshold) (see the
Reduced Format section for detailed information),
- *
-
the start date and time and the function of the job (MASTER
or SLAVE - only meaningful in case of a parallel job) and
- *
-
the priority of the jobs.
If the -t option is supplied, each job status line also contains
- *
-
the task ID,
- *
-
the status of the task - one of
r(unning), R(estarted), s(uspended), S(uspended), T(hreshold), w(aiting),
h(old), or x(exited) (see the
Reduced Format section for detailed information),
- *
-
the cpu, memory, and I/O usage (Sun Grid Engine, Enterprise Edition only),
- *
-
the exit status of the task,
- *
-
and the failure code and message for the task.
Following the list of queue sections a PENDING JOBS list may
be printed if jobs are waiting to be assigned to a queue.
A status line similar to the one for running jobs is displayed
for each waiting job. The differences are that the status
for the jobs is w(aiting) or h(old), that the submit time and date
is shown instead of the start time and that no function
is displayed for the jobs.
In very rare cases, e.g. if
starts up from an inconsistent state in the job or queue spool
files or if the clean queue (-cq) option of
is used,
qstat
cannot assign jobs to either the running or pending jobs section
of the output. In this case a job status inconsistency (e.g. a
job has a running status but is not assigned to a queue) has been
detected. Such jobs are printed in an ERROR JOBS section at the
very end of the output. The ERROR JOBS section should disappear
upon restart of
Please contact your Sun Grid Engine support representative if you feel
uncertain about the cause or effects of such jobs.
Expanded Format (with -r)
If the -r option was specified together with qstat,
the following information for each displayed job is printed (a single line
for each of the following job characteristics):
- *
-
The job and master queue name.
- *
-
The hard and soft resource requirements of the job as specified
with the
-l option.
- *
-
The requested parallel environment including the
desired queue slot range (see -pe option of
- *
-
The requested checkpointing environment of the job (see the
-ckpt option).
- *
-
In case of running jobs, the granted
parallel environment with the granted number of queue slots.
Enhanced Sun Grid Engine, Enterprise Edition Output (with -ext)
For each job the following additional items are displayed:
- project
-
The project to which the job is assigned as specified in the
-P option.
- department
-
The department to which the user belongs (use the -sul and
-su options of
to display the current department definitions).
- deadline
-
The deadline initiation time of the job as specified with the
-dl option.
- cpu
-
The current accumulated CPU usage of the job.
- mem
-
The current accumulated memory usage of the job.
- io
-
The current accumulated IO usage of the job.
- tckts
-
The total number of tickets assigned to the job currently.
- ovrts
-
The override tickets as assigned by the -ot option of
- otckt
-
The override portion of the total number of tickets assigned to the
job currently.
- dtckt
-
The deadline portion of the total number of tickets assigned to the
job currently.
- ftckt
-
The functional portion of the total number of tickets assigned to the
job currently.
- stckt
-
The share portion of the total number of tickets assigned to the
job currently.
- share
-
The share of the total system to which the job is entitled currently.
ENVIRONMENTAL VARIABLES
- SGE_ROOT
-
Specifies the location of the Sun Grid Engine standard configuration
files.
- SGE_CELL
-
If set, specifies the default Sun Grid Engine cell. To address a Sun Grid Engine
cell
qstat
uses (in the order of precedence):
-
-
The name of the cell specified in the environment
variable SGE_CELL, if it is set.
The name of the default cell, i.e. default.
- SGE_DEBUG_LEVEL
-
If set, specifies that debug information
should be written to stderr. In addition the level of
detail in which debug information is generated is defined.
- COMMD_PORT
-
If set, specifies the TCP port on which
is expected to listen for communication requests.
Most installations will use a services map entry instead
to define that port.
- SGE_LONG_QNAMES
-
If set, all queue names will be displayed in their full length.
- COMMD_HOST
-
If set, specifies the host on which the particular
to be used for Sun Grid Engine communication of the
qstat
client resides.
By default the local host is used.
FILES
<sge_root>/<cell>/common/act_qmaster
Sun Grid Engine master host file
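The location of this file can be derived from the environment variables described above. The following is a minimal sketch, assuming that <sge_root> corresponds to the value of SGE_ROOT and that <cell> is resolved as described for SGE_CELL (the variable if set, otherwise "default"). The helper name act_qmaster_path and the example root /opt/sge are hypothetical.

```python
import os
import posixpath


def act_qmaster_path(environ=None):
    """Build <sge_root>/<cell>/common/act_qmaster from the environment."""
    if environ is None:
        environ = os.environ
    sge_root = environ.get("SGE_ROOT")
    if not sge_root:
        raise KeyError("SGE_ROOT is not set")
    # SGE_CELL takes precedence if set; otherwise the cell named "default" is used.
    cell = environ.get("SGE_CELL") or "default"
    return posixpath.join(sge_root, cell, "common", "act_qmaster")


if __name__ == "__main__":
    # /opt/sge is a placeholder installation directory.
    print(act_qmaster_path({"SGE_ROOT": "/opt/sge"}))
```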
SEE ALSO
COPYRIGHT
See
for a full statement of rights and permissions.