Use the oclumon dumpnodeview command to view log information from the system monitor service in the form of a node view.
A node view is a collection of all metrics collected by Cluster Health Monitor (CHM) for a node at a point in time. CHM attempts to collect metrics every five seconds on every node. Some metrics are static, while others are dynamic.
A node view consists of eight views when you display verbose output:
SYSTEM: Lists system metrics such as CPU COUNT, CPU USAGE, and MEM USAGE
TOP CONSUMERS: Lists the top consuming processes in the following format:
metric_name: 'process_name(process_identifier) utilization'
PROCESSES: Lists process metrics such as PID, name, number of threads, memory usage, and number of file descriptors
DEVICES: Lists device metrics such as disk read and write rates, queue length, and wait time per I/O
NICS: Lists network interface card metrics such as network receive and send rates, effective bandwidth, and error rates
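FILESYSTEMS: Lists file system metrics, such as total, used, and available space
PROTOCOL ERRORS: Lists IP, TCP, and UDP protocol error counts
CPUS: Lists per-CPU metrics, such as system, user, and I/O wait usage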
You can generate a summary report that contains only the SYSTEM and TOP CONSUMERS views.
"Metric Descriptions" lists descriptions for all the metrics associated with each of the views in the preceding list.
Note:
Metrics displayed in the TOP CONSUMERS view are described in Table J-4.
Example J-1 shows a sample node view.
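The TOP CONSUMERS entry format described above can be split into its parts with a regular expression. The following is a minimal sketch, not part of oclumon itself: the function name `parse_top_consumers` is hypothetical, and the input text is copied from the TOP CONSUMERS view in Example J-1. It assumes that every entry follows the `metric_name: 'process_name(process_identifier) utilization'` layout exactly.

```python
import re

# Matches one entry such as: topcpu: 'osysmond.bin(30981) 2.40'
ENTRY_RE = re.compile(r"(\w+):\s*'(.+?)\((\d+)\)\s+(\S+)'")


def parse_top_consumers(text):
    """Return {metric_name: (process_name, pid, utilization)} for each entry found."""
    result = {}
    for metric, name, pid, value in ENTRY_RE.findall(text):
        result[metric] = (name, int(pid), float(value))
    return result


# Text copied from the TOP CONSUMERS view in Example J-1.
sample = (
    "topcpu: 'osysmond.bin(30981) 2.40' topprivmem: 'oraagent.bin(14599) 682496'\n"
    "topshm: 'ora_dbw2_oss_3(7049) 2156136' topfd: 'ocssd.bin(29986) 274'\n"
    "topthread: 'java(32255) 53'"
)
top = parse_top_consumers(sample)
print(top["topcpu"])  # ('osysmond.bin', 30981, 2.4)
```

The utilization is returned as a float for every metric, because its unit depends on the metric (a percentage for topcpu, KB for topprivmem, a count for topfd).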
Syntax
oclumon dumpnodeview [[-allnodes] | [-n node1 node2 noden] [-last "duration"] | [-s "time_stamp" -e "time_stamp"] [-i interval] [-v]] [-h]
Parameters
Table J-2 oclumon dumpnodeview Command Parameters
Parameter | Description |
---|---|
-allnodes | Use this option to dump the node views of all the nodes in the cluster. |
-n node1 node2 | Specify one node (or several nodes in a space-delimited list) for which you want to dump the node view. |
-last "duration" | Use this option to specify a time, given in HH24:MM:SS format surrounded by double quotation marks (for example, "23:05:00"), to retrieve the metrics collected over that most recent period. |
-s "time_stamp" -e "time_stamp" | Use the -s option to specify the time stamp that starts a range and the -e option to specify the time stamp that ends it, each surrounded by double quotation marks (for example, "2011-05-10 23:05:00"). Note: You must specify these two options together to obtain a range. |
-i interval | Specify a collection interval, in five-second increments. |
-v | Displays verbose node view output. |
-h | Displays online help for the oclumon dumpnodeview command. |
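When the command is called from a script, it can help to check the option rules before running it. The following is a minimal sketch under stated assumptions: the helper `build_dumpnodeview_args` is hypothetical, and the exclusivity checks (-allnodes versus -n, and -last versus -s/-e) are one reading of the `|` separators in the syntax line, not confirmed behavior. The two rules taken directly from the parameter table are that -s and -e must appear together and that the interval is given in five-second increments.

```python
def build_dumpnodeview_args(nodes=None, allnodes=False, last=None,
                            start=None, end=None, interval=None, verbose=False):
    """Build an argument list for oclumon dumpnodeview (hypothetical helper)."""
    # Assumption: the "|" in the syntax line makes these alternatives exclusive.
    if allnodes and nodes:
        raise ValueError("use either -allnodes or -n, not both")
    if last is not None and (start is not None or end is not None):
        raise ValueError("use either -last or -s/-e, not both")
    # From the parameter table: -s and -e must be specified together.
    if (start is None) != (end is None):
        raise ValueError("-s and -e must be specified together")
    # From the parameter table: the interval is in five-second increments.
    if interval is not None and (interval <= 0 or interval % 5 != 0):
        raise ValueError("interval must be a positive multiple of 5 seconds")

    args = ["oclumon", "dumpnodeview"]
    if allnodes:
        args.append("-allnodes")
    elif nodes:
        args += ["-n", *nodes]
    if last is not None:
        args += ["-last", last]
    if start is not None:
        args += ["-s", start, "-e", end]
    if interval is not None:
        args += ["-i", str(interval)]
    if verbose:
        args.append("-v")
    return args


print(build_dumpnodeview_args(allnodes=True, last="00:15:00", interval=30))
```

Each value is one list element, so the double quotation marks that the shell needs around a duration or time stamp are not part of the argument itself when the list is passed to a function such as `subprocess.run` without a shell.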
Usage Notes
In certain circumstances, data can be delayed for some time before it is replayed by this command. For example, the crsctl stop cluster -all command can cause data delay. After running crsctl start cluster -all, it may take several minutes before oclumon dumpnodeview shows any data collected during the interval.
The default is to continuously dump node views. To stop continuous display, use Ctrl+C on Linux and Windows.
Both the local system monitor service (osysmond) and the cluster logger service (ologgerd) must be running to obtain node view dumps.
Examples
The following example dumps node views from node1, node2, and node3 collected over the last twelve hours:
$ oclumon dumpnodeview -n node1 node2 node3 -last "12:00:00"
The following example displays node views from all nodes collected over the last fifteen minutes at a 30-second interval:
$ oclumon dumpnodeview -allnodes -last "00:15:00" -i 30
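The durations in the examples above ("12:00:00" for twelve hours, "00:15:00" for fifteen minutes) can be produced from a time span rather than typed by hand. The following is a small sketch; the function name `format_duration` is hypothetical, and rejecting spans of 24 hours or more is an assumption based on the HH24 notation, not a documented limit.

```python
from datetime import timedelta


def format_duration(delta):
    """Format a timedelta as HH:MM:SS for use with the -last option."""
    total = int(delta.total_seconds())
    if total <= 0:
        raise ValueError("duration must be positive")
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Assumption: HH24 implies an hour field between 00 and 23.
    if hours > 23:
        raise ValueError("duration must be shorter than 24 hours")
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(format_duration(timedelta(hours=12)))    # 12:00:00
print(format_duration(timedelta(minutes=15)))  # 00:15:00
```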
Metric Descriptions
This section describes the metrics for the views that make up a node view, as listed in the following seven tables.
Table J-3 SYSTEM View Metric Descriptions
Metric | Description |
---|---|
#pcpus | The number of physical CPUs |
#vcpus | Number of logical compute units |
cpuht | CPU hyperthreading enabled (Y) or disabled (N) |
chipname | The name of the CPU vendor |
cpu | Average CPU utilization per processing unit within the current sample interval (%) |
cpuq | Number of processes waiting in the run queue within the current sample interval |
physmemfree | Amount of free RAM (KB) |
physmemtotal | Amount of total usable RAM (KB) |
mcache | Amount of physical RAM used for file buffers plus the amount of physical RAM used as cache memory (KB). On Windows systems, this is the number of bytes currently being used by the file system cache. Note: This metric is not available on Solaris. |
swapfree | Amount of swap memory free (KB) |
swaptotal | Total amount of physical swap memory (KB) |
hugepagetotal | Total size of huge pages (KB). Note: This metric is not available on Solaris or Windows systems. |
hugepagefree | Free size of huge pages (KB). Note: This metric is not available on Solaris or Windows systems. |
hugepagesize | Smallest unit size of a huge page. Note: This metric is not available on Solaris or Windows systems. |
ior | Average total disk read rate within the current sample interval (KB per second) |
iow | Average total disk write rate within the current sample interval (KB per second) |
ios | Average time to serve an I/O request |
swpin | Average swap in rate within the current sample interval (KB per second). Note: This metric is not available on Windows systems. |
swpout | Average swap out rate within the current sample interval (KB per second). Note: This metric is not available on Windows systems. |
pgin | Average page in rate within the current sample interval (pages per second) |
pgout | Average page out rate within the current sample interval (pages per second) |
netr | Average total network receive rate within the current sample interval (KB per second) |
netw | Average total network send rate within the current sample interval (KB per second) |
procs | Number of processes |
procsoncpu | The current number of processes running on the CPU |
rtprocs | Number of real-time processes |
rtprocsoncpu | The current number of real-time processes running on the CPU |
#fds | Number of open file descriptors, or number of open handles on Windows |
#sysfdlimit | System limit on number of file descriptors. Note: This metric is not available on either Solaris or Windows systems. |
#disks | Number of disks |
#nics | Number of network interface cards |
nicErrors | Average total network error rate within the current sample interval (errors per second) |
Table J-4 PROCESSES View Metric Descriptions
Metric | Description |
---|---|
name | The name of the process executable |
pid | The process identifier assigned by the operating system |
#procfdlimit | Limit on number of file descriptors for this process. Note: This metric is not available on Windows, AIX, and HP-UX systems. |
cpuusage | Process CPU utilization (%). Note: The utilization value can be up to 100 times the number of processing units. |
privmem | Process private memory usage (KB) |
shm | Process shared memory usage (KB). Note: This metric is not available on Windows, Solaris, and AIX systems. |
workingset | Working set of a program (KB). Note: This metric is only available on Windows. |
#fd | Number of file descriptors open by this process, or number of open handles by this process on Windows |
#threads | Number of threads created by this process |
priority | The process priority |
nice | The nice value of the process. Note: This metric is not applicable to Windows systems. |
state | The state of the process. Note: This metric is not applicable to Windows systems. |
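Because cpuusage can be as high as 100 times the number of processing units, a value is easier to compare across nodes once it is scaled to the whole node. The following is a rough sketch of that arithmetic. The function names are hypothetical, and it assumes that "processing units" corresponds to the #vcpus metric of the SYSTEM view, which the tables do not state explicitly. The input values (cpuusage 2.40, #vcpus 24) come from Example J-1.

```python
def max_cpuusage(vcpus):
    """Upper bound of the cpuusage metric: 100 percent per processing unit."""
    return 100 * vcpus


def share_of_node(cpuusage, vcpus):
    """Scale a process's cpuusage to a percentage of the whole node's capacity."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return cpuusage / vcpus


# Values taken from Example J-1: osysmond.bin reports cpuusage 2.40 on 24 vcpus.
print(max_cpuusage(24))                      # 2400
print(f"{share_of_node(2.40, 24):.2f}")      # 0.10
```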
Table J-5 DEVICES View Metric Descriptions
Metric | Description |
---|---|
ior | Average disk read rate within the current sample interval (KB per second) |
iow | Average disk write rate within the current sample interval (KB per second) |
ios | Average disk I/O operation rate within the current sample interval (I/O operations per second) |
qlen | Number of I/O requests in wait state within the current sample interval |
wait | Average wait time per I/O within the current sample interval (msec) |
type | If applicable, identifies what the device is used for (for example, SYS in Example J-1) |
Table J-6 NICS View Metric Descriptions
Metric | Description |
---|---|
netrr | Average network receive rate within the current sample interval (KB per second) |
netwr | Average network send rate within the current sample interval (KB per second) |
neteff | Average effective bandwidth within the current sample interval (KB per second) |
nicerrors | Average error rate within the current sample interval (errors per second) |
pktsin | Average incoming packet rate within the current sample interval (packets per second) |
pktsout | Average outgoing packet rate within the current sample interval (packets per second) |
errsin | Average error rate for incoming packets within the current sample interval (errors per second) |
errsout | Average error rate for outgoing packets within the current sample interval (errors per second) |
indiscarded | Average drop rate for incoming packets within the current sample interval (packets per second) |
outdiscarded | Average drop rate for outgoing packets within the current sample interval (packets per second) |
inunicast | Average packet receive rate for unicast within the current sample interval (packets per second) |
type | Whether PUBLIC or PRIVATE |
innonunicast | Average packet receive rate for multicast (packets per second) |
latency | Estimated latency for this network interface card (msec) |
Table J-7 FILESYSTEMS View Metric Descriptions
Metric | Description |
---|---|
total | Total amount of space (KB) |
mount | Mount point |
type | File system type, whether local file system, NFS, or other |
used | Amount of used space (KB) |
available | Amount of available space (KB) |
used% | Percentage of used space (%) |
ifree% | Percentage of free file nodes (%). Note: This metric is not available on Windows systems. |
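The used% metric can be cross-checked against the used and total values of the same file system. The following is a minimal sketch that assumes used% is used space divided by total space, rounded to a whole number. The exact formula that CHM applies is not stated in the table, so treat this as a plausibility check only. The function name is hypothetical and the input values come from the FILESYSTEMS view in Example J-1.

```python
def used_percent(used_kb, total_kb):
    """Used space as a percentage of total space (assumed formula)."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * used_kb / total_kb


# Values from Example J-1: total 563657948 KB, used 78592012 KB, reported used% 14.
print(round(used_percent(78592012, 563657948)))  # 14
```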
Table J-8 PROTOCOL ERRORS View Metric Descriptions
Metric | Description |
---|---|
IPHdrErr | Number of input datagrams discarded due to errors in their IPv4 headers |
IPAddrErr | Number of input datagrams discarded because the IPv4 address in their IPv4 header's destination field was not a valid address to be received at this entity |
IPUnkProto | Number of locally-addressed datagrams received successfully but discarded because of an unknown or unsupported protocol |
IPReasFail | Number of failures detected by the IPv4 reassembly algorithm |
IPFragFail | Number of IPv4 discarded datagrams due to fragmentation failures |
TCPFailedConn | Number of times that TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times that TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state |
TCPEstRst | Number of times that TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state |
TCPRetraSeg | Total number of TCP segments retransmitted |
UDPUnkPort | Total number of received UDP datagrams for which there was no application at the destination port |
UDPRcvErr | Number of received UDP datagrams that could not be delivered for reasons other than the lack of an application at the destination port |
Table J-9 CPUS View Metric Descriptions
Metric | Description |
---|---|
cpuid | Virtual CPU |
sys-usage | CPU usage in system space |
user-usage | CPU usage in user space |
nice | Value of nice for a specific CPU |
usage | CPU usage for a specific CPU |
iowait | CPU wait time for I/O operations |
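In the CPUS view of Example J-1, each line starts with a CPU label followed by name-value fields joined by a hyphen, such as sys-2.93. The following is a minimal sketch of reading one such line. The function name `parse_cpu_line` is hypothetical, and it assumes that every field has the `name-value` shape seen in that one sample line. The field names are kept as printed (sys, user), which differ slightly from the metric names in the table (sys-usage, user-usage).

```python
def parse_cpu_line(line):
    """Split a CPUS view line into (cpu_label, {field_name: value})."""
    label, *fields = line.split()
    cpu_label = label.rstrip(":")
    metrics = {}
    for field in fields:
        # Split on the first hyphen only: "sys-2.93" -> ("sys", "2.93").
        name, _, value = field.partition("-")
        metrics[name] = float(value)
    return cpu_label, metrics


# Line copied from the CPUS view in Example J-1.
cpu_label, metrics = parse_cpu_line(
    "cpu18: sys-2.93 user-2.15 nice-0.0 usage-5.8 iowait-0.0 steal-0.0"
)
print(cpu_label, metrics["usage"])  # cpu18 5.8
```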
Example J-1 Sample Node View
----------------------------------------
Node: rwsak10 Clock: '14-04-16 18.47.25 PST8PDT' SerialNo:155631
----------------------------------------
SYSTEM:
#pcpus: 2 #vcpus: 24 cpuht: Y chipname: Intel(R) cpu: 1.23 cpuq: 0
physmemfree: 8889492 physmemtotal: 74369536 mcache: 55081824 swapfree: 18480404
swaptotal: 18480408 hugepagetotal: 0 hugepagefree: 0 hugepagesize: 2048 ior: 132
iow: 236 ios: 23 swpin: 0 swpout: 0 pgin: 131 pgout: 235 netr: 72.404
netw: 97.511 procs: 969 procsoncpu: 6 rtprocs: 62 rtprocsoncpu N/A #fds: 32640
#sysfdlimit: 6815744 #disks: 9 #nics: 5 nicErrors: 0
TOP CONSUMERS:
topcpu: 'osysmond.bin(30981) 2.40' topprivmem: 'oraagent.bin(14599) 682496'
topshm: 'ora_dbw2_oss_3(7049) 2156136' topfd: 'ocssd.bin(29986) 274'
topthread: 'java(32255) 53'
CPUS:
cpu18: sys-2.93 user-2.15 nice-0.0 usage-5.8 iowait-0.0 steal-0.0
.
.
.
PROCESSES:
name: 'osysmond.bin' pid: 30891 #procfdlimit: 65536 cpuusage: 2.40 privmem: 35808
shm: 81964 #fd: 119 #threads: 13 priority: -100 nice: 0 state: S
.
.
.
DEVICES:
sdi ior: 0.000 iow: 0.000 ios: 0 qlen: 0 wait: 0 type: SYS
sda1 ior: 0.000 iow: 61.495 ios: 629 qlen: 0 wait: 0 type: SYS
.
.
.
NICS:
lo netrr: 39.935 netwr: 39.935 neteff: 79.869 nicerrors: 0 pktsin: 25
pktsout: 25 errsin: 0 errsout: 0 indiscarded: 0 outdiscarded: 0
inunicast: 25 innonunicast: 0 type: PUBLIC
eth0 netrr: 1.412 netwr: 0.527 neteff: 1.939 nicerrors: 0 pktsin: 15
pktsout: 4 errsin: 0 errsout: 0 indiscarded: 0 outdiscarded: 0
inunicast: 15 innonunicast: 0 type: PUBLIC latency: <1
FILESYSTEMS:
mount: / type: rootfs total: 563657948 used: 78592012 available: 455971824
used%: 14 ifree%: 99 GRID_HOME
.
.
.
PROTOCOL ERRORS:
IPHdrErr: 0 IPAddrErr: 0 IPUnkProto: 0 IPReasFail: 0 IPFragFail: 0
TCPFailedConn: 5197 TCPEstRst: 717163 TCPRetraSeg: 592 UDPUnkPort: 103306
UDPRcvErr: 70
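The node header and the SYSTEM view in the sample above appear as whitespace-separated `key: value` pairs, which makes them straightforward to load into a dictionary. The following is a minimal sketch based only on this one sample. The function names are hypothetical, the layout is inferred from Example J-1 rather than from a documented format, and a value that contains spaces (for example, a longer chipname) would be cut at the first space. A field printed without a colon, such as rtprocsoncpu in the sample, is skipped.

```python
import re

HEADER_RE = re.compile(r"Node:\s*(\S+)\s+Clock:\s*'([^']*)'\s+SerialNo:\s*(\d+)")
PAIR_RE = re.compile(r"(#?\w+):\s+(\S+)")


def convert(value):
    """Return an int or float when the text is numeric, else the text itself."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_header(line):
    """Return (node_name, clock_text, serial_number) from a node view header."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("not a node view header line")
    node, clock, serial = match.groups()
    return node, clock, int(serial)


def parse_system(text):
    """Return {metric: value} for every 'key: value' pair in a SYSTEM view."""
    return {key: convert(value) for key, value in PAIR_RE.findall(text)}


# Lines copied from Example J-1.
header = parse_header(
    "Node: rwsak10 Clock: '14-04-16 18.47.25 PST8PDT' SerialNo:155631"
)
system = parse_system(
    "#pcpus: 2 #vcpus: 24 cpuht: Y chipname: Intel(R) cpu: 1.23 cpuq: 0\n"
    "iow: 236 ios: 23 swpin: 0 swpout: 0 pgin: 131 pgout: 235 netr: 72.404\n"
    "netw: 97.511 procs: 969 procsoncpu: 6 rtprocs: 62 rtprocsoncpu N/A #fds: 32640"
)
print(header[0], system["#vcpus"], system["cpu"])  # rwsak10 24 1.23
```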