Oracle® Enterprise Manager Metric Reference Manual
10g Release 1 (10.1)
Part No. B12015-01

44 OMS and Repository

The OMS and Repository target exposes metrics that are useful for monitoring the Oracle Enterprise Manager Management Service (OMS) and Management Repository.

44.1 Notification Status

This is a Management Agent metric intended to send out of band notifications when the Notification system is determined to be in a critical state.

44.1.1 DBMS Job Bad Schedule

This metric flags a DBMS job whose schedule is invalid. A schedule is marked 'Invalid' if it is scheduled for more than one hour in the past, or more than one year in the future. An invalid schedule means that the job is in serious trouble.
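
The validity rule above can be sketched as a small check. This is only an illustration of the stated rule, not the code Enterprise Manager runs; the function name and the 365-day approximation of "one year" are assumptions.

```python
from datetime import datetime, timedelta

# Limits taken from the description above. Treating a year as 365 days
# is an assumption made for this sketch.
MAX_PAST = timedelta(hours=1)
MAX_FUTURE = timedelta(days=365)

def is_invalid_schedule(next_run: datetime, now: datetime) -> bool:
    """Return True if next_run is more than one hour before now
    or more than one year after now."""
    return next_run < now - MAX_PAST or next_run > now + MAX_FUTURE

now = datetime(2004, 1, 1, 12, 0, 0)
print(is_invalid_schedule(now - timedelta(minutes=30), now))  # False
print(is_invalid_schedule(now - timedelta(hours=2), now))     # True
print(is_invalid_schedule(now + timedelta(days=400), now))    # True
```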

44.1.1.1 Data Source

The next_date column of the user_jobs view in the Management Repository.

44.1.1.2 User Action

If the job schedule is invalid, the DBMS job should be restarted. To do this:

  1. Note the DBMS Job Name of the job with the invalid schedule from its row in the table. This DBMS Job Name is 'yourDBMSjobname' in the following example.

  2. Log onto the database as the repository owner.

  3. Issue the following SQL statement:

    select dbms_jobname from mgmt_performance_names where display_name='yourDBMSjobname';

  4. If the dbms_jobname is 'myjob', then issue the following SQL statement:

    select job from all_jobs where what='myjob';

  5. Copy down the jobid.

  6. Force the job into the broken state so that it can be restarted by specifying the following DBMS job command and parameters:

    execute dbms_job.broken(jobid, true);

  7. Verify that the job has been marked as broken by using this SQL statement:

    select what, broken from all_jobs where broken='Y';

    You should see the job in the results.

  8. Once you've verified that the DBMS job is marked broken, restart the job with the following DBMS job command and parameters:

    execute dbms_job.run(jobid);

44.1.2 DBMS Job Processing Time, % of Last Hour

The percentage of the past hour the job has been running.

44.1.2.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.2.2 User Action

If the value of this metric is greater than 50%, then there may be a problem with the job. Check the System Errors page for errors reported by the job. Check the Alerts log for any alerts related to the job.

44.1.3 DBMS Job UpDown

Shows whether the DBMS job is up or down. The down condition corresponds to the dbms_job "broken" state. The up arrow means the job is not broken.

44.1.3.1 Data Source

The broken column of the all_jobs view in the Management Repository.

44.1.3.2 User Action

Determine the reason for the DBMS job failure. Once the reason for the failure has been determined and corrected, the job can be restarted through the dbms_job.run command.

To determine the reason the DBMS job failed, take the following steps (replacing yourDBMSjobname with the displayed name of the down job):

  1. Copy down the DBMS Job Name that is down from the row in the table. This DBMS Job Name is 'yourDBMSjobname' in the following example.

  2. Log onto the database as the repository owner.

  3. Issue the following SQL statement:

    select dbms_jobname from mgmt_performance_names where display_name='yourDBMSjobname';

  4. If the dbms_jobname is 'myjob', then issue the following SQL statement:

    select job from all_jobs where what='myjob';

  5. Using the job ID returned, look for ORA-12012 messages for this job ID in the alert log and trace files, then determine and correct the problem.

The job can be manually restarted through the following database command:

execute dbms_job.run (jobid);

44.1.4 Files Pending Load

The number of files waiting for the loader to process, sampled every 10 minutes.

44.1.4.1 Data Source

This metric is obtained using the following query of the mgmt_oms_parameters table in the Management Repository:

SELECT value FROM mgmt_oms_parameters
  WHERE name='loaderFileCount';

44.1.4.2 User Action

If the Files Pending Load number is increasing steadily over a period of time, you may consider one of these options:

  • Increasing the number of background threads.

  • Adding another Management Service and pointing some of the Management Agents to the new Management Service.
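
One way to judge "increasing steadily" is to compare consecutive samples. The sketch below is only an illustration: the function name, the strict-increase definition, and the six-sample minimum (one hour of 10-minute samples) are assumptions, not something Enterprise Manager defines.

```python
def is_steadily_increasing(samples, min_samples=6):
    """Return True if there are at least min_samples values and each
    value is strictly greater than the one before it."""
    if len(samples) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))

# Hypothetical Files Pending Load readings, one per 10-minute sample.
print(is_steadily_increasing([4, 9, 15, 22, 31, 40]))  # True
print(is_steadily_increasing([4, 9, 3, 22, 5, 40]))    # False
```

A stricter or looser definition (for example, a positive slope over several hours) may suit a given site better.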

44.1.5 Job Dispatcher Job Step Average Backlog

The number of job steps that were ready to be scheduled but could not be because all the dispatchers were busy.

When this number grows steadily, it means the job scheduler is not able to keep up with the workload.

44.1.5.1 Data Source

The value column of the mgmt_oms_parameters table in the Management Repository, for the row where host_url is the host_url of the Management Service and the parameter name is jobStepCount.

44.1.5.2 User Action

This value is updated by the job dispatcher before its periodic wait. If the graph of this number increases steadily over time, the user should take one of the following actions:

  • Increase the em.jobs.shortPoolSize, em.jobs.longPoolSize, and em.jobs.systemPoolSize properties in the web.xml file. The web.xml file specifies the number of threads allocated to process different types of job steps. The short pool size should be larger than the long pool size.

    Property                Default value  Recommended value  Description
    em.jobs.shortPoolSize   10             10 - 50            Steps taking less than 15 minutes
    em.jobs.longPoolSize    8              8 - 30             Steps taking more than 15 minutes
    em.jobs.systemPoolSize  8              8 - 20             Internal jobs (e.g. agent ping)

  • Add another Management Service on a different host.

  • Check the job step contents to see if they can be made more efficient.
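
The recommended ranges for the pool size properties can be checked mechanically before a change is applied. The following is a minimal sketch: the property names and ranges come from the table in this section, while the function name and the dictionary input format are assumptions made for illustration. It does not read or parse web.xml.

```python
# Recommended ranges from the pool size table in this section.
RECOMMENDED = {
    "em.jobs.shortPoolSize": (10, 50),
    "em.jobs.longPoolSize": (8, 30),
    "em.jobs.systemPoolSize": (8, 20),
}

def check_pool_sizes(props):
    """Return a list of problems found in a dict of property name -> int."""
    problems = []
    for name, (low, high) in RECOMMENDED.items():
        value = props.get(name)
        if value is None:
            problems.append(f"{name} is not set")
        elif not low <= value <= high:
            problems.append(f"{name}={value} is outside the recommended range {low}-{high}")
    short = props.get("em.jobs.shortPoolSize")
    long_ = props.get("em.jobs.longPoolSize")
    # The short pool size should be larger than the long pool size.
    if short is not None and long_ is not None and short <= long_:
        problems.append("em.jobs.shortPoolSize should be larger than em.jobs.longPoolSize")
    return problems

defaults = {
    "em.jobs.shortPoolSize": 10,
    "em.jobs.longPoolSize": 8,
    "em.jobs.systemPoolSize": 8,
}
print(check_pool_sizes(defaults))  # []
```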

44.1.6 Job Dispatcher Processing Time, % of Last Hour

The job dispatcher is responsible for scheduling jobs as required. It starts up periodically and checks whether any jobs need to be run. If the job dispatcher is running for a larger share of the hour than the threshold levels allow, it is having problems handling the job load.

44.1.6.1 Data Source

This is the sum of the amount of time the job has run over the last hour from the mgmt_system_performance_log table in the Management Repository divided by one hour, multiplied by 100 to arrive at the percent.
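
As a worked illustration of this calculation, the sketch below sums run durations and converts the total to a percentage of one hour. The function name and the use of milliseconds as the input unit are assumptions. The unit actually stored in mgmt_system_performance_log is not stated in this section.

```python
MS_PER_HOUR = 60 * 60 * 1000

def percent_of_last_hour(run_durations_ms):
    """Sum the run durations (assumed to be in milliseconds) recorded over
    the last hour, divide by one hour, and multiply by 100."""
    return sum(run_durations_ms) / MS_PER_HOUR * 100

# Two runs of 10 and 5 minutes give 15 minutes out of 60.
print(percent_of_last_hour([600_000, 300_000]))  # 25.0
```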

44.1.6.2 User Action

Specific to your site.

44.1.7 Last Error

Timestamp of the latest error for the job.

44.1.7.1 Data Source

The mgmt_system_error_log table in the Management Repository.

44.1.7.2 User Action

Specific to your site.

44.1.8 Last Load Error

Timestamp of the last loader error. If the loader is not reading files correctly, the system may not be receiving data correctly from the Management Agent.

44.1.8.1 Data Source

The mgmt_system_error_log table in the Management Repository.

44.1.8.2 User Action

Specific to your site.

44.1.9 Loader Directory

The directory from which the loader is getting files.

44.1.9.1 Data Source

This metric is obtained using the following query of the mgmt_oms_parameters table in the Management Repository:

SELECT value FROM mgmt_oms_parameters
  WHERE name='loaderDirectory';

44.1.9.2 User Action

If the loader directory is out of space, you may want to look for the error files to investigate the problem.

44.1.10 Loader Name

The unique name of the loader, consisting of the Management Service name separated by a comma from the loader name on that Management Service.

44.1.10.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.10.2 User Action

None. This is the key field for the metric.

44.1.11 Loader Throughput (rows per hour)

This is the number of lines of XML text processed by the loader thread over the past hour.

44.1.11.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.11.2 User Action

If this number continues to rise over time, then the user may want to consider adding another Management Service or increasing the number of loader threads for this Management Service. To increase the number of loader threads, add or change the em.loader.threadPoolSize entry in the emoms.properties file. The default number of threads is 2. Values between 2 and 10 are common.

44.1.12 Loader Throughput (rows per second)

This is the number of lines of XML text processed by the loader thread per second averaged over the past hour.

44.1.12.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.12.2 User Action

This metric is informational only.

44.1.13 Management Service Status

Shows whether the Management Service is up or down.

44.1.13.1 Data Source

The mgmt_oms_parameters and mgmt_failover_table tables in the Management Repository.

44.1.13.2 User Action

If the Management Service is down, start it.

44.1.14 Message

This metric lists targets for which the Management Agent has not uploaded data in the past two hours (excluding Management Agent, Beacon and Repository targets).

The alert is generated each time the Message content changes. The Message content changes each time the list of targets not uploading data changes.
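
The behavior described above can be sketched as follows. This is an illustration only: the tuple layout, the target type labels in EXCLUDED_TYPES, and both function names are hypothetical and do not reflect the actual repository schema.

```python
from datetime import datetime, timedelta

# Hypothetical type labels for the excluded target kinds.
EXCLUDED_TYPES = {"agent", "beacon", "repository"}
STALE_AFTER = timedelta(hours=2)

def stale_targets(last_uploads, now):
    """last_uploads: iterable of (target_name, target_type, last_upload_time).
    Return the sorted names of non-excluded targets whose last upload
    is more than two hours before now."""
    return sorted(
        name
        for name, target_type, last_upload in last_uploads
        if target_type not in EXCLUDED_TYPES and now - last_upload > STALE_AFTER
    )

def message_changed(previous_names, current_names):
    """A new alert is warranted only when the set of listed targets changes."""
    return set(previous_names) != set(current_names)

now = datetime(2004, 1, 1, 12, 0)
rows = [
    ("db1", "database", now - timedelta(hours=3)),
    ("host1", "host", now - timedelta(minutes=10)),
    ("agent1", "agent", now - timedelta(hours=5)),
]
print(stale_targets(rows, now))  # ['db1']
```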

44.1.14.1 Data Source

The mgmt_current_availability table in the Management Repository.

44.1.14.2 User Action

Perform the following steps:

  1. Determine the Management Agent for the target having problems.

  2. Verify that the target's collection schedule has an interval of less than two hours.

  3. Check the agent logs for errors uploading data.

  4. Check the Management System Errors page for Loader errors processing information from the Management Agent concerned.

44.1.15 Next Scheduled Runtime

The next scheduled runtime for the job.

44.1.15.1 Data Source

The next_date column of the user_jobs view in the Management Repository.

44.1.15.2 User Action

Specific to your site.

44.1.16 Notification Delivery Time

The time it took to deliver a notification, averaged over the past hour.

44.1.16.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.16.2 User Action

If the average delivery time is steadily increasing, verify that the notification methods specified are valid. Remove any unnecessary or out of date notification rules and schedules.

44.1.17 Notification Processing Time, % of Last Hour

The percentage of the past hour that Notification delivery has been running.

44.1.17.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.17.2 User Action

If the average delivery time is steadily increasing, verify that the notification methods specified are valid. Remove any unnecessary or out of date notification rules and schedules.

44.1.18 Notification UpDown

Displays whether the notification DBMS job (which processes severities to determine if notifications are required) is up or down.

44.1.18.1 Data Source

The user_jobs table in the Management Repository.

44.1.18.2 User Action

Determine the reason for the DBMS job failure. Once the reason for the failure has been determined and corrected, the job can be restarted through the dbms_job.run command.

To determine why the DBMS job failed, take the following steps:

  1. Log onto the database as the Management Repository owner.

  2. Issue the following SQL statement:

    select job from all_jobs where what like '%CHECK_FOR_SEVERITIES%';

  3. Using the job ID returned, look for ORA-12012 messages for this job ID in the alert log and trace files, then determine and correct the problem.

  4. Issue the following DBMS job command and parameters:

    execute dbms_job.run (jobid);

44.1.19 Notifications Processed

The total number of notifications delivered by the Management Service over the previous 10 minutes.

44.1.19.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.19.2 User Action

If the number of notifications processed is continually increasing over several days, then you may want to consider adding another Management Service.

44.1.20 Notifications Waiting

When this metric becomes critical, an out of band notification will be sent to the address specified during the installation.

44.1.20.1 Data Source

This is the sum of the amount of time the job has run over the last hour from the mgmt_system_performance_log table in the Management Repository divided by one hour, multiplied by 100 to arrive at the percent.

44.1.20.2 User Action

Perform the following user actions:

  1. Check the Errors page for errors logged by the Notification Delivery DBMS job.

  2. Check the number of notification rules defined and verify that they are all necessary, removing those that are not.

  3. Verify that the addresses being used for the notifications are correct.

44.1.21 Number of Duplicate Targets

The count of duplicate targets in the Management Repository.

44.1.21.1 Data Source

The mgmt_duplicate_targets table in the Management Repository.

44.1.21.2 User Action

Go to the Duplicate Targets page by clicking the Duplicate targets link on the Management System Overview page. The Duplicate targets link only appears on the Management System Overview page if there are problems involving duplicate targets.

Resolve the conflict by removing the duplicate target from the conflicting Management Agent.

44.1.22 Number of Groups

The number of groups defined for Enterprise Manager.

44.1.22.1 Data Source

The mgmt_targets table in the Management Repository.

44.1.22.2 User Action

If you have a problem viewing the All Targets page, you may want to check the number of roles and groups.

44.1.23 Number of Roles

The number of roles defined for Enterprise Manager.

44.1.23.1 Data Source

The mgmt_roles table in the Management Repository.

44.1.23.2 User Action

If you have a problem viewing the All Targets page, you may want to check the number of roles and groups.

44.1.24 Number of Targets

The number of targets defined for Enterprise Manager.

44.1.24.1 Data Source

The mgmt_targets table in the Management Repository.

44.1.24.2 User Action

Specific to your site.

44.1.25 Number of Users

The number of users defined for Enterprise Manager.

44.1.25.1 Data Source

The sys.dba_role_privs table in the Management Repository.

44.1.25.2 User Action

Specific to your site.

44.1.26 Oldest Loader File

This metric shows how long the loader file has been waiting to be processed by the loader. This is an indicator of the delay from when the Management Agent sends out information to when the user receives the information.

44.1.26.1 Data Source

This metric is obtained using the following query of the mgmt_oms_parameters table in the Management Repository:

SELECT value FROM mgmt_oms_parameters
  WHERE name='loaderOldestFile';

44.1.26.2 User Action

If the oldest loader file is extremely old, you have a loader problem. You may want to add another Management Service and point some of the Management Agents to the new Management Service.

44.1.27 Repository Tablespace Used

This is the total number of MB that the Management Repository tablespaces are currently using.

44.1.27.1 Data Source

The dba_data_files table in the Management Repository.

44.1.27.2 User Action

This metric is informational only.

44.1.28 Session Count

A count of the number of sessions between the Management Service and Management Repository database.

44.1.28.1 Data Source

The v$session system view.

44.1.28.2 User Action

This metric is informational only.

44.1.29 Status since

Timestamp of when the Management Service was marked up or down.

44.1.29.1 Data Source

The mgmt_oms_parameters table in the Management Repository.

44.1.29.2 User Action

Specific to your site.

44.1.30 Steps Per Second

The number of job steps processed per second by the job dispatcher, averaged over the past hour and sampled every 10 minutes.

44.1.30.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.30.2 User Action

Specific to your site.

44.1.31 Targets not providing data

This metric provides a count of the targets that are not uploading data.

44.1.31.1 Data Source

The mgmt_targets and mgmt_current_availability tables in the Management Repository.

44.1.31.2 User Action

This metric is informational only.

44.1.32 Throughput Per Second

The number of notifications delivered per second, averaged over the past hour.

44.1.32.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.32.2 User Action

This metric is informational only.

44.1.33 Total Loader Runtime in the Last Hour

This is the amount of time in milliseconds that the loader thread has been running in the past hour.

44.1.33.1 Data Source

The mgmt_system_performance_log table in the Management Repository.

44.1.33.2 User Action

If this number is steadily increasing along with the Loader Throughput (rows per hour) metric, then perform the actions described in the User Action section of the help topic for the Loader Throughput (rows per hour) metric. If this number increases but the loader throughput does not, check for resource constraints, such as high CPU utilization by some process, deadlocks in the Management Repository database, or processor memory problems.
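
The two cases above amount to a small decision rule, sketched below. The function name and the returned strings are illustrative assumptions. How a "rising" trend is determined is left to the caller.

```python
def loader_diagnosis(runtime_rising: bool, throughput_rising: bool) -> str:
    """Map the trends of Total Loader Runtime and Loader Throughput
    (rows per hour) to the action suggested in this section."""
    if runtime_rising and throughput_rising:
        # More work is arriving: follow the Loader Throughput user action.
        return "add a Management Service or increase the number of loader threads"
    if runtime_rising:
        # Same work is taking longer: look for a bottleneck.
        return "check for resource constraints"
    return "no action indicated"

print(loader_diagnosis(True, False))  # check for resource constraints
```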

44.1.34 Total Repository Tablespace

The total MB allocated to the Management Repository tablespaces. This will always be greater than or equal to the space used.

44.1.34.1 Data Source

The dba_free_space table in the Management Repository.

44.1.34.2 User Action

This metric is informational only.

44.2 Response

This page indicates whether Enterprise Manager is up or down. It contains historical information for periods in which it was down.

44.2.1 Status

This metric indicates whether the Management Service is up or down. If you cannot access the Management Repository, you will get an out of band error.

44.2.1.1 Metric Summary

The following table shows how often the metric's value is collected and compared against the default thresholds. The 'Consecutive Number of Occurrences Preceding Notification' column indicates the consecutive number of times the comparison against thresholds should hold TRUE before an alert is generated.

Table 44-1 Metric Summary Table

Target Version | Evaluation and Collection Frequency | Upload Frequency | Operator | Default Warning Threshold | Default Critical Threshold | Consecutive Number of Occurrences Preceding Notification | Alert Text
All Versions | Every 5 Minutes | Not Uploaded | = | Not Defined | 0 | 1 | %Message%
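
To illustrate how the operator, the critical threshold, and the consecutive-occurrences count interact, the sketch below scans a series of collected values and reports when an alert would first be generated. It is a simplified model, not Enterprise Manager's evaluation code. The function name and the encoding of the status value (0 for down, 1 for up) are assumptions.

```python
def first_alert_index(values, critical=0, occurrences=1):
    """Return the index of the first sample at which `values` has equalled
    `critical` for `occurrences` consecutive samples, or None if never."""
    streak = 0
    for index, value in enumerate(values):
        # Operator "=": the comparison holds when the value equals the threshold.
        streak = streak + 1 if value == critical else 0
        if streak >= occurrences:
            return index
    return None

# With the defaults from Table 44-1, the first down sample raises the alert.
print(first_alert_index([1, 1, 0, 1]))                    # 2
# Requiring two consecutive occurrences delays the alert.
print(first_alert_index([1, 0, 1, 0, 0], occurrences=2))  # 4
```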

44.2.1.2 Data Source

sysman/admin/scripts/emrepresp.pl

44.2.1.3 User Action

This metric checks for the following:

  • Is the Management Repository database up and accessible?

    If the Management Repository database is down, start it.

  • Is at least one Management Service running?

    If a Management Service is not running, start one.