Visualizing Statspack “Average Active Sessions” in SQL Developer

This is the second post in my mini-series on leveraging SQL Developer Reports for DBA tasks, today with visualizing Average Active Sessions (AAS). In this article I’ll cover

  • What AAS is and how to interpret it
  • How to build a basic line graph in SQL Developer
  • How to extend the graph with detailed child reports (Time Model Statistics and Top Events)

What’s that?

As I mentioned in an earlier post, AAS may well be the single most important DB load metric and a good starting point for a drill-down. AAS can be seen as an equivalent to the CPU Load metric on Unices/Linuxes. It basically tells you

How many database sessions are actively waiting for work being processed by the database?

Using this metric, you can tell

  • whether your DB is under light or heavy load;
  • whether there are patterns in your DB load;
  • whether there are peaks or trends requiring further attention (i.e. drill-down).

AAS was introduced as an ASH metric in Oracle 11. Before that it could be simply derived as

AAS = Database Time / Wall Clock Time

With “Wall Clock Time” meaning the time that passed inbetween two snapshots. There’s no AAS metric in Statspack, therefore we’ll have to use the above formula.

 Defining The Report

From the first article in this mini-series you already know how to build a report. I don’t want to bore you, so this time we’ll add some detail to it. The final report should display:

  • AAS
  • The CPU Load of the DB host, if available
  • The number of logical CPUs as a threshold
  • Time Model Statistics for a selected snapshot
  • Top 5 Events for a selected snapshot (similar to a Statspack report)

<<Wait a minute – “Logical” CPUs? What does that mean?>>

Oracle has several metrics to choose from when it comes to finding out how many CPUs there are on the DB host. Depending on CPU architecture, you might have a “monolithic” CPU, a multi-core CPU or a CPU running several Hyperthreads per core (e.g. Intel Xeon). As one database process can run on one CPU/core/thread at a time, Oracle abstracts the finest granule into a “logical CPU”, its number can be queried from the data dictionary.

‘Nuff said, let’s fire up sqldev, go to the “Reports” tab an create a new user report: 10_new_report

and set the initial properties for the report – choose “Graph” as report type.

In the “SQL” textarea enter the following:

-- Get Logical CPUs and OS CPU Load
SELECT TO_CHAR( sn.snap_time, 'dd.mm. hh24:mi' ) time
     , CASE nm.stat_name
       WHEN 'NUM_CPUS' THEN 'Log. CPUs'
       WHEN 'LOAD' THEN 'OS Load'
       ELSE nm.stat_name
       END AS stat_name
     , os.value
     , sn.snap_time
  FROM stats$osstat os
     , stats$osstatname nm
     , stats$snapshot sn
 WHERE os.osstat_id = nm.osstat_id
   AND nm.stat_name IN( 'NUM_CPUS', 'LOAD' )
   AND sn.snap_id = os.snap_id
   AND sn.instance_number = ( SELECT instance_number FROM v$instance )
   AND os.instance_number = sn.instance_number
UNION ALL
-- Get AAS (= DB Time / Wall Clock Time)
-- Wait times are measured in microseconds
SELECT time
     , 'AAS' AS stat_name
     , value_diff / DECODE( time_diff, 0, NULL, time_diff ) / 24 / 3600 / 1000000 AS aas
     , snap_time
FROM
(
  SELECT TO_CHAR( sn.snap_time, 'dd.mm. hh24:mi' ) time
       , ( LEAD( tm.value, 1 ) OVER( ORDER BY tm.snap_id ) - tm.value ) AS value_diff
       , ( LEAD( sn.snap_time, 1 ) OVER( ORDER BY sn.snap_id ) - sn.snap_time ) AS time_diff
       , sn.snap_time
    FROM stats$sys_time_model tm
       , stats$time_model_statname nm
       , stats$snapshot sn
   WHERE tm.stat_id = nm.stat_id
     AND nm.stat_name = 'DB time'
     AND sn.snap_id = tm.snap_id
     AND sn.instance_number = ( SELECT instance_number FROM v$instance )
     AND tm.instance_number = sn.instance_number
)
 ORDER BY snap_time, stat_name

Now the SQL is entered, choose an appropriate graph type. In our case, all metrics can be displayed as separate lines. Choose “Line” as the graph type.

Let’s give our report a self-explanatory title and check the results:

AAS_final

This doesn’t seem to be a particularly busy database, as the number of average active sessions is far below the number of available logical CPUs. If it were high above the log. CPU mark, then this would indicate an overloaded system with the potential of quickly rising response times.

On the other hand, if there’s only one AAS and this session runs for hours instead of seconds, there’s probably still a performance problem lurking around.

So – can we have some more detail, please? Yes, by using

Child Reports

Right-click to edit the report and highlight “Child Reports” from the tree. Press “Add” and name that child “Time Model”:

AAS_10_add_child

Then, choose “Table” as the style:

AAS_12_child_report_table

And enter this SQL to query the Time Model Stats for the selected snapshot time:

SELECT snap_time ||' - '|| SUBSTR( nextt, -5 ) AS interval
     , stat_name
     , ROUND( dv / DECODE( dt, 0, NULL, dt ) / 24 / 3600 / 1000000, 2 ) aas
FROM (
SELECT tm.snap_id, sn.snap_time, nm.stat_name
     , tm.VALUE
     , LEAD( tm.value, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id ) nextv
     , LEAD( tm.value, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id )- tm.value dv
     , LEAD( sn.snap_time, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id ) nextt
     , LEAD( sn.snap_time, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id ) - sn.snap_time AS dt
  FROM STATS$TIME_MODEL_STATNAME nm
     , STATS$SYS_TIME_MODEL tm
     , STATS$SNAPSHOT sn
 WHERE tm.STAT_ID = nm.STAT_ID
   AND tm.SNAP_ID = sn.SNAP_ID
   AND sn.instance_number = ( SELECT instance_number FROM v$instance )
   AND TO_CHAR( sn.snap_time, 'dd.mm. hh24:mi' ) >= :TIME
)
 WHERE TO_CHAR( snap_time, 'dd.mm. hh24:mi' ) = :TIME
   AND ROUND( dv / DECODE( dt, 0, NULL, dt ) / 24 / 3600 / 1000000, 2 ) > 0
 ORDER BY aas DESC, stat_name

Notice the bind variable “:TIME”: This variable references the column alias “time” from our main report’s SQL. “snap_id” would be a much better key to join to, but we haven’t selected it in our main report (if we had, it would have been displayed somewhere which is awkward). It’s a peculiarity of SQL Developer that those reference variables have to be written in UPPERCASE. Keep that in mind when writing your own report.

And here’s our Time Model child report for that little peak in the AAS graph:

AAS_detail01_time_model

Of much higher interest (in my humble opinion) is the resource consumption. Statspack saves the counters for all events, thus it’s a snap to find the top 5 events for a snapshot interval.

Create another child report “Top Events” with the following SQL:

SELECT * FROM(
  SELECT TO_CHAR(snap_time, 'dd.mm. hh24:mi')
         ||' - '|| TO_CHAR( LEAD( snap_time, 1 ) OVER( PARTITION BY event ORDER BY snap_id ), 'hh24:mi' ) AS interval
       , event
       , ROUND(( LEAD( time_spent, 1 ) OVER( PARTITION BY event ORDER BY snap_id ) - time_spent ) /
         ( LEAD( snap_time, 1 ) OVER( PARTITION BY event ORDER BY snap_id ) - snap_time ) / 24 ) seconds_waited
    FROM(
    SELECT    snap_id,
              snap_time,
              event,
              time_waited_micro / 1000000 AS time_spent
       FROM   stats$snapshot NATURAL JOIN stats$system_event NATURAL JOIN v$event_name
      WHERE   TRUNC( snap_time, 'mi' ) BETWEEN TO_DATE( :TIME, 'dd.mm. hh24:mi' )
              AND ( SELECT MIN( snap_time ) FROM stats$snapshot WHERE TRUNC( snap_time, 'mi' ) > TO_DATE( :TIME, 'dd.mm. hh24:mi' ) )
        AND   wait_class# != 6
        AND   instance_number = ( SELECT instance_number FROM v$instance )
   
              UNION ALL
   
     SELECT   snap_id,
              snap_time,
              'CPU used by this session' AS event,
              value / 100 as time_spent
       FROM   stats$sysstat NATURAL JOIN stats$snapshot
      WHERE   TRUNC( snap_time, 'mi' ) BETWEEN TO_DATE( :TIME, 'dd.mm. hh24:mi' )
              AND ( SELECT MIN( snap_time ) FROM stats$snapshot WHERE TRUNC( snap_time, 'mi' ) > TO_DATE( :TIME, 'dd.mm. hh24:mi' ) )
        AND   name = 'CPU used by this session'
        AND   instance_number = ( SELECT instance_number FROM v$instance )
  )
  ORDER BY seconds_waited DESC NULLS LAST
)
WHERE ROWNUM < 6

Which gives us this output for the selected peak:

AAS_detail02_top_events

At this point you might consider saving this report. The saved XML file can then be transferred to another computer / SQL Developer installation or shared publicly as I do in my SQL Developer Repository on GitHub. So, in case you don’t want to walk through this article, feel free to download this report from there. And if you’d like to add some enhancements to this report (like Top SQL), feel free to contribute!

Do you have questions or suggestions? Then let me know in the comments section below!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s