This is the second post in my mini-series on leveraging SQL Developer Reports for DBA tasks, today with visualizing Average Active Sessions (AAS). In this article I’ll cover
- What AAS is and how to interpret it
- How to build a basic line graph in SQL Developer
- How to extend the graph with detailed child reports (Time Model Statistics and Top Events)
As I mentioned in an earlier post, AAS may well be the single most important DB load metric and a good starting point for a drill-down. AAS can be seen as an equivalent to the CPU Load metric on Unices/Linuxes. It basically tells you
How many database sessions are actively waiting for work being processed by the database?
Using this metric, you can tell
- whether your DB is under light or heavy load;
- whether there are patterns in your DB load;
- whether there are peaks or trends requiring further attention (i.e. drill-down).
AAS was introduced as an ASH metric in Oracle 11. Before that it could be simply derived as
AAS = Database Time / Wall Clock Time
With “Wall Clock Time” meaning the time that passed inbetween two snapshots. There’s no AAS metric in Statspack, therefore we’ll have to use the above formula.
Defining The Report
From the first article in this mini-series you already know how to build a report. I don’t want to bore you, so this time we’ll add some detail to it. The final report should display:
- The CPU Load of the DB host, if available
- The number of logical CPUs as a threshold
- Time Model Statistics for a selected snapshot
- Top 5 Events for a selected snapshot (similar to a Statspack report)
<<Wait a minute – “Logical” CPUs? What does that mean?>>
Oracle has several metrics to choose from when it comes to finding out how many CPUs there are on the DB host. Depending on CPU architecture, you might have a “monolithic” CPU, a multi-core CPU or a CPU running several Hyperthreads per core (e.g. Intel Xeon). As one database process can run on one CPU/core/thread at a time, Oracle abstracts the finest granule into a “logical CPU”, its number can be queried from the data dictionary.
and set the initial properties for the report – choose “Graph” as report type.
In the “SQL” textarea enter the following:
-- Get Logical CPUs and OS CPU Load SELECT TO_CHAR( sn.snap_time, 'dd.mm. hh24:mi' ) time , CASE nm.stat_name WHEN 'NUM_CPUS' THEN 'Log. CPUs' WHEN 'LOAD' THEN 'OS Load' ELSE nm.stat_name END AS stat_name , os.value , sn.snap_time FROM stats$osstat os , stats$osstatname nm , stats$snapshot sn WHERE os.osstat_id = nm.osstat_id AND nm.stat_name IN( 'NUM_CPUS', 'LOAD' ) AND sn.snap_id = os.snap_id AND sn.instance_number = ( SELECT instance_number FROM v$instance ) AND os.instance_number = sn.instance_number UNION ALL -- Get AAS (= DB Time / Wall Clock Time) -- Wait times are measured in microseconds SELECT time , 'AAS' AS stat_name , value_diff / DECODE( time_diff, 0, NULL, time_diff ) / 24 / 3600 / 1000000 AS aas , snap_time FROM ( SELECT TO_CHAR( sn.snap_time, 'dd.mm. hh24:mi' ) time , ( LEAD( tm.value, 1 ) OVER( ORDER BY tm.snap_id ) - tm.value ) AS value_diff , ( LEAD( sn.snap_time, 1 ) OVER( ORDER BY sn.snap_id ) - sn.snap_time ) AS time_diff , sn.snap_time FROM stats$sys_time_model tm , stats$time_model_statname nm , stats$snapshot sn WHERE tm.stat_id = nm.stat_id AND nm.stat_name = 'DB time' AND sn.snap_id = tm.snap_id AND sn.instance_number = ( SELECT instance_number FROM v$instance ) AND tm.instance_number = sn.instance_number ) ORDER BY snap_time, stat_name
Now the SQL is entered, choose an appropriate graph type. In our case, all metrics can be displayed as separate lines. Choose “Line” as the graph type.
Let’s give our report a self-explanatory title and check the results:
This doesn’t seem to be a particularly busy database, as the number of average active sessions is far below the number of available logical CPUs. If it were high above the log. CPU mark, then this would indicate an overloaded system with the potential of quickly rising response times.
On the other hand, if there’s only one AAS and this session runs for hours instead of seconds, there’s probably still a performance problem lurking around.
So – can we have some more detail, please? Yes, by using
Right-click to edit the report and highlight “Child Reports” from the tree. Press “Add” and name that child “Time Model”:
Then, choose “Table” as the style:
And enter this SQL to query the Time Model Stats for the selected snapshot time:
SELECT snap_time ||' - '|| SUBSTR( nextt, -5 ) AS interval , stat_name , ROUND( dv / DECODE( dt, 0, NULL, dt ) / 24 / 3600 / 1000000, 2 ) aas FROM ( SELECT tm.snap_id, sn.snap_time, nm.stat_name , tm.VALUE , LEAD( tm.value, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id ) nextv , LEAD( tm.value, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id )- tm.value dv , LEAD( sn.snap_time, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id ) nextt , LEAD( sn.snap_time, 1 ) OVER( PARTITION BY tm.stat_id ORDER BY tm.snap_id ) - sn.snap_time AS dt FROM STATS$TIME_MODEL_STATNAME nm , STATS$SYS_TIME_MODEL tm , STATS$SNAPSHOT sn WHERE tm.STAT_ID = nm.STAT_ID AND tm.SNAP_ID = sn.SNAP_ID AND sn.instance_number = ( SELECT instance_number FROM v$instance ) AND TO_CHAR( sn.snap_time, 'dd.mm. hh24:mi' ) >= :TIME ) WHERE TO_CHAR( snap_time, 'dd.mm. hh24:mi' ) = :TIME AND ROUND( dv / DECODE( dt, 0, NULL, dt ) / 24 / 3600 / 1000000, 2 ) > 0 ORDER BY aas DESC, stat_name
Notice the bind variable “:TIME”: This variable references the column alias “time” from our main report’s SQL. “snap_id” would be a much better key to join to, but we haven’t selected it in our main report (if we had, it would have been displayed somewhere which is awkward). It’s a peculiarity of SQL Developer that those reference variables have to be written in UPPERCASE. Keep that in mind when writing your own report.
And here’s our Time Model child report for that little peak in the AAS graph:
Of much higher interest (in my humble opinion) is the resource consumption. Statspack saves the counters for all events, thus it’s a snap to find the top 5 events for a snapshot interval.
Create another child report “Top Events” with the following SQL:
SELECT * FROM( SELECT TO_CHAR(snap_time, 'dd.mm. hh24:mi') ||' - '|| TO_CHAR( LEAD( snap_time, 1 ) OVER( PARTITION BY event ORDER BY snap_id ), 'hh24:mi' ) AS interval , event , ROUND(( LEAD( time_spent, 1 ) OVER( PARTITION BY event ORDER BY snap_id ) - time_spent ) / ( LEAD( snap_time, 1 ) OVER( PARTITION BY event ORDER BY snap_id ) - snap_time ) / 24 ) seconds_waited FROM( SELECT snap_id, snap_time, event, time_waited_micro / 1000000 AS time_spent FROM stats$snapshot NATURAL JOIN stats$system_event NATURAL JOIN v$event_name WHERE TRUNC( snap_time, 'mi' ) BETWEEN TO_DATE( :TIME, 'dd.mm. hh24:mi' ) AND ( SELECT MIN( snap_time ) FROM stats$snapshot WHERE TRUNC( snap_time, 'mi' ) > TO_DATE( :TIME, 'dd.mm. hh24:mi' ) ) AND wait_class# != 6 AND instance_number = ( SELECT instance_number FROM v$instance ) UNION ALL SELECT snap_id, snap_time, 'CPU used by this session' AS event, value / 100 as time_spent FROM stats$sysstat NATURAL JOIN stats$snapshot WHERE TRUNC( snap_time, 'mi' ) BETWEEN TO_DATE( :TIME, 'dd.mm. hh24:mi' ) AND ( SELECT MIN( snap_time ) FROM stats$snapshot WHERE TRUNC( snap_time, 'mi' ) > TO_DATE( :TIME, 'dd.mm. hh24:mi' ) ) AND name = 'CPU used by this session' AND instance_number = ( SELECT instance_number FROM v$instance ) ) ORDER BY seconds_waited DESC NULLS LAST ) WHERE ROWNUM < 6
Which gives us this output for the selected peak:
At this point you might consider saving this report. The saved XML file can then be transferred to another computer / SQL Developer installation or shared publicly as I do in my SQL Developer Repository on GitHub. So, in case you don’t want to walk through this article, feel free to download this report from there. And if you’d like to add some enhancements to this report (like Top SQL), feel free to contribute!
Do you have questions or suggestions? Then let me know in the comments section below!