It's amazing how you end up spending hours on a secondary problem when you start investigating a completely different one.
I set out in pursue of an allegedly trivial problem with an apex page this morning and I found myself sucked into a vortex of high CPU usage caused by one of the oldest built-in apex reports.
I wanted to inspect the session state values of the failing page and this has always been simply a matter of opening the Active sessions apex report, look up the right session id and drill down to item values, however when I opened page 7 of application 4350 (Active Sessions report) the CPU of my virtual server was skyrocketing. This phenomenon was duly recorded by Amazon EC2's instance monitoring chart as shown below.
a CPU spike occurs when opening the Active Sessions report |
However it turned out this time the problem was not one of that kind.
So, after confirming that the problem was with this specific page while the rest of my applications were running normally, the next task was to identify which SQL statement was consuming all this CPU.
I ran a SQLDeveloper built-in report showing the top SQL by CPU and the following query turned out to be one of the most time consuming:
select "ID", "SESSION_OWNER", "CREATED_ON", "ITEM_CACHE_COUNT", "IP_ADDRESS", "MOST_RECENT_VIEW", "DISTINCT_APPLICATIONS", count(*) over () as apxws_row_cnt from ( select * from ( select /* APEX4350P7a */ /*+ FIRST_ROWS(21) */ id, session_owner, created_on, item_cache_count, ip_address, (select max(time_stamp) mt from WWV_FLOW_ACTIVITY_LOG l where l.SESSION_ID = s.id) most_recent_view, distinct_applications from ( select "WWV_FLOW_SESSIONS$".ID as "ID", cookie session_owner, created_on, count(*) item_cache_count, count(distinct(flow_id)) distinct_applications, "WWV_FLOW_SESSIONS$".remote_addr ip_address from "WWV_FLOW_DATA" "WWV_FLOW_DATA", "WWV_FLOW_SESSIONS$" "WWV_FLOW_SESSIONS$" where WWV_FLOW_DATA.FLOW_INSTANCE(+)=WWV_FLOW_SESSIONS$.ID and (:P7_OWNER is null or instr(upper(cookie),upper(:P7_OWNER))>0) and security_group_id = :flow_security_group_id group by "WWV_FLOW_SESSIONS$".ID, cookie, created_on, remote_addr ) s ) r ) r where rownum <= to_number(:APXWS_MAX_ROW_CNT) order by "MOST_RECENT_VIEW" DESC,"CREATED_ON" DESC
By examining the explain plan it was clear that for some reason Oracle was executing two FULL TABLE SCANs on WWV_FLOW_ACTIVITY_LOG1$ and WWV_FLOW_ACTIVITY_LOG2$, two tables merged inside the view WWV_FLOW_ACTIVITY_LOG and on these tables there were no indexes on column SESSION_ID.
The result was that the query took nearly 300 seconds to scan some 250,000 rows.
Now, what if I create a couple of indexes on those two tables?
CREATE INDEX "APEX_040000"."WWV_FLOW_ACTIVITY_LOG1$_IDX4" ON "APEX_040000"."WWV_FLOW_ACTIVITY_LOG1$" ( "SESSION_ID" );
CREATE INDEX "APEX_040000"."WWV_FLOW_ACTIVITY_LOG2$_IDX4" ON "APEX_040000"."WWV_FLOW_ACTIVITY_LOG2$" ( "SESSION_ID" );
Well, the result was encouraging, from 300 seconds to less than 2 seconds.
Clearly with the next apex upgrade, if things have not been fixed, I'll have to rebuild these indexes.
And now for the real questions:
did the missing indexes disappear from my installation for some reason?
Have they ever been there in the past?
Have they been removed or lost during the evolution of Apex?
Am I the only one having this problem?
Is anybody else using the Active Session report with some thousand active sessions?
Answers are welcome.
PS: Tyler (see the comments section) suggested to put a disclaimer at the top of this article, which I did, I also added that probably it might make sense to create these additional indexes only for the time needed and removing them thereafter.
7 comments:
I just checked my apex 4.1 installation and these indexes do not exist.
Table:
WWV_FLOW_ACTIVITY_LOG2$
Index:
WWV_FLOW_ACTIVITY_LOG2$_IDX1
Columns:
TIME_STAMP
Index:
WWV_FLOW_ACTIVITY_LOG2$_IDX2
Columns:
SECURITY_GROUP_ID
FLOW_ID
The same indexes and columns exist for WWV_FLOW_ACTIVITY_LOG1$. It is worth noting that the second index is new to 4.0, only the time_stamp column was indexed in 3.1 and 3.2.
It is interesting that these do not exists, but I wonder if they tried to limit the number of indexes because of the index maintenance involved with heavily used tables like this. Seems like an oversight though since you are going to want to access this data by session.
hi Tony,
thanks for your feedback.
I tried opening this page on apex.oracle.com that is supposedly crunching millions of page views per month, so I guess those two tables must contain millions of rows, and it took some 22 seconds to display the page.
I wonder if it's just the sheer power of the underlying server even if 22 seconds are not a negligible time for any query according to many Oracle gurus. I remember someone saying that anything running for more than 3 seconds must be regarded as slow...
At any rate, it's clear that there is some margin for improvement in the performance of this report, like it or not, 300 seconds even on XE are a very long time for a user to wait, more than enough to prepare and sip a coffee meanwhile ;-)
Thank you
Flavio
Flavio,
I think you should post a warning not to actually do this in production at the top of your post.
As I remoember it, there's an insert into this table(s) for every page view. Maintaining indexes is insanely expensive. For example, in the last benchmark I ran, adding 1 index to 1 column decreased the insert rate by a factor of 10, and that was on an Exadata X2-8 with 2 extra cells. Granted, this was an extreme insert + extreme query benchmark.
Anyway, your report may be faster, but you've possibly slowed down EVERY page view of every app. The data model was designed for the most important and common use-case.
Full table scans require good I/O throughput. EC2 may not be up to the task.
I'd love to see an active SQL monitoring report for both use cases, each with a hard parse, if you care to dig deeper.
Hi Tyler,
thanks for pointing out the risks, I added a disclaimer.
From Tom's articles about Exadata, it looks like the worst idea you can have on that platform is to create indexes :-D
Here we are at the antipodes, an XE database running on a small EC2 instance, it's like comparing a rocket to a bicycle (also in terms of price), I wouldn't be surprised to find out that adding an index to a "slow" server isn't doing so much damage as to add it to a super-fast server.
Ideally one should be able to create temporary indexes that are utilized just for the time necessary to perform whatever needs to be done and then they are removed (on a per-session or per-user basis, something like the global temporary tables).
If Oracle one day implements my idea, I want the royalties.
Thank you
Flavio
It's also worth noting that Apex 4.0 comes with 3 indexes on those two tables:
WWV_FLOW_ACTIVITY_LOGx$_IDX1
is on TIME_STAMP column
WWV_FLOW_ACTIVITY_LOGx$_IDX2
is on SECURITY_GROUP_ID and FLOW_ID
WWV_FLOW_ACTIVITY_LOGx$_IDX3
is on WEBSHEET_ID
where "x" is either 1 or 2.
Now, given the fact that I am not currently using websheet applications, the third index is totaly useless for me, I imagine that trading in that index for a new one on SESSION_ID would be a good deal.
Flavio,
Oracle is not storing NULLs in indexes-therefore if you are not using websheets you're not paying anything for that index.
Jan
Jan, you're right, that index is not going to make any real difference, so the swap isn't advantageous.
At any rate, the active session report has been very useful for me in many situations so having to wait like 15 minutes or more before getting to the page where the relevant session is listed is really a big waste of time, probably the best thing to do is to create the index temporarily and drop it when done, in the end once you know what to do, it takes very little time to set it up.
Thanks
Flavio
Post a Comment