
Ingres resource limiting: Part 1 (Ingres Query Execution Plan)

There were complaints about long delays and strain on the hardware (Ingres II 2.6 on HP-UX).
The problem was that, with more than 100 users working on the system concurrently, nobody knew which report was bogging the system down. Since there are over 10 full-blown applications, each with more than 200 reports (2300 in total), it would be really hard to identify the actual SQL queries and optimize them.
As the applications in question range from warehouse monitoring to bookings, the term 'report' covers a mix of things. There are reports produced on a weekly, monthly or annual basis, and there are frequent printouts with all sorts of details during a user's daily work.

An interim solution was to capture the QEP (query execution plan) for each report before it is generated and check its estimated cost against a predefined threshold for the system. If the estimate for the query is above the threshold, a warning message asks the user whether to continue with the report generation. That was the requirement set by the client: not to abort the query automatically, but to inform the user and let him decide whether to continue.
The complication lies in letting the user decide. Without that requirement, simply issuing 'set maxio' with the threshold value would make the Ingres resource limiter preemptively refuse to run any query that the optimizer estimates will exceed the threshold. Since that is not what the client wanted, we had to work around it by parsing the QEP output.

An example of a QEP dump is:

QUERY PLAN 5,1, no timeout, of main query

                        FSM Join(col1)
                        Heap
                        Pages 1 Tups 156
                        D9 C40
               /                        \
        Proj-rest                       Proj-rest
        Sorted(eno)                     Sorted(eno)
        Pages 1 Tups 156                Pages 1 Tups 156
        D8 C1                           D1 C1
     /                               /
arel                            brel
Hashed(col1)                    Isam(col1)
Pages 70 Tups 156               Pages 3 Tups 156


What we are looking for here is the string D9 C40 in the top node, which says that the query is estimated to need 9 disk I/Os (the C value is the estimated CPU cost). Let's say the threshold value is 5; then 9 > 5, so the message is sent to the user's console. The good thing about the QEP is that the cost of each node is summed into the totals of the main (top) node, hence only the main node needs to be parsed.
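The check described above can be sketched in a few lines. This is only an illustration in Python, not the C procedure used on site; the names `root_cost` and `exceeds_threshold` are made up for this example, and it assumes a dump containing a single plan whose top node is printed first, as in the sample.

```python
import re
from typing import Optional, Tuple

# Matches a cost line such as "D9 C40" (estimated disk I/O, estimated CPU).
COST_RE = re.compile(r"\bD(\d+)\s+C(\d+)\b")

SAMPLE_QEP = """QUERY PLAN 5,1, no timeout, of main query

FSM Join(col1)
Heap
Pages 1 Tups 156
D9 C40
"""


def root_cost(qep_text: str) -> Optional[Tuple[int, int]]:
    """Return (disk_io, cpu) of the top node, or None if no cost line is found.

    Assumption: the top node is printed first, so the first D/C pair after
    the "QUERY PLAN" header carries the totals for the whole query.
    """
    seen_header = False
    for line in qep_text.splitlines():
        if line.lstrip().startswith("QUERY PLAN"):
            seen_header = True
            continue
        if seen_header:
            match = COST_RE.search(line)
            if match:
                return int(match.group(1)), int(match.group(2))
    return None


def exceeds_threshold(qep_text: str, max_io: int) -> bool:
    """True when the estimated disk I/O of the top node is above max_io."""
    cost = root_cost(qep_text)
    return cost is not None and cost[0] > max_io


if __name__ == "__main__":
    print(root_cost(SAMPLE_QEP))
    print(exceeds_threshold(SAMPLE_QEP, 5))
```

With the sample dump and a threshold of 5, the top node's D value of 9 is above the limit, so the user would be warned.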

The directives 'set qep; set optimizeonly' make Ingres generate the query plan without going on to execute the actual query.

The problem was that Ingres has no way of returning these values programmatically so that a function can use them, as is needed in this case; it just dumps the plan on screen.

So a C procedure had to be written, which is called from inside the 4GL application.
The C procedure runs the 'report' command, passing 'set qep; set optimizeonly' as parameters to the report so that the report's query is not executed. Instead the QEP is produced, which the procedure scans for the I/O value sought; that value is then passed back to the 4GL application.
It also passes back a few other values, such as a timestamp and possible error values, which can be used for logging purposes. In addition, it logs which report is running, since it is considered problematic, so that it can be analyzed and optimized later.
For the estimates to be realistic, up-to-date statistics must have been collected by running optimizedb, which should not be a problem as it is run frequently as a cron job.
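The flow of that procedure can be outlined as follows. This is a rough Python sketch rather than the actual C code: `estimate_io` and `decide` are hypothetical names, the command line is supplied by the caller because the exact 'report' invocation is not shown here, and it is an assumption that the plan is written to standard output. The record fields mirror the log columns (timestamp, result, custommessage).

```python
import re
import subprocess
import time
from typing import Callable, Dict, List

# First "D<n> C<n>" pair in the output is taken as the top node's cost.
COST_RE = re.compile(r"\bD(\d+)\s+C(\d+)\b")


def estimate_io(command: List[str], timeout_s: float = 60.0) -> Dict[str, object]:
    """Run `command`, capture its output and extract the estimated disk I/O.

    `command` is whatever invocation produces the QEP dump (hypothetical
    here); the plan is assumed to arrive on standard output.
    """
    record: Dict[str, object] = {
        "timestamp": int(time.time()),
        "result": None,
        "error": None,
    }
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        record["error"] = str(exc)
        return record
    match = COST_RE.search(proc.stdout)
    if match is None:
        record["error"] = "no cost line found in output"
    else:
        record["result"] = int(match.group(1))
    return record


def decide(record: Dict[str, object], max_io: int,
           ask_user: Callable[[int], bool]) -> Dict[str, object]:
    """Ask the user only when the estimate is above max_io.

    `ask_user` returns True to continue with the report. When no estimate
    is available the report is allowed to run (a choice made for this sketch).
    """
    io = record["result"]
    if io is not None and io > max_io and not ask_user(io):
        record["custommessage"] = "Report cancelled"
    else:
        record["custommessage"] = "Report not cancelled"
    return record
```

The caller would then write the returned record to the log table, together with the name of the report, whichever way the user answered.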

The test was successful both on the test site and on the production site.
On certain occasions it produced results indicating that going ahead with the report generation would put strain on the system, and users responded by canceling the report before it was actually generated.

An actual sample of the log table on site follows:

┌──────────┬──────┬────────────────────┐
│timestamp │result│custommessage       │
├──────────┼──────┼────────────────────┤
│1213165816│    76│Report not cancelled│
│1213168619│    76│Report not cancelled│
│1213171751│    76│Report not cancelled│
│1213256073│  3926│Report cancelled    │
│1213257002│    64│Report not cancelled│
│1213337320│    64│Report not cancelled│
│1213682936│    72│Report not cancelled│
│1213702385│   679│Report cancelled    │
│1213771574│    64│Report cancelled    │
│1213779268│    64│Report not cancelled│
│1213780153│    68│Report not cancelled│
│1213854779│    76│Report cancelled    │
└──────────┴──────┴────────────────────┘

Clearly a report requiring 3926 or 679 disk I/Os should not be run during the daytime!
This is especially true of the first one, which is an annual statistics report; fortunately the user did cancel it, since he was given the option.
