Friday, January 8, 2016

Efficient SQL Statements

Driving Tables (RBO Only)
The structure of the FROM and WHERE clauses of DML statements can be tailored to improve the performance of the statement. The rules vary depending on whether the database engine is using the Rule or Cost based optimizer. The situation is further complicated by the fact that the engine may perform a Merge Join or a Nested Loop join to retrieve the data. Despite this, there are a few rules you can use to improve the performance of your SQL.
Oracle processes result sets a table at a time. It starts by retrieving all the data for the first (driving) table. Once this data is retrieved it is used to limit the number of rows processed for subsequent (driven) tables. In the case of multiple table joins, the driving table limits the rows processed for the first driven table. Once processed, this combined set of data is the driving set for the second driven table etc. Roughly translated into English, this means that it is best to process tables that will retrieve a small number of rows first. The optimizer will do this to the best of it's ability regardless of the structure of the DML, but the following factors may help.
Both the Rule and Cost based optimizer select a driving table for each query. If a decision cannot be made, the order of processing is from the end of the FROM clause to the start. Therefore, you should always place your driving table at the end of the FROM clause. Subsequent driven tables should be placed in order so that those retrieving the most rows are nearer to the start of the FROM clause. Confusingly, the WHERE clause should be written in the opposite order, with the driving tables conditions first and the final driven table last. i.e.
FROM  d, c, b, a
WHERE a.join_column = 12345
AND   a.join_column = b.join_column
AND   b.join_column = c.join_column
AND   c.join_column = d.join_column;
If we now want to limit the rows brought back from the "D" table we may write the following.
FROM  d, c, b, a
WHERE a.join_column = 12345
AND   a.join_column = b.join_column
AND   b.join_column = c.join_column
AND   c.join_column = d.join_column
AND   d.name = 'JONES';
Depending on the number of rows and the presence of indexes, Oracle my now pick "D" as the driving table. Since "D" now has two limiting factors (join_column and name), it may be a better candidate as a driving table so the statement may be better written as follows.
FROM  c, b, a, d
WHERE d.name = 'JONES'
AND   d.join_column = 12345
AND   d.join_column = a.join_column
AND   a.join_column = b.join_column
AND   b.join_column = c.join_column
This grouping of limiting factors will guide the optimizer more efficiently making table "D" return relatively few rows, and so make it a more efficient driving table.
Remember, the order of the items in both the FROM and WHERE clause will not force the optimizer to pick a specific table as a driving table, but it may influence it's decision. The grouping of limiting conditions onto a single table will reduce the number of rows returned from that table, and will therefore make it a stronger candidate for becoming the driving table.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.