MQTによるパフォーマンス改善
http://www7b.software.ibm.com/dmdd/library/techarticle/0208kuznetsov/0208kuznetsov.html
公式なページ
http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=ciiperft.htm
再計算の回避
巨大な表に対してSUM(...)を行う場合はインデックスよりも
MQT(Materialized Query Table/旧summary tables)を作成したほうが
劇的にパフォーマンスが改善する場合がある。
要するに、複数回同じような抽出を行うロジックがあるとして
対象データをいったんMQTに取得して
MQTを複数回検索するテクニック。
select SUM(AMOUNT), trans_dt
from CUSTOMER_ORDER
where trans_dt between '1/1/2001' and '12/31/2001'
group by trans_dt
MQTを作成する。
CREATE TABLE DB2INST2.SUMMARY_CUSTOMER_ORDER_2001 AS
(SELECT SUM(AMOUNT) AS TOTAL_SUM,
TRANS_DT,
STATUS
FROM DB2INST2.CUSTOMER_ORDER
WHERE TRANS_DT BETWEEN '1/1/2001' AND '12/31/2001'
GROUP BY TRANS_DT,
STATUS)
DATA INITIALLY DEFERRED REFRESH DEFERRED;
DATA INITIALLY DEFERREDの意味はCREATE TABLE時にデータを抽出させない指定。
アプリケーションでデータが必要なときにREFRESH TABLEする。
REFRESH TABLE SUMMARY_CUSTOMER_ORDER_2001;
テーブルスキャン/インデックススキャンの回避
頻繁にCOUNT(*)やSUM(AMOUNT)を発行するアプリケーションがあるとき
MQTを作成することで
常に最新のMQTにCOUNT(*)やSUM(AMOUNT)が入っている状態にできる。
CREATE TABLE CUSTOMER_ORDER_SUMMER AS
(
SELECT
COUNT(*) AS CNT_ALL,
COUNT(AMOUNT) AS CNT,
SUM(AMOUNT) AS TOTAL_SUM,
ORDER_DT
FROM CUSTOMER_ORDER
WHERE ORDER_DT BETWEEN '7/1/2003' AND '9/30/2003'
GROUP BY ORDER_DT
)
DATA INITIALLY DEFERRED REFRESH IMMEDIATE ENABLE QUERY OPTIMIZATION
REFRESH IMMEDIATE
元の表が更新されるとMQTもREFRESHされる指定。
元の表にDELETEされたとき
DELETE FROM CUSTOMER_ORDER WHERE TRANS_DT = ‘1/1/2002’;
DB2は特定の日付のレコードが消えたことのを見つけて
MQTの依存するレコードを削除する。
MQTに最新のCOUNTやSUMが入っているので
オプティマイザはMQTから結果だけを拾ってくるように最適化する。
めっちゃ条件が厳しい。
少しでも条件をはみ出ると
「SQL20058N マテリアライズ照会表 "SCS1.CUSTOMER_ORDER_SUMMER"に指定された全選択は無効です。 SQLSTATE=428EC」
が発生する。
http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8document.d2w/report?fn=r0000927.htm
When REFRESH IMMEDIATE is specified:
* the fullselect must be a subselect
* the fullselect cannot include a reference to a nickname (SQLSTATE 428EC)
* the subselect cannot include:
o functions that are not deterministic
o scalar fullselects
o predicates with fullselects
o special registers
* a GROUP BY clause must be included in the subselect unless REPLICATED is specified, in which case a GROUP BY clause is not allowed.
* The supported column functions are SUM, COUNT, COUNT_BIG and GROUPING (without DISTINCT).
The select list must contain a COUNT(*) or COUNT_BIG(*) column. If the materialized query table select list contains SUM(X),
where X is a nullable argument, the materialized query table must also have COUNT(X) in its select list.
These column functions cannot be part of any expressions.
* if the FROM clause references more than one table or view, it can only define an inner join without using the explicit INNER JOIN syntax
* all GROUP BY items must be included in the select list
* GROUPING SETS, CUBE and ROLLUP are supported. The GROUP BY items
and associated GROUPING column functions in the select list must form a unique key of the result set.
Thus, the following restrictions must be satisfied:
o no grouping sets may be repeated. For example, ROLLUP(X,Y), X is not allowed because it is equivalent to GROUPING SETS((X,Y),(X),(X))
o if X is a nullable GROUP BY item that appears within GROUPING SETS, CUBE, or ROLLUP, then GROUPING(X) must appear in the select list
o grouping on constants is not allowed
* a HAVING clause is not allowed
* if in a multiple partition database partition group, then either the partitioning key must be a subset of the GROUP BY items,
or REPLICATED must be specified
* if REPLICATED is specified, the table must have a unique key
別紙の検証参照。
MQTの複製を使うことでブロードキャストの回避
マルチパーティション構成時にブロードキャストを回避できる。
REFRESH IMMEDIATE と REFRESH DEFERRED
REFRESH IMMEDIATEの設定のMQTは以下のようなメリット/デメリットがある。
* Speed up the performance of relevant select statements.
* Are automatically chosen by the optimizer whenever it makes sense.
* Can degrade the performance of insert, update and delete statements.
* Cannot be updated directly.
* May occupy considerable disk space.
* May have exclusive locks held during updates of their base tables.
オプティマイザがベース表の代わりにMQTを使うような設定
通常の最適化レベル(5)ではオプティマイザはMQTを使った検索を採用する。
正しく最適化させるためにベースの表とMQTとインデックスの統計情報は必要。
CURRENT QUERY OPTIMIZATIONの設定
1 Restricted optimization. Useful when memory and processing resources are severely restrained.
Roughly equivalent to the optimization provided by Version 1.
2 Slight optimization. Specifies a level of optimization higher than that of Version 1,
but at significantly less optimization cost than levels 3 and above, especially for very complex queries.
3 Moderate optimization. Comes closest to matching the query optimization characteristics of DB2 for MVS/ESA.
5 Normal optimization. Recommended for a mixed environment using both simple transactions and complex queries.
7 Normal optimization. The same as query optimization 5
except that it does not reduce the amount of query optimization for complex dynamic SQL queries.
0 Minimal optimization. Use only when little or no optimization is required
(that is, for very simple queries on well-indexed tables).
9 Maximum optimization. Uses substantial memory and processing resources.
Use only if class 5 is insufficient (that is, for very complex and long-running queries that do not perform well at class 5).
In general, use a higher optimization class for static queries and for queries that you anticipate will take a long time to execute,
and a lower optimization class for simple queries that are submitted dynamically or that are run only a few times.
To set the query optimization for dynamic SQL statements, enter the following command in the Command Line Processor:
SET CURRENT QUERY OPTIMIZATION = n;
End of FILE.
2003/08/25-27
ugya@lycos.com