http://www7b.software.ibm.com/dmdd/library/techarticle/0208kuznetsov/0208kuznetsov.html

Official page
http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=ciiperft.htm

Avoiding recomputation

When running SUM(...) against a huge table, creating an MQT (Materialized Query Table, formerly called a summary table) can sometimes improve performance far more dramatically than adding an index.
In short, when the application logic runs similar extractions several times, the technique is to load the target data into an MQT once and then query the MQT repeatedly.

Example query:

    SELECT SUM(AMOUNT), TRANS_DT
      FROM CUSTOMER_ORDER
     WHERE TRANS_DT BETWEEN '1/1/2001' AND '12/31/2001'
     GROUP BY TRANS_DT

Create the MQT:

    CREATE TABLE DB2INST2.SUMMARY_CUSTOMER_ORDER_2001 AS
      (SELECT SUM(AMOUNT) AS TOTAL_SUM, TRANS_DT, STATUS
         FROM DB2INST2.CUSTOMER_ORDER
        WHERE TRANS_DT BETWEEN '1/1/2001' AND '12/31/2001'
        GROUP BY TRANS_DT, STATUS)
      DATA INITIALLY DEFERRED REFRESH DEFERRED;

DATA INITIALLY DEFERRED means that no data is extracted at CREATE TABLE time.
Run REFRESH TABLE when the application needs the data:

    REFRESH TABLE SUMMARY_CUSTOMER_ORDER_2001;

Avoiding table scans / index scans

When an application frequently issues COUNT(*) or SUM(AMOUNT), creating an MQT keeps up-to-date COUNT(*) and SUM(AMOUNT) values available in the MQT at all times.

    CREATE TABLE CUSTOMER_ORDER_SUMMER AS
      (SELECT COUNT(*) AS CNT_ALL,
              COUNT(AMOUNT) AS CNT,
              SUM(AMOUNT) AS TOTAL_SUM,
              ORDER_DT
         FROM CUSTOMER_ORDER
        WHERE ORDER_DT BETWEEN '7/1/2003' AND '9/30/2003'
        GROUP BY ORDER_DT)
      DATA INITIALLY DEFERRED REFRESH IMMEDIATE
      ENABLE QUERY OPTIMIZATION;

REFRESH IMMEDIATE specifies that the MQT is refreshed whenever the base table is updated.
When rows are deleted from the base table, for example:

    DELETE FROM CUSTOMER_ORDER WHERE TRANS_DT = '1/1/2002';

DB2 detects that the records for that date have disappeared and deletes the dependent rows from the MQT.
Because the MQT holds the latest COUNT and SUM, the optimizer rewrites the query so that it only picks up the results from the MQT.

The conditions are extremely strict. Stepping even slightly outside them raises:

    SQL20058N  The fullselect specified for the materialized query table "SCS1.CUSTOMER_ORDER_SUMMER" is not valid.  SQLSTATE=428EC

http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8document.d2w/report?fn=r0000927.htm

When REFRESH IMMEDIATE is specified:
* the fullselect must be a subselect
* the fullselect cannot include a reference to a nickname (SQLSTATE 428EC)
* the subselect cannot include:
    o functions that are not deterministic
    o scalar fullselects
    o predicates with fullselects
    o special registers
* a GROUP BY clause must be included in the subselect unless REPLICATED is specified, in which case a GROUP BY clause is not allowed.
* The supported column functions are SUM, COUNT, COUNT_BIG and GROUPING (without DISTINCT). The select list must contain a COUNT(*) or COUNT_BIG(*) column. If the materialized query table select list contains SUM(X), where X is a nullable argument, the materialized query table must also have COUNT(X) in its select list. These column functions cannot be part of any expressions.
* if the FROM clause references more than one table or view, it can only define an inner join without using the explicit INNER JOIN syntax
* all GROUP BY items must be included in the select list
* GROUPING SETS, CUBE and ROLLUP are supported. The GROUP BY items and associated GROUPING column functions in the select list must form a unique key of the result set. Thus, the following restrictions must be satisfied:
    o no grouping sets may be repeated. For example, ROLLUP(X,Y), X is not allowed because it is equivalent to GROUPING SETS((X,Y),(X),(X))
    o if X is a nullable GROUP BY item that appears within GROUPING SETS, CUBE, or ROLLUP, then GROUPING(X) must appear in the select list
    o grouping on constants is not allowed
* a HAVING clause is not allowed
* if in a multiple partition database partition group, then either the partitioning key must be a subset of the GROUP BY items, or REPLICATED must be specified
* if REPLICATED is specified, the table must have a unique key

See the separate verification notes.

Avoiding broadcasts by using replicated MQTs

In a multi-partition configuration, a replicated MQT makes it possible to avoid broadcasts.

REFRESH IMMEDIATE and REFRESH DEFERRED

MQTs defined with REFRESH IMMEDIATE have the following advantages and disadvantages:
* Speed up the performance of relevant select statements.
* Are automatically chosen by the optimizer whenever it makes sense.
* Can degrade the performance of insert, update and delete statements.
* Cannot be updated directly.
* May occupy considerable disk space.
* May have exclusive locks held during updates of their base tables.

Settings that make the optimizer use the MQT instead of the base table

At the normal optimization level (5), the optimizer adopts access through the MQT.
For the optimization to work correctly, statistics are needed for the base table, the MQT, and their indexes.

CURRENT QUERY OPTIMIZATION settings

0  Minimal optimization. Use only when little or no optimization is required (that is, for very simple queries on well-indexed tables).
1  Restricted optimization. Useful when memory and processing resources are severely restrained. Roughly equivalent to the optimization provided by Version 1.
2  Slight optimization. Specifies a level of optimization higher than that of Version 1, but at significantly less optimization cost than levels 3 and above, especially for very complex queries.
3  Moderate optimization. Comes closest to matching the query optimization characteristics of DB2 for MVS/ESA.
5  Normal optimization. Recommended for a mixed environment using both simple transactions and complex queries.
7  Normal optimization. The same as query optimization 5 except that it does not reduce the amount of query optimization for complex dynamic SQL queries.
9  Maximum optimization. Uses substantial memory and processing resources. Use only if class 5 is insufficient (that is, for very complex and long-running queries that do not perform well at class 5).

In general, use a higher optimization class for static queries and for queries that you anticipate will take a long time to execute, and a lower optimization class for simple queries that are submitted dynamically or that are run only a few times.

To set the query optimization for dynamic SQL statements, enter the following command in the Command Line Processor:

    SET CURRENT QUERY OPTIMIZATION = n;
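
The optimizer settings above can be pulled together into one Command Line Processor session. What follows is only a rough sketch, not a verified procedure. It reuses the DB2INST2.SUMMARY_CUSTOMER_ORDER_2001 MQT from earlier in these notes. Two things in it are assumptions that do not appear in the notes above: the RUNSTATS options were picked purely for illustration, and the CURRENT REFRESH AGE special register is included because, as far as I know, a REFRESH DEFERRED MQT is only considered by the optimizer when that register is set to ANY.

```sql
-- Populate the deferred MQT (it is empty after CREATE TABLE
-- because of DATA INITIALLY DEFERRED).
REFRESH TABLE DB2INST2.SUMMARY_CUSTOMER_ORDER_2001;

-- Collect statistics on the base table and on the MQT.
-- RUNSTATS is a CLP command, not an SQL statement; the options
-- shown here are an illustrative assumption.
RUNSTATS ON TABLE DB2INST2.CUSTOMER_ORDER
  WITH DISTRIBUTION AND INDEXES ALL;
RUNSTATS ON TABLE DB2INST2.SUMMARY_CUSTOMER_ORDER_2001
  WITH DISTRIBUTION AND INDEXES ALL;

-- Assumption (not in the notes above): allow the optimizer to
-- consider REFRESH DEFERRED MQTs, whose data may be stale.
SET CURRENT REFRESH AGE ANY;

-- Normal optimization class, as described above.
SET CURRENT QUERY OPTIMIZATION = 5;

-- The query is still written against the base table; the optimizer
-- decides whether to answer it from the MQT.
SELECT SUM(AMOUNT), TRANS_DT
  FROM DB2INST2.CUSTOMER_ORDER
 WHERE TRANS_DT BETWEEN '1/1/2001' AND '12/31/2001'
 GROUP BY TRANS_DT;
```

Whether the MQT is actually chosen is a cost-based decision, so it is worth confirming in the access plan (for example with db2expln) rather than assuming it. A REFRESH IMMEDIATE MQT such as CUSTOMER_ORDER_SUMMER should not need the CURRENT REFRESH AGE setting, since its contents track the base table.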