MQTによるパフォーマンス改善


http://www7b.software.ibm.com/dmdd/library/techarticle/0208kuznetsov/0208kuznetsov.html

公式なページ
http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=ciiperft.htm


再計算の回避


巨大な表に対してSUM(...)を行う場合はインデックスよりも
MQT(Materialized Query Table/旧summary tables)を作成したほうが
劇的にパフォーマンスが改善する場合がある。

要するに、複数回同じような抽出を行うロジックがあるとして
対象データをいったんMQTに取得して
MQTを複数回検索するテクニック。

select SUM(AMOUNT), trans_dt
  from CUSTOMER_ORDER
  where trans_dt between '1/1/2001' and '12/31/2001'
  group by trans_dt


MQTを作成する。

CREATE TABLE DB2INST2.SUMMARY_CUSTOMER_ORDER_2001 AS
(SELECT SUM(AMOUNT) AS TOTAL_SUM,
TRANS_DT,
STATUS
FROM DB2INST2.CUSTOMER_ORDER
WHERE TRANS_DT BETWEEN '1/1/2001' AND '12/31/2001'
GROUP BY TRANS_DT,
STATUS)
DATA INITIALLY DEFERRED REFRESH DEFERRED;

DATA INITIALLY DEFERREDの意味はCREATE TABLE時にデータを抽出させない指定。

アプリケーションでデータが必要なときにREFRESH TABLEする。

REFRESH TABLE SUMMARY_CUSTOMER_ORDER_2001;





テーブルスキャン/インデックススキャンの回避

頻繁にCOUNT(*)やSUM(AMOUNT)を発行するアプリケーションがあるとき
MQTを作成することで
常に最新のMQTにCOUNT(*)やSUM(AMOUNT)が入っている状態にできる。


CREATE TABLE CUSTOMER_ORDER_SUMMER AS
(
SELECT
COUNT(*) AS CNT_ALL,
COUNT(AMOUNT) AS CNT,
SUM(AMOUNT) AS TOTAL_SUM,
ORDER_DT
FROM CUSTOMER_ORDER
WHERE ORDER_DT BETWEEN '7/1/2003' AND '9/30/2003'
GROUP BY ORDER_DT
)
DATA INITIALLY DEFERRED REFRESH IMMEDIATE ENABLE QUERY OPTIMIZATION


REFRESH IMMEDIATE
	元の表が更新されるとMQTもREFRESHされる指定。

元の表にDELETEされたとき
DELETE FROM CUSTOMER_ORDER WHERE TRANS_DT = ‘1/1/2002’;

DB2は特定の日付のレコードが消えたことのを見つけて
MQTの依存するレコードを削除する。


MQTに最新のCOUNTやSUMが入っているので
オプティマイザはMQTから結果だけを拾ってくるように最適化する。

めっちゃ条件が厳しい。
少しでも条件をはみ出ると
「SQL20058N  マテリアライズ照会表 "SCS1.CUSTOMER_ORDER_SUMMER"に指定された全選択は無効です。  SQLSTATE=428EC」
が発生する。



http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8document.d2w/report?fn=r0000927.htm

When REFRESH IMMEDIATE is specified:

    * the fullselect must be a subselect
    * the fullselect cannot include a reference to a nickname (SQLSTATE 428EC)
    * the subselect cannot include:
          o functions that are not deterministic
          o scalar fullselects
          o predicates with fullselects
          o special registers 
    * a GROUP BY clause must be included in the subselect unless REPLICATED is specified, in which case a GROUP BY clause is not allowed.
    * The supported column functions are SUM, COUNT, COUNT_BIG and GROUPING (without DISTINCT). 
      The select list must contain a COUNT(*) or COUNT_BIG(*) column. If the materialized query table select list contains SUM(X), 
      where X is a nullable argument, the materialized query table must also have COUNT(X) in its select list. 
      These column functions cannot be part of any expressions.
    * if the FROM clause references more than one table or view, it can only define an inner join without using the explicit INNER JOIN syntax
    * all GROUP BY items must be included in the select list
    * GROUPING SETS, CUBE and ROLLUP are supported. The GROUP BY items 
      and associated GROUPING column functions in the select list must form a unique key of the result set. 
      Thus, the following restrictions must be satisfied:
          o no grouping sets may be repeated. For example, ROLLUP(X,Y), X is not allowed because it is equivalent to GROUPING SETS((X,Y),(X),(X))
          o if X is a nullable GROUP BY item that appears within GROUPING SETS, CUBE, or ROLLUP, then GROUPING(X) must appear in the select list
          o grouping on constants is not allowed 
    * a HAVING clause is not allowed
    * if in a multiple partition database partition group, then either the partitioning key must be a subset of the GROUP BY items, 
      or REPLICATED must be specified
    * if REPLICATED is specified, the table must have a unique key 


別紙の検証参照。




MQTの複製を使うことでブロードキャストの回避
マルチパーティション構成時にブロードキャストを回避できる。


REFRESH IMMEDIATE と REFRESH DEFERRED

REFRESH IMMEDIATEの設定のMQTは以下のようなメリット/デメリットがある。

    * Speed up the performance of relevant select statements.
    * Are automatically chosen by the optimizer whenever it makes sense.
    * Can degrade the performance of insert, update and delete statements.
    * Cannot be updated directly.
    * May occupy considerable disk space.
    * May have exclusive locks held during updates of their base tables.


オプティマイザがベース表の代わりにMQTを使うような設定

通常の最適化レベル(5)ではオプティマイザはMQTを使った検索を採用する。
正しく最適化させるためにベースの表とMQTとインデックスの統計情報は必要。









CURRENT QUERY OPTIMIZATIONの設定
1 Restricted optimization. Useful when memory and processing resources are severely restrained. 
Roughly equivalent to the optimization provided by Version 1. 

2 Slight optimization. Specifies a level of optimization higher than that of Version 1, 
but at significantly less optimization cost than levels 3 and above, especially for very complex queries. 

3 Moderate optimization. Comes closest to matching the query optimization characteristics of DB2 for MVS/ESA. 

5 Normal optimization. Recommended for a mixed environment using both simple transactions and complex queries. 

7 Normal optimization. The same as query optimization 5 
except that it does not reduce the amount of query optimization for complex dynamic SQL queries. 

0 Minimal optimization. Use only when little or no optimization is required 
(that is, for very simple queries on well-indexed tables). 

9 Maximum optimization. Uses substantial memory and processing resources. 
Use only if class 5 is insufficient (that is, for very complex and long-running queries that do not perform well at class 5). 
In general, use a higher optimization class for static queries and for queries that you anticipate will take a long time to execute, 
and a lower optimization class for simple queries that are submitted dynamically or that are run only a few times. 

To set the query optimization for dynamic SQL statements, enter the following command in the Command Line Processor: 

SET CURRENT QUERY OPTIMIZATION = n;

End of FILE.


2003/08/25-27 ugya@lycos.com