Thursday, 28 July 2016

Is Your Database Healthy?

Despite the sophistication of the latest DB2 software versions and the power of current IBM z/server technology, it is still possible for performance and data availability to deteriorate due to a variety of things, including increased dataset extents, loss of clustering, index page splits, and other factors.

This article presents simple SQL statements*  that the database administrator (DBA) can execute against the DB2 catalog to determine if one or more application databases suffer from common maladies, and what the DBA can do to fix or mitigate potential problems.

Application Table and Tablespace Performance


In DB2 for z/OS, tablespaces are collections of physical datasets that contain table rows.  As table rows are inserted, updated, and deleted the internal structure of the tablespace can become disorganized, increasing unused space, lengthening row retrieval elapsed times, and increasing CPU used.  Use the following queries to find specific issues.
  • Tablespace partitions with relocated rows. When an application updates a row, DB2 attempts to write it back to the same location in the tablespace. However, sometimes no space is available; for example, if the row length has increased. In these cases, DB2 inserts the row somewhere else in the tablespace, and replaces the original row location with a pointer to the new row location. As this happens to more and more rows, the pointers take up more and more space. In addition, rows may no longer be stored in clustering sequence.  Address this issue by executing the Reorg utility on the tablespace.
    SELECT  TPT.DBNAME, TPT.TSNAME
    ,SUBSTR(DIGITS(TPT.PARTITION),4,2)    AS P#    
    ,DECIMAL(TPT.CARDF,11,0)          AS CARD  
    ,'FARINDREF OVER 5%    (FARINDREF, % OF CARD)'
                                     AS DESCRIPTION      
    ,TPT.FARINDREF
    ,DECIMAL(TPT.FARINDREF * 100. / TPT.CARDF,11,0)
             AS  PCT_OF_CARD    
    ,DATE(TPT.STATSTIME)  AS STATS_DATE
    FROM    SYSIBM.SYSTABLEPART   TPT      
    WHERE  TPT.DBNAME NOT LIKE 'SYS%'      
       AND TPT.DBNAME NOT LIKE 'DSN%'  
    AND  (TPT.FARINDREF * 20 > TPT.CARD)  -- OVER 5%
    ORDER BY  7 DESC, 1, 2  ;
  • Tablespaces that are poorly compressed.  The Compress option specifies that DB2 is to use a common data compression algorithm to store table rows. In general, textual data can benefit greatly from this, with compression rates as high as 75% or greater.  However, there is a cost: as rows are stored or retrieved, DB2 must compress or decompress the rows, and this costs CPU cycles. The following query determines if a tablespace is poorly compressed (the criteria used is below 10% compression).  If you have tablespaces like this, consider removing the Compress option and running the Reorg utility, since table access is costing CPU cycles without significant space gains.
    SELECT  TPT.DBNAME, TPT.TSNAME, TPT.PARTITION    AS P#
      ,DECIMAL(TPT.CARDF,11,0)
      ,TPT.PAGESAVE     AS SAVED    
      FROM    SYSIBM.SYSTABLEPART  TPT
      WHERE  TPT.DBNAME NOT LIKE 'SYS%'    
        AND TPT.DBNAME NOT LIKE 'DSN%'
        AND TPT.COMPRESS = 'Y'  
        AND TPT.PAGESAVE < 10  
    ORDER BY  5, 1, 2, 3 ;
  • Tablespaces without recent backups. Database administrators (DBAs) back up tablespaces using the Image Copy utility. This is done for several reasons: to provide point-in-time recovery data if a rogue application overlays or changes data; to serve as a restore point if an application fails and must be re-run with the tablespace in its original condition; and as a contingency in case of a disaster. Any tablespaces without backups, or without recent backups, create potential recovery issues. The following query lists all tablespaces without backups within the last 30 days.  Analyze this list, and update regular backup procedures to include any tablespaces that may require recovery.
    SELECT  TPT.DBNAME, TPT.TSNAME, TPT.PARTITION
    FROM  SYSIBM.SYSTABLEPART   TPT
    WHERE  TPT.DBNAME NOT LIKE 'DSN%'  
     AND NOT EXISTS
    (SELECT  1  FROM  SYSIBM.SYSCOPY   CPY
      WHERE  CPY.DBNAME = TPT.DBNAME    
       AND CPY.TSNAME = TPT.TSNAME    
       AND  (    CPY.DSNUM = TPT.PARTITION    
             OR  CPY.DSNUM = 0)
       AND  CPY.ICTYPE IN ('F', 'I')  
       AND  CPY.TIMESTAMP >  (CURRENT TIMESTAMP - 30 DAYS)   )    
    ORDER BY  1, 2, 3  

Application Index Performance

In DB2, table indexes are used for multiple reasons:
  • To support Primary Keys;
  • To support uniqueness constraints;
  • To provide a mechanism for clustering table rows in order by key value; and
  • To serve as a high-performance access path to keyed data.
It is the third and fourth reasons that we will concentrate on.  Many applications issue queries for single rows or small numbers of rows that contain the same key value.  Examples include: an on-line order entry program that accesses a product table row by its product number; a payroll application that updates an employee table row based on the employee id; and a health care app that accesses a patient’s current treatments based on a medical id number.

Indexes can lose their performance edge by frequent use, lack of good design, or lack of frequent reorganization. Here are a few queries to help you find possible low-performance indexes.
  • Clustering indexes with poor clustering. A clustering index is used to assist DB2 in maintaining table rows in physical sequence by a column value. For example, a customer table may benefit from having rows stored physically ascending by customer number. DB2 will not attempt to maintain this clustering without a clustering index. In addition, there may be times when DB2 cannot keep this sequence. For example, DB2 may attempt to store a row in a preferred physical location but no free space exists for that row. As time goes on, tables become less and less clustered, leading to increased query elapsed times.  Use the following query to identify tables currently having a poor clustering percentage.  Execute the Reorg utility on the table to return it to full clustering.
    SELECT  IDX.DBNAME, IDX.NAME, IDX.TBNAME  
    ,DECIMAL(IDX.FULLKEYCARDF,11,0)  AS  FULLKEYCARD  
    ,'CL. IDX, POOR CL% (CL-RATIO)'  AS DESCRIPTION
    ,IDX.CLUSTERRATIO  
    ,DATE(IDX.STATSTIME) AS STATS_DATE    
    FROM   SYSIBM.SYSINDEXES     IDX
    WHERE  IDX.DBNAME NOT LIKE 'SYS%'
      AND IDX.DBNAME NOT LIKE 'DSN%'  
      AND IDX.CLUSTERING = 'Y'  
      AND IDX.CLUSTERRATIO  < 66
    ORDER BY 6, 1, 2  
  • Indexes with poor cardinality.  Indexes work best when they contain entries for unique rows. However, sometimes indexes are not designed in this way, and instead contain duplicate entries. Consider a customer address table with a column called State. An index on this column only would not generally be useful, as there are a relatively small number of states compared to the large number of customers. This can be generalized as follows: any index where each key entry consists of an average of 100 duplicates is probably not very useful.  Use the following query to determine these indexes. Address the issue by re-defining the indexes to contain a larger cardinality of values.
    SELECT TBL.DBNAME, TBL.TSNAME, TBL.NAME, IDX.NAME    
    ,DECIMAL(IDX.FULLKEYCARDF,11,0)  AS IDX_FKYCRD    
    ,DECIMAL(TBL.CARDF       ,11,0)  AS TBL_CARD
    FROM  SYSIBM.SYSTABLES      TBL
         ,SYSIBM.SYSINDEXES     IDX
    WHERE  TBL.DBNAME  = IDX.DBNAME  
       AND TBL.NAME    = IDX.TBNAME    
       AND TBL.CREATOR = IDX.TBCREATOR
       AND TBL.DBNAME NOT LIKE 'SYS%'  
       AND TBL.DBNAME NOT LIKE 'DSN%'  
       AND TBL.TYPE = 'T'  
    AND  IDX.FULLKEYCARDF * 100  <  TBL.CARDF  
    ORDER BY 5  
  • Proliferation of Indexes.  While indexes are frequently used for performance enhancements, having a large number of indexes on a single table may indicate one or more of the following:  poor index design, especially if two or more indexes are identical; poor index design if two indexes share the same initial columns (Index #1 contains columns A and B, while Index #2 contains columns A, B and C); potential table maintenance issues, as a reorganization of the table usually forces reorganization of all indexes, thus extending Reorg utility elapsed time; and potentially elongated recovery time, since recovering a table after a failure or a disaster also requires recovering or rebuilding all the indexes. Use the following query to list tables having more than seven indexes. Use this list to analyze the noted tables and their indexes for possible issues.
    SELECT  TBL.CREATOR , TBL.NAME , COUNT(*)   AS #_IDX  
    FROM  SYSIBM.SYSTABLES  TBL    
         ,SYSIBM.SYSINDEXES IDX    
    WHERE  TBL.TYPE = 'T'
     AND  TBL.NAME    = IDX.TBNAME  
     AND  TBL.CREATOR = IDX.TBCREATOR

     AND TBL.DBNAME NOT LIKE 'DSN%'  
     AND TBL.DBNAME NOT LIKE 'SYS%'  
    GROUP BY TBL.CREATOR, TBL.NAME    
    HAVING  COUNT(*) > 7  
    ORDER BY  3 DESC