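A database is a fairly large application, and a busy database consumes substantial memory, CPU, I/O, and network resources. SQL optimization is one way to optimize a database. To get the most out of SQL optimization, first identify the SQL statements that consume the most resources (Top SQL), for example the statements with the highest I/O consumption.
Database resources span several dimensions, such as CPU, memory, and I/O. To find the most expensive SQL statements along each dimension, you can use the pg_stat_statements extension to collect resource usage statistics and analyze Top SQL.
This topic uses examples to show how to create the pg_stat_statements extension, how to analyze Top SQL, and how to reset the statistics.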
Run the following command in the database where you want to query Top SQL to create the pg_stat_statements extension.
CREATE EXTENSION pg_stat_statements;
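Before relying on the view, it can help to confirm that the extension is actually in place. The following is a minimal check, assuming the pg_stat_statements module is already listed in shared_preload_libraries (on a self-managed PostgreSQL server, changing that setting requires a restart).

```sql
-- The module must be preloaded for the view to collect data.
SHOW shared_preload_libraries;

-- Confirm that the extension exists in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_stat_statements';

-- Smoke test: the view should be queryable without an error.
SELECT count(*) FROM pg_stat_statements;
```

If the last statement reports that the module must be loaded via shared_preload_libraries, the extension was created but the library is not preloaded.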
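pg_stat_statements output
By querying the pg_stat_statements view, you can obtain statistics on database resource consumption. Constants in the filter conditions of SQL statements are replaced with parameter placeholders in pg_stat_statements, so statements that differ only in their literal values are not listed repeatedly.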
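The placeholder substitution described above can be observed with a small experiment. This is only a sketch: the table demo_orders is hypothetical and exists purely for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_orders (id int PRIMARY KEY, status text);

-- Two calls that differ only in the constant.
SELECT * FROM demo_orders WHERE id = 1;
SELECT * FROM demo_orders WHERE id = 2;

-- Look up how the two calls were recorded.
SELECT query, calls
FROM pg_stat_statements
WHERE query LIKE '%demo_orders WHERE id%';

-- Clean up the illustration table.
DROP TABLE demo_orders;
```

With default settings, both calls should be folded into a single entry whose constant is shown as a placeholder ($1 in PostgreSQL 10 and later, ? in earlier versions).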
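The pg_stat_statements view contains important information such as the following:
- The number of calls of each SQL statement, its total, minimum, maximum, and mean execution time, the standard deviation of its execution time (which reveals jitter), and the total number of rows retrieved or affected.
- Shared buffer usage: hits, misses (blocks read), blocks dirtied, and blocks written.
- Local buffer usage: hits, misses (blocks read), blocks dirtied, and blocks written.
- Temp buffer usage: blocks read and blocks written.
- The time spent reading and writing data blocks.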
The following table describes the columns in the pg_stat_statements output.
| Column | Type | References | Description |
|---|---|---|---|
| userid | oid | pg_authid.oid | OID of user who executed the statement. |
| dbid | oid | pg_database.oid | OID of database in which the statement was executed. |
| queryid | bigint | None | Internal hash code, computed from the statement’s parse tree. |
| query | text | None | Text of a representative statement. |
| calls | bigint | None | Number of times executed. |
| total_time | double precision | None | Total time spent in the statement, in milliseconds. |
| min_time | double precision | None | Minimum time spent in the statement, in milliseconds. |
| max_time | double precision | None | Maximum time spent in the statement, in milliseconds. |
| mean_time | double precision | None | Mean time spent in the statement, in milliseconds. |
| stddev_time | double precision | None | Population standard deviation of time spent in the statement, in milliseconds. |
| rows | bigint | None | Total number of rows retrieved or affected by the statement. |
| shared_blks_hit | bigint | None | Total number of shared block cache hits by the statement. |
| shared_blks_read | bigint | None | Total number of shared blocks read by the statement. |
| shared_blks_dirtied | bigint | None | Total number of shared blocks dirtied by the statement. |
| shared_blks_written | bigint | None | Total number of shared blocks written by the statement. |
| local_blks_hit | bigint | None | Total number of local block cache hits by the statement. |
| local_blks_read | bigint | None | Total number of local blocks read by the statement. |
| local_blks_dirtied | bigint | None | Total number of local blocks dirtied by the statement. |
| local_blks_written | bigint | None | Total number of local blocks written by the statement. |
| temp_blks_read | bigint | None | Total number of temp blocks read by the statement. |
| temp_blks_written | bigint | None | Total number of temp blocks written by the statement. |
| blk_read_time | double precision | None | Total time the statement spent reading blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). |
| blk_write_time | double precision | None | Total time the statement spent writing blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). |
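As a rough illustration of how these columns combine, the sketch below first checks track_io_timing (the two block timing columns stay at zero when it is off) and then lists the most time-consuming statements together with a shared buffer hit ratio. It uses the column names listed in the table; in PostgreSQL 13 and later, total_time and mean_time are named total_exec_time and mean_exec_time, so the query would need adjusting there. The aliases are illustrative.

```sql
-- blk_read_time and blk_write_time remain zero unless this is on.
SHOW track_io_timing;

-- Per-statement overview with a shared buffer hit ratio.
SELECT
    userid::regrole                  AS role_name,
    dbid,
    calls,
    round(total_time::numeric, 2)    AS total_ms,
    round(mean_time::numeric, 2)     AS mean_ms,
    rows,
    -- NULLIF avoids division by zero for statements that touched no shared blocks.
    round(100.0 * shared_blks_hit
          / NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS hit_pct,
    left(query, 60)                  AS query_head
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
```

A low hit_pct on a frequently called statement suggests that it reads most of its blocks from outside shared buffers.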
Analyze Top SQL
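- SQL statements that consume the most I/O
    - Run the following command to query the top 5 SQL statements by I/O time per call.
      SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY (blk_read_time+blk_write_time)/calls DESC LIMIT 5;
    - Run the following command to query the top 5 SQL statements by total I/O time.
      SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY (blk_read_time+blk_write_time) DESC LIMIT 5;
- SQL statements that take the most time
    - Run the following command to query the top 5 SQL statements by execution time per call.
      SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 5;
    - Run the following command to query the top 5 SQL statements by total execution time.
      SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
- SQL statements with the most severe response time jitter
  Run the following command to query the top 5 SQL statements by response time jitter.
  SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY stddev_time DESC LIMIT 5;
- SQL statements that consume the most shared memory
  Run the following command to query the top 5 SQL statements by shared memory consumption.
  SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY (shared_blks_hit+shared_blks_dirtied) DESC LIMIT 5;
- SQL statements that consume the most temporary space
  Run the following command to query the top 5 SQL statements by temporary space consumption.
  SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY temp_blks_written DESC LIMIT 5;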
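The queries in this section return only the role, the database, and the statement text. When comparing candidates, it is often useful to see the metric that drove the ranking. The following is a sketch of the per-call I/O query with the metrics included; the column aliases are illustrative, and the NULLIF guard is a defensive assumption for entries whose call count is zero.

```sql
-- Top 5 statements by average I/O time per call, with the metrics shown.
SELECT
    userid::regrole                                      AS role_name,
    dbid,
    calls,
    blk_read_time + blk_write_time                       AS io_ms_total,
    -- NULLIF turns a zero call count into NULL instead of raising an error.
    (blk_read_time + blk_write_time) / NULLIF(calls, 0)  AS io_ms_per_call,
    query
FROM pg_stat_statements
ORDER BY io_ms_per_call DESC NULLS LAST
LIMIT 5;
```

The same pattern applies to the other rankings: select the expression used in ORDER BY as an extra column so that the size of the gap between entries is visible.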
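Reset statistics
The statistics in pg_stat_statements are cumulative. To view the statistics for a specific time period, you need to query snapshot information. For details, see 《PostgreSQL AWR报告(for 阿里云ApsaraDB PgSQL)》.
You can also run the following command to periodically clear historical statistics.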
    SELECT pg_stat_statements_reset();
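As an alternative to resetting, the snapshot idea mentioned above can be approximated by hand. The sketch below is not the AWR tooling referenced in this topic. It assumes a hypothetical table named pgss_snapshot, exactly two snapshots taken in separate transactions, and no reset between them.

```sql
-- Hypothetical snapshot table; the name is an assumption.
-- First snapshot, taken at the start of the interval.
CREATE TABLE pgss_snapshot AS
SELECT now() AS snap_time, * FROM pg_stat_statements;

-- Second snapshot, taken later at the end of the interval.
INSERT INTO pgss_snapshot
SELECT now(), * FROM pg_stat_statements;

-- Top 5 statements by time spent between the two snapshots.
WITH t AS (
    SELECT min(snap_time) AS t_start,
           max(snap_time) AS t_end
    FROM pgss_snapshot
)
SELECT e.userid::regrole                        AS role_name,
       e.dbid,
       e.calls - COALESCE(s.calls, 0)           AS calls_delta,
       e.total_time - COALESCE(s.total_time, 0) AS total_ms_delta,
       e.query
FROM t
JOIN pgss_snapshot e
  ON e.snap_time = t.t_end
-- Statements first seen after the start snapshot have no match and count from zero.
LEFT JOIN pgss_snapshot s
  ON s.snap_time = t.t_start
 AND s.userid  = e.userid
 AND s.dbid    = e.dbid
 AND s.queryid = e.queryid
ORDER BY total_ms_delta DESC
LIMIT 5;
```

Because now() returns the transaction start time, every row of one snapshot shares the same snap_time, which is what lets the query pair rows by snapshot.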