DBSTAT 虚拟表

一、概述

DBSTAT 虚拟表是一个只读的同名虚拟表，它返回有关用于存储 SQLite 数据库内容的磁盘空间量的信息。DBSTAT 虚拟表的示例用例包括sqlite3_analyzer.exe 实用程序和 Fossil 实现的 SQLite 版本控制系统中的表大小饼图。

当使用SQLITE_ENABLE_DBSTAT_VTAB编译时选项构建 SQLite 时，DBSTAT 虚拟表可用于所有数据库连接。

DBSTAT 虚拟表是一个同名的虚拟表，这意味着在使用它之前不需要运行CREATE VIRTUAL TABLE来创建 dbstat 虚拟表的实例。“dbstat”模块名可以像表名一样使用，直接查询dbstat虚拟表。例如：

SELECT * FROM dbstat;

如果需要使用 dbstat 模块的命名虚拟表，那么创建 dbstat 虚拟表实例的推荐方法如下：

CREATE VIRTUAL TABLE temp.stat USING dbstat(main);

注意“温度”。虚拟表名称（“stat”）之前的限定符。此限定符使虚拟表成为临时的 - 仅在当前数据库连接期间存在。这是推荐的方法。

dbstat 的“主要”参数是要为其提供信息的默认架构。默认是“main”，所以在上面的例子中使用“main”是多余的。对于任何特定的查询，可以通过在查询的 FROM 子句中将替代模式指定为虚拟表名称的函数参数来更改模式。（有关更多详细信息，请参阅FROM 子句中对表值函数的进一步讨论。）

DBSTAT 虚拟表的架构如下所示：

CREATE TABLE dbstat(
  name       TEXT,        -- Name of table or index
  path       TEXT,        -- Path to page from root
  pageno     INTEGER,     -- Page number, or page count
  pagetype   TEXT,        -- 'internal', 'leaf', 'overflow', or NULL
  ncell      INTEGER,     -- Cells on page (0 for overflow pages)
  payload    INTEGER,     -- Bytes of payload on this page or btree
  unused     INTEGER,     -- Bytes of unused space on this page or btree
  mx_payload INTEGER,     -- Largest payload size of all cells on this row
  pgoffset   INTEGER,     -- Byte offset of the page in the database file
  pgsize     INTEGER,     -- Size of the page, in bytes
  schema     TEXT HIDDEN, -- Database schema being analyzed
  aggregate  BOOL HIDDEN  -- True to enable aggregate mode
);

DBSTAT 表仅报告数据库文件中 btrees 的内容。分析中省略了空闲列表页面、指针映射页面和锁定页面。

默认情况下，数据库文件的每个 btree 页在 DBSTAT 表中都有一行。每行提供有关该数据库页面的空间利用率的信息。但是，如果隐藏列“aggregate”为 TRUE，则结果将被聚合，并且数据库中的每个 btree 的 DBSTAT 表中都有一行，提供有关整个 btree 的空间利用率的信息。

2.dbstat虚拟表的“路径”列

“路径”列描述了从 btree 结构的根节点到每个页面的路径。根节点本身的“路径”是'/'。当“聚合”为真时，“路径”为空。btree 页面根的最左侧子页面的“路径”是“/000/”。（Btrees 从左到右存储内容，所以左边的页面比右边的页面有更小的键。）根页面的最左边的孩子是'/001'，依此类推，每个兄弟页面由 3 位十六进制值标识。最左边第 451 个兄弟节点的子节点具有诸如“/1c2/000/”、“/1c2/001/”等路径。通过向路径附加“+”字符和六位十六进制值来指定溢出页面到它们链接的单元格。例如，

'/1c2/000+000000'         // First page in overflow chain
'/1c2/000+000001'         // Second page in overflow chain
'/1c2/000+000002'         // Third page in overflow chain

如果使用 BINARY 排序顺序对路径进行排序，则与单元格关联的溢出页面将在排序顺序中比其子页面更早出现：

'/1c2/000/'               // Left-most child of 451st child of root

3.汇总数据

从 SQLite 版本 3.31.0 (2020-01-22) 开始，DBSTAT 表有一个名为“聚合”的新隐藏列，如果将其约束为 TRUE，将导致 DBSTAT 在数据库中为每个 btree 生成一行，而不是一行每页行数。在聚合模式下运行时，“path”、“pagetype”和“pgoffset”列始终为 NULL，“pageno”列保存的是整个 btree 中的页数，而不是对应的页数排。

下表显示了正常模式和聚合模式下 DBSTAT 的（非隐藏）列的含义：

Column Normal meaning Aggregate-mode meaning

name The name of the table or index that is implemented by the btree of the current row
path See description above Always NULL
pageno The page number of the database page for the current row The total number of pages in the btree for the current row
pagetype 'leaf' or 'interior' Always NULL
ncell Number of cells on the current page or btree
payload Bytes of useful payload on the current page or btree
unused Unused bytes of on the current page or btree
mx_payload The largest payload found anywhere in the current page or btree.
pgoffset Byte offset to the start of the page Always NULL
pgsize Total storage space used by the current page or btree.

Column	Normal meaning	Aggregate-mode meaning
name	The name of the table or index that is implemented by the btree of the current row
path	See description above	Always NULL
pageno	The page number of the database page for the current row	The total number of pages in the btree for the current row
pagetype	'leaf' or 'interior'	Always NULL
ncell	Number of cells on the current page or btree
payload	Bytes of useful payload on the current page or btree
unused	Unused bytes of on the current page or btree
mx_payload	The largest payload found anywhere in the current page or btree.
pgoffset	Byte offset to the start of the page	Always NULL
pgsize	Total storage space used by the current page or btree.

4. dbstat 虚拟表的使用示例

要查找用于在架构“aux1”中存储表“xyz”的总页数，请使用以下两个查询之一（第一个是传统方式，第二个显示聚合功能的使用）：

SELECT count(*) FROM dbstat('aux1') WHERE name='xyz';
SELECT pageno FROM dbstat('aux1',1) WHERE name='xyz';

要查看表内容在磁盘上的存储效率，请计算用于保存实际内容的空间量除以已用磁盘空间总量。这个数字越接近 100%，包装效率越高。（在这个例子中，假设“xyz”表在“主”模式中。同样，有两个不同的版本分别显示了 DBSTAT 的使用，分别没有和有新的聚合功能。）

SELECT sum(pgsize-unused)*100.0/sum(pgsize) FROM dbstat WHERE name='xyz';
SELECT (pgsize-unused)*100.0/pgsize FROM dbstat
 WHERE name='xyz' AND aggregate=TRUE;

要查找表的平均扇出，请运行：

SELECT avg(ncell) FROM dbstat WHERE name='xyz' AND pagetype='internal';

当磁盘访问是顺序的时，现代文件系统运行得更快。因此，如果数据库文件的内容在连续页面上，SQLite 将运行得更快。要找出数据库中有多少页是顺序的（从而获得可能有助于确定何时进行VACUUM的度量），请运行如下查询：

CREATE TEMP TABLE s(rowid INTEGER PRIMARY KEY, pageno INT);
INSERT INTO s(pageno) SELECT pageno FROM dbstat ORDER BY path;
SELECT sum(s1.pageno+1==s2.pageno)*1.0/count(*)
  FROM s AS s1, s AS s2
 WHERE s1.rowid+1=s2.rowid;
DROP TABLE s;