You need to enable JavaScript to run this app.
导航
Paimon 系统表
最近更新时间:2025.03.28 15:32:06首次发布时间:2025.03.28 15:32:06
我的收藏
有用
有用
无用
无用

系统表

每个表都有对应的系统表,系统表包含每个表相关的元数据及信息,例如创建的快照和使用的选项。用户可以通过批的方式查询访问系统表。目前,Flink、Spark、Trino 和 StarRocks 支持查询系统表。在某些情况下,表名需要用反引号包裹以避免语法解析冲突,例如三重访问模式:

SELECT * FROM my_catalog.my_db.`my_table$snapshots`;

快照表 (Snapshots Table)

通过快照表可以查询表的快照历史信息,包括快照中发生的记录数。

SELECT * FROM my_table$snapshots;

/*
+--------------+------------+-----------------+-------------------+--------------+-------------------------+--------------------------------+------------------------------- +--------------------------------+---------------------+---------------------+-------------------------+----------------+
|  snapshot_id |  schema_id |     commit_user | commit_identifier |  commit_kind |             commit_time |             base_manifest_list |            delta_manifest_list |        changelog_manifest_list |  total_record_count |  delta_record_count |  changelog_record_count |      watermark |
+--------------+------------+-----------------+-------------------+--------------+-------------------------+--------------------------------+------------------------------- +--------------------------------+---------------------+---------------------+-------------------------+----------------+
|            2 |          0 | 7ca4cd28-98e... |                 2 |       APPEND | 2022-10-26 11:44:15.600 | manifest-list-31323d5f-76e6... | manifest-list-31323d5f-76e6... | manifest-list-31323d5f-76e6... |                   2 |                   2 |                       0 |  1666755855600 |
|            1 |          0 | 870062aa-3e9... |                 1 |       APPEND | 2022-10-26 11:44:15.148 | manifest-list-31593d5f-76e6... | manifest-list-31593d5f-76e6... | manifest-list-31593d5f-76e6... |                   1 |                   1 |                       0 |  1666755855148 |
+--------------+------------+-----------------+-------------------+--------------+-------------------------+--------------------------------+------------------------------- +--------------------------------+---------------------+---------------------+-------------------------+----------------+
2 rows in set
*/

通过查询快照表,可以了解表的提交和过期信息,并通过时间旅行访问历史数据。

模式表 (Schemas Table)

通过模式表可以查询表的历史模式结构。

SELECT * FROM my_table$schemas;

/*
+-----------+--------------------------------+----------------+--------------+---------+---------+-------------------------+
| schema_id |                         fields | partition_keys | primary_keys | options | comment |       update_time       |
+-----------+--------------------------------+----------------+--------------+---------+---------+-------------------------+
|         0 | [{"id":0,"name":"word","typ... |             [] |     ["word"] |      {} |         | 2022-10-28 11:44:20.600 |
|         1 | [{"id":0,"name":"word","typ... |             [] |     ["word"] |      {} |         | 2022-10-27 11:44:15.600 |
|         2 | [{"id":0,"name":"word","typ... |             [] |     ["word"] |      {} |         | 2022-10-26 11:44:10.600 |
+-----------+--------------------------------+----------------+--------------+---------+---------+-------------------------+
3 rows in set
*/

可以通过联接快照表和模式表,获取指定快照的字段信息。

SELECT s.snapshot_id, t.schema_id, t.fields 
    FROM my_table$snapshots s JOIN my_table$schemas t 
    ON s.schema_id=t.schema_id where s.snapshot_id=100;

选项表 (Options Table)

通过选项表可以查询表在DDL中指定的选项信息。未显示的选项将使用默认值。

SELECT * FROM my_table$options;

/*
+------------------------+--------------------+
|         key            |        value       |
+------------------------+--------------------+
| snapshot.time-retained |         5 h        |
+------------------------+--------------------+
1 rows in set
*/

审计日志表 (Audit Log Table)

如需审计表的变更日志,可使用audit_log系统表。通过该表,可以获取增量数据的rowkind列,用于过滤等操作以完成审计。

rowkind有四种取值:

  • +I:插入操作
  • -U:更新操作(旧值)
  • +U:更新操作(新值)
  • -D:删除操作
SELECT * FROM my_table$audit_log;

/*
+------------------+-----------------+-----------------+
|     rowkind      |     column_0    |     column_1    |
+------------------+-----------------+-----------------+
|        +I        |      ...        |      ...        |
+------------------+-----------------+-----------------+
|        -U        |      ...        |      ...        |
+------------------+-----------------+-----------------+
|        +U        |      ...        |      ...        |
+------------------+-----------------+-----------------+
3 rows in set
*/

读优化表 (Read-optimized Table)

若需要极致的读取性能且可接受稍旧数据,可使用ro(读优化)系统表。该表通过仅扫描无需合并的文件提升性能。

对于主键表,ro系统表仅扫描最顶层文件(即最新 Full Compaction 结果)。注意不同分桶可能在不同时间进行 Full Compaction,因此不同键的值可能来自不同快照。

对于追加表,由于所有文件都无需合并,ro系统表行为与普通追加表一致。

SELECT * FROM my_table$ro;

文件表 (Files Table)

可查询表在特定快照下的文件信息。

-- 查询最新快照的文件
SELECT * FROM my_table$files;

/*
+-----------+--------+--------------------------------+-------------+-----------+-------+--------------+--------------------+---------+---------+------------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------------+
| partition | bucket |                      file_path | file_format | schema_id | level | record_count | file_size_in_bytes | min_key | max_key |      null_value_counts |         min_value_stats |         max_value_stats | min_sequence_number | max_sequence_number |         creation_time |
+-----------+--------+--------------------------------+-------------+-----------+-------+--------------+--------------------+---------+---------+------------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------------+
|       [3] |      0 | data-8f64af95-29cc-4342-adc... |         orc |         0 |     0 |            1 |                593 |     [c] |     [c] | {cnt=0, val=0, word=0} | {cnt=3, val=33, word=c} | {cnt=3, val=33, word=c} |       1691551246234 |       1691551246637 |2023-02-24T16:06:21.166|
|       [2] |      0 | data-8b369068-0d37-4011-aa5... |         orc |         0 |     0 |            1 |                593 |     [b] |     [b] | {cnt=0, val=0, word=0} | {cnt=2, val=22, word=b} | {cnt=2, val=22, word=b} |       1691551246233 |       1691551246732 |2023-02-24T16:06:21.166|
|       [2] |      0 | data-83aa7973-060b-40b6-8c8... |         orc |         0 |     0 |            1 |                605 |     [d] |     [d] | {cnt=0, val=0, word=0} | {cnt=2, val=32, word=d} | {cnt=2, val=32, word=d} |       1691551246267 |       1691551246798 |2023-02-24T16:06:21.166|
|       [5] |      0 | data-3d304f4a-bcea-44dc-a13... |         orc |         0 |     0 |            1 |                593 |     [c] |     [c] | {cnt=0, val=0, word=0} | {cnt=5, val=51, word=c} | {cnt=5, val=51, word=c} |       1691551246788 |       1691551246152 |2023-02-24T16:06:21.166|
|       [1] |      0 | data-10abb5bc-0170-43ae-b6a... |         orc |         0 |     0 |            1 |                595 |     [a] |     [a] | {cnt=0, val=0, word=0} | {cnt=1, val=11, word=a} | {cnt=1, val=11, word=a} |       1691551246722 |       1691551246273 |2023-02-24T16:06:21.166|
|       [4] |      0 | data-2c9b7095-65b7-4013-a7a... |         orc |         0 |     0 |            1 |                593 |     [a] |     [a] | {cnt=0, val=0, word=0} | {cnt=4, val=12, word=a} | {cnt=4, val=12, word=a} |       1691551246321 |       1691551246109 |2023-02-24T16:06:21.166|
+-----------+--------+--------------------------------+-------------+-----------+-------+--------------+--------------------+---------+---------+------------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------------+
6 rows in set
*/

-- 查询指定快照的文件
SELECT * FROM my_table$files /*+ OPTIONS('scan.snapshot-id'='1') */;

/*
+-----------+--------+--------------------------------+-------------+-----------+-------+--------------+--------------------+---------+---------+------------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------------+
| partition | bucket |                      file_path | file_format | schema_id | level | record_count | file_size_in_bytes | min_key | max_key |      null_value_counts |         min_value_stats |         max_value_stats | min_sequence_number | max_sequence_number |         creation_time |
+-----------+--------+--------------------------------+-------------+-----------+-------+--------------+--------------------+---------+---------+------------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------------+
|       [3] |      0 | data-8f64af95-29cc-4342-adc... |         orc |         0 |     0 |            1 |                593 |     [c] |     [c] | {cnt=0, val=0, word=0} | {cnt=3, val=33, word=c} | {cnt=3, val=33, word=c} |       1691551246234 |       1691551246637 |2023-02-24T16:06:21.166|
|       [2] |      0 | data-8b369068-0d37-4011-aa5... |         orc |         0 |     0 |            1 |                593 |     [b] |     [b] | {cnt=0, val=0, word=0} | {cnt=2, val=22, word=b} | {cnt=2, val=22, word=b} |       1691551246233 |       1691551246732 |2023-02-24T16:06:21.166|
|       [1] |      0 | data-10abb5bc-0170-43ae-b6a... |         orc |         0 |     0 |            1 |                595 |     [a] |     [a] | {cnt=0, val=0, word=0} | {cnt=1, val=11, word=a} | {cnt=1, val=11, word=a} |       1691551246267 |       1691551246798 |2023-02-24T16:06:21.166|
+-----------+--------+--------------------------------+-------------+-----------+-------+--------------+--------------------+---------+---------+------------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------------+
3 rows in set
*/

标签表 (Tags Table)

通过标签表可以查询表的标签历史信息,包括标签基于的快照和快照的部分历史信息。您还可以获取所有标签名称并按名称时间旅行到特定标签数据。

SELECT * FROM my_table$tags;

/*
+----------+-------------+-----------+-------------------------+--------------+--------------+
| tag_name | snapshot_id | schema_id |             commit_time | record_count |   branches   |
+----------+-------------+-----------+-------------------------+--------------+--------------+
|     tag1 |           1 |         0 | 2023-06-28 14:55:29.344 |            3 |      []      |
|     tag3 |           3 |         0 | 2023-06-28 14:58:24.691 |            7 |  [branch-1]  |
+----------+-------------+-----------+-------------------------+--------------+--------------+
2 rows in set
*/

消费者表 (Consumers Table)

可查询包含下一个快照ID的所有消费者。

SELECT * FROM my_table$consumers;

/*
+-------------+------------------+
| consumer_id | next_snapshot_id |
+-------------+------------------+
|         id1 |                1 |
|         id2 |                3 |
+-------------+------------------+
2 rows in set
*/

清单表 (Manifests Table)

可查询表的最新快照或指定快照中包含的所有清单文件。

-- 查询最新快照的清单
SELECT * FROM my_table$manifests;

/*
+--------------------------------+-------------+------------------+-------------------+---------------+
|                      file_name |   file_size |  num_added_files | num_deleted_files |     schema_id |
+--------------------------------+-------------+------------------+-------------------+---------------+
| manifest-f4dcab43-ef6b-4713... |        12365|               40 |                 0 |             0 |
| manifest-f4dcab43-ef6b-4713... |        1648 |                1 |                 0 |             0 |
+--------------------------------+-------------+------------------+-------------------+---------------+
2 rows in set
*/

-- 查询指定快照的清单
SELECT * FROM my_table$manifests /*+ OPTIONS('scan.snapshot-id'='1') */;
/*
+--------------------------------+-------------+------------------+-------------------+---------------+
|                      file_name |   file_size |  num_added_files | num_deleted_files |     schema_id |
+--------------------------------+-------------+------------------+-------------------+---------------+
| manifest-f4dcab43-ef6b-4713... |        12365|               40 |                 0 |             0 |
+--------------------------------+-------------+------------------+-------------------+---------------+
1 rows in set
*/

聚合字段表 (Aggregation Fields Table)

通过聚合字段表可以查询表的历史聚合信息。

SELECT * FROM my_table$aggregation_fields;

/*
+------------+-----------------+--------------+--------------------------------+---------+
| field_name |      field_type |    function  |               function_options | comment |
+------------+-----------------+--------------+--------------------------------+---------+
| product_id | BIGINT NOT NULL |           [] |                             [] |  <NULL> |
|      price |             INT | [true,count] | [fields.price.ignore-retrac... |  <NULL> |
|      sales |          BIGINT |        [sum] | [fields.sales.aggregate-fun... |  <NULL> |
+------------+-----------------+--------------+--------------------------------+---------+
3 rows in set
*/

分区表 (Partitions Table)

可查询表的分区文件信息。

SELECT * FROM my_table$partitions;

/*
+---------------+----------------+--------------------+--------------------+------------------------+
|  partition    |   record_count |  file_size_in_bytes|          file_count|        last_update_time|
+---------------+----------------+--------------------+--------------------+------------------------+
|  [1]          |           1    |             645    |                1   | 2024-06-24 10:25:57.400|
+---------------+----------------+--------------------+--------------------+------------------------+
*/

全局系统表

全局系统表包含Paimon中所有表的统计信息。为了方便查询,我们创建了一个名为sys的系统数据库。在Flink中可通过以下SQL显示所有全局系统表:

USE sys;
SHOW TABLES;

所有选项表 (ALL Options Table)

该表类似于选项表,但会显示所有数据库中所有表的选项。

SELECT * FROM sys.all_table_options;

/*
+---------------+--------------------------------+--------------------------------+------------------+
| database_name |                     table_name |                            key |            value |
+---------------+--------------------------------+--------------------------------+------------------+
|         my_db |                    Orders_orc  |                         bucket |               -1 |
|         my_db |                        Orders2 |                         bucket |               -1 |
|         my_db |                        Orders2 |               sink.parallelism |                7 |
|         my_db2|                      OrdersSum |                         bucket |                1 |
+---------------+--------------------------------+--------------------------------+------------------+
7 rows in set
*/

目录选项表 (Catalog Options Table)

通过目录选项表可查询目录的选项信息。未显示的选项将使用默认值。

SELECT * FROM sys.catalog_options;

/*
+-----------+---------------------------+
|       key |                     value |
+-----------+---------------------------+
| warehouse | hdfs:///path/to/warehouse |
+-----------+---------------------------+
1 rows in set
*/