Journal of Jilin University (Information Science Edition), 2026, Vol. 44, Issue (2): 421-426.


A Parallel Query Algorithm of Large Scale Data Based on PAT Algebra

SUN Yexin, XIA Chao

  1. School of Mathematics and Statistics, Changchun University of Technology, Changchun 130062, China
  • Received: 2024-02-22 Online: 2026-04-14 Published: 2026-04-15
  • About the authors: SUN Yexin (born 1995), female, native of Changchun, master's student at Changchun University of Technology, working on algebra, max-plus algebra, and discrete mathematics, (Tel) 86-15604470020, (E-mail) Sunx5745@163.com; XIA Chao (born 1988), male, native of Changchun, lecturer at Changchun University of Technology, Ph.D., working on algebra and discrete mathematics, (Tel) 86-13578709686, (E-mail) xiachao@ceutelu.cn.
  • Supported by:
    Science Research Foundation of the Education Department of Jilin Province (JJKH20230745KJ)

Abstract:

Using a single feature as the query basis, without accounting for the feature differences among large-scale data, leads to large query errors. To address this problem, a parallel query algorithm for large-scale data based on PAT (Pump Algebra Tutor) algebra is proposed. PAT algebra is used to optimize the semantics and logic of the parallel data. Initial sequence blocks are set for the large-scale parallel data and the density of each data block is computed; node density in a directed graph is then adjusted according to the block density so that low-weight points are filtered out, which achieves effective filtering. On this basis, the strategy of minimizing the product of subqueries is used to determine the sequence point where the target data is located. A greedy rule then searches the neighborhood set for clause sets that satisfy the conditions, and query connections are established to achieve efficient parallel data query. Experimental results show that the proposed method attains high data transmission volume and high query volume, indicating that it can query large-scale data accurately and has practical value.
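The block-density filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not define density, the filtering threshold, or the query form, so everything below is an assumption made for illustration: density is taken as records per unit of key range covered by a block, `min_density` is a hypothetical threshold, and the query is a simple range query over integer keys. The PAT algebra optimization, the directed graph, and the minimum-product subquery strategy are not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Cut an ordered record list into consecutive sequence blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def block_density(block):
    """Assumed density: number of records per unit of key range in the block."""
    span = block[-1] - block[0] + 1
    return len(block) / span


def filter_blocks(blocks, min_density):
    """Drop blocks whose density is below the (hypothetical) threshold."""
    return [b for b in blocks if block_density(b) >= min_density]


def query_block(block, lo, hi):
    """Return the records of one block that fall inside [lo, hi]."""
    return [x for x in block if lo <= x <= hi]


def parallel_range_query(records, lo, hi, block_size=4, min_density=0.5, workers=4):
    """Filter sparse blocks, then query the remaining blocks in parallel."""
    blocks = filter_blocks(split_into_blocks(sorted(records), block_size), min_density)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the block order, so results stay sorted by key.
        parts = list(pool.map(lambda b: query_block(b, lo, hi), blocks))
    return [x for part in parts for x in part]
```

With these assumed definitions, a sparse block such as `[10, 20, 30, 40]` (density 4/31) is discarded before any query runs, so the parallel workers only scan the dense blocks. Whether discarding sparse blocks is acceptable depends on how weight and density are defined in the full paper.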



CLC number:

  • TP391