 |
| [题目] | 双数组Trie树算法优化及其应用研究
| |
| [英文题目] | Research of Optimization on Double-Array Trie and its Application
| |
| [作者] | 王思力; 张华平; 王斌;
| |
| [英文名] | WANG Si-li~1; 2; ZHANG Hua-ping~1; WANG Bin~1 (Institution of Computing Technology; The Chinese Academy of Sciences; Beijing 100080; China; Graduate School of the Chinese Academy of Sciences; China);
| |
| [关键字] | 计算机应用; 中文信息处理; 双数组; Trie树; 词典; 分词;
| |
| [英文关键字] | computer application; Chinese information processing; Double-Array; TRIE; lexicon; word segmentation;
| |
| [摘要] | 本文对双数组Trie树(Doub le-Array Trie)算法提出了一种优化策略,即在采用Trie树构造数组的过程中,优先处理分支结点数更多的结点。这种优化策略可以在保证该算法数据查找效率不变的同时,进一步减少数据稀疏,提高空间利用率。我们基于该优化算法实现了一个词典管理程序,并与利用其他索引机制的词典进行了实验对比。实验结果表明,利用优化的双数组Trie树算法的词典不仅在查询速度上优于用其他索引机制的词典,而且存储数据的空间占用也比较小。
| |
| [英文摘要] | This paper proposes an improved strategy for the algorithm of Double-Array Trie that is,the node with most child nodes is praessed firstly when constructing the array.This strategy can reduce the data sparseness and keep the search efficiency meanwhile.We implement a program for lexicon management base on the improved Double-Array Trie and compare it with other index mechanisms.The results clearly show that the improved Double-Array-Trie algorithm has a much higher search speed and needs a smaller space for...
| |
| [期刊] | 2006年第5期 |
|