Chinese Information Processing Society of China (中国中文信息学会)
Journal of Chinese Information Processing (中文信息学报)
Browse by year and issue (latest data: 1999, issue 3)
Chinese Search Engines: Current Status and Prospects (中文搜索引擎现状与展望)
[Authors] Du Yuncheng (都云程); Lu Xianhua (卢献华);

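[Abstract, translated from the Chinese] This paper describes the current state of development of Chinese search engines, analyzes the problems they face and the gap between them and leading foreign search engines, and proposes directions for their future development.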

[Abstract] This article introduces the current status of Chinese search engines, analyzes the problems involved with them, and compares them with outstanding search engines from other countries. It also discusses the techniques needed to improve Chinese search engines and the direction of their development.
[Keywords] Chinese search engines; full-text retrieval; automatic Chinese word segmentation; relevance ranking;



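A Bilingual Information Filtering Algorithm Based on Singular Value Decomposition (一种基于奇异值分解的双语信息过滤算法)
[Authors] Lu Haiming (路海明); Xu Jinhui (徐晋晖); Lu Zengxiang (卢增祥); Li Yanda (李衍达);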

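[Abstract, translated from the Chinese] This paper proposes a bilingual information filtering [2] algorithm based on SVD (singular value decomposition) [1]. It gives bilingual documents a unified representation, so that algorithms designed for monolingual filtering can readily be applied to bilingual filtering; it also compresses the document vectors and filters out noise. As an application, the bilingual filtering algorithm is used for personalized, proactive information filtering on the Internet.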

[Abstract] This paper introduces an SVD method for bilingual information filtering. It gives a uniform representation to bilingual documents, so that any algorithm used in monolingual information filtering can easily be used in bilingual information filtering. Using this method, we can compress the document vectors and filter out noise. The method is used for personalized information filtering on the Internet. We provide the WWW Bookmark Service. Through a user's bookmarks, we can learn the user's preferences and recommend interesting ...
[Keywords] bilingual information filtering; SVD; Internet; Bookmark service;
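The abstract above does not spell out how the unified bilingual representation is built. One plausible reading, in the style of cross-language latent semantic indexing, is sketched below under stated assumptions: training documents contain both language versions of a text, a truncated SVD of the term-by-document matrix gives a shared low-dimensional space, and monolingual documents are projected into it. The vocabulary, the training documents, and every function name here are hypothetical, and the SVD is computed with a simple power iteration to keep the example dependency-free.

```python
import math

# Hypothetical toy vocabulary: four English terms and their Chinese counterparts.
VOCAB = ["computer", "network", "music", "song", "计算机", "网络", "音乐", "歌曲"]

# Hypothetical training documents, each holding both language versions of one text.
TRAINING = [
    ["computer", "network", "计算机", "网络"],
    ["computer", "计算机", "网络"],
    ["music", "song", "音乐", "歌曲"],
    ["music", "音乐", "歌曲"],
    ["music", "song", "音乐", "歌曲"],
]


def vectorize(words):
    """Term-count vector over VOCAB (terms outside VOCAB are ignored)."""
    return [float(words.count(term)) for term in VOCAB]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def cosine(x, y):
    denom = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    return dot(x, y) / denom if denom else 0.0


def top_left_singular_vectors(docs, k, iterations=200):
    """Leading k left singular vectors of the term-by-document matrix A.

    Uses power iteration on A A^T with Gram-Schmidt deflation and assumes
    that k does not exceed the rank of A.
    """
    columns = [vectorize(doc) for doc in docs]  # one column of A per document
    basis = []
    for _ in range(k):
        v = [1.0] * len(VOCAB)
        for _ in range(iterations):
            weights = [dot(col, v) for col in columns]  # A^T v
            v = [sum(w * col[i] for w, col in zip(weights, columns))
                 for i in range(len(VOCAB))]            # A (A^T v)
            for u in basis:                             # remove directions already found
                c = dot(u, v)
                v = [vi - c * ui for vi, ui in zip(v, u)]
            length = math.sqrt(dot(v, v))
            v = [vi / length for vi in v]
        basis.append(v)
    return basis


def project(basis, vector):
    """Compressed representation: coordinates of a document in the latent space."""
    return [dot(u, vector) for u in basis]


BASIS = top_left_singular_vectors(TRAINING, k=2)
english_doc = vectorize(["computer", "network"])
chinese_doc = vectorize(["计算机", "网络"])
raw_similarity = cosine(english_doc, chinese_doc)
latent_similarity = cosine(project(BASIS, english_doc), project(BASIS, chinese_doc))
```

In the raw term space the English and Chinese documents share no terms, so their cosine similarity is zero. After projection they land on the same latent direction, which is what lets a monolingual filtering algorithm compare them.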



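An Analysis of Sentence-Construction Error Types in the Diagnosis Module of an ICAI System for Primary School Chinese (小学语文ICAI系统诊断模块中的造句错误类型分析)
[Authors] Yang Kaicheng (杨开城);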

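[Abstract, translated from the Chinese] Starting from syntactic analysis, this paper comprehensively analyzes how erroneous text manifests itself technically, extends the concept of word-formation rules to meet the needs of information processing, proposes that word collocation errors be classified as syntactic errors, and attempts to give a precise description of the vague notion of "semantic errors". Based on a detailed analysis of the characteristics of each error type, the paper proposes corresponding diagnostic strategies.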

[Abstract] In this paper the author analyzes the technical manifestations of erroneous sentences from the point of view of syntactic analysis and extends the concept of word-building rules to meet the needs of information processing. The author, trying to describe semantic errors clearly and in detail, believes that collocation errors should be treated as syntactic errors.
[Keywords] word-formation rules; collocation; semantics;



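Recognition of Similar Handwritten Chinese Characters Based on Combining Statistical and Neural Network Methods (基于统计与神经元方法相结合的手写体相似字识别)
[Authors] Zhang Dexi (张德喜); Ma Shaoping (马少平); Zhu Shaowen (朱绍文); Jin Yijiang (金奕江);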

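[Abstract, translated from the Chinese] This paper proposes a method for recognizing similar handwritten Chinese characters that combines statistical recognition with an artificial neural network. The method makes full use of the strengths of both approaches; it not only markedly improves the recognition rate for similar characters but also effectively improves overall system performance. The recognition rate for similar characters rises from 79.02% to 84.32%, an improvement of about five percentage points, and the overall recognition rate rises by 1.3 percentage points.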

[Abstract] This paper presents a method for recognizing similar handwritten Chinese characters that combines a statistical recognition method with artificial neural networks (ANN). The method makes efficient use of the strengths of both approaches. It not only increases the recognition rate for similar Chinese characters but also improves the performance of the system. The recognition rate for similar Chinese characters improves from 79.02% to 84.32%, a gain of about 5 percentage points; the ...
[Keywords] neural networks; Chinese character recognition; similar character recognition;
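The abstract does not say how the statistical recognizer and the neural network are coupled. A common arrangement, assumed here purely for illustration, is a cascade: the statistical recognizer proposes a character, and only when that character belongs to a known group of look-alikes is the decision handed to a network trained on that group. The feature layout, the two toy recognizers, and the group below are all hypothetical stand-ins.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def recognize(features: Features,
              statistical: Classifier,
              similar_groups: Dict[FrozenSet[str], Classifier]) -> str:
    """Cascade: coarse statistical decision, refined by a per-group network."""
    candidate = statistical(features)
    for group, network in similar_groups.items():
        if candidate in group:
            # The coarse result is one of several look-alikes, so let the
            # specialised network choose among the members of the group.
            return network(features)
    return candidate


# Hypothetical stand-ins for trained recognizers over made-up two-value features.
def toy_statistical(features: Features) -> str:
    return "己" if features[0] < 0.5 else "大"


def toy_network(features: Features) -> str:
    return "已" if features[1] > 0.5 else "己"


# One hypothetical group of visually similar characters.
GROUPS = {frozenset("己已巳"): toy_network}
```

Under this arrangement the network is consulted only for characters inside a similar group, which is one way a gain on similar characters could carry over to a smaller gain in the overall rate.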



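A Preliminary Study of a Pronunciation Evaluation System for Computer-Aided Chinese Teaching (计算机辅助汉语教学系统中语音评价体系初探)
[Authors] Guo Qiao (郭巧); Lu Jilian (陆际联);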

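[Abstract, translated from the Chinese] This paper studies the composition and implementation of a pronunciation evaluation system within a computer-aided Chinese teaching system. Using a database of standard Putonghua teaching speech and a speaker-independent, large-vocabulary corpus of standard Putonghua, we build a library of feature templates for standard Putonghua teaching sentences. A Kohonen self-organizing neural network classifies and recognizes the learner's speech signal, and the evaluation system for Chinese pronunciation teaching then processes the result to produce a quantitative evaluation. We give a preliminary overall framework for the pronunciation evaluation system and a method for implementing it. Experiments verify that the design is reasonable and feasible, and that it can largely meet the need for online evaluation of students' pronunciation learning in a computer-aided Chinese teaching system.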

[Abstract] This paper studies the framework and methods of a system for evaluating learners' pronunciation in a computer-aided Chinese teaching program. We build a feature database for the standard Putonghua teaching program and use a Kohonen self-organizing neural network for pattern recognition. Quantitative evaluation results for the learner's pronunciation are obtained after processing by the evaluation system. We propose the framework and the methods of the evaluation system for the learner's pron...
[Keywords] speech signal processing; speech signal evaluation; computer-aided instruction;



Word Processing in Cantonese-Putonghua Machine Translation (粤-普机器翻译中的词处理)
[Authors] Zhang Xiaoheng (张小衡);

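[Abstract, translated from the Chinese] Research on machine translation between Cantonese and Putonghua should first address written translation from Cantonese to Putonghua, with individual words as the point of entry. This paper focuses on word processing in Cantonese-to-Putonghua written machine translation, especially the handling of dialect words, which covers both recognizing dialect words and translating them. It also introduces an experimental word-level Cantonese-to-Putonghua machine translation system that has been preliminarily implemented. The paper closes with conclusions and discussion.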

[Abstract] Speech-to-speech MT (machine translation) between Cantonese and Putonghua is quite a demanding job. It makes sense to deal with Cantonese-to-Putonghua text MT first. The text MT program is an important component of a Cantonese-to-Putonghua speech MT system. In addition, it can independently play a helpful role in language communication and serve as a CALL (Computer-Assisted Language Learning) tool for standard Chinese writing. The critical task in Cantonese-to-Putonghua text MT is word processing, e...
[Keywords] machine translation; Cantonese and Putonghua; word processing;
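The two word-level steps named in the abstract, recognizing dialect words and translating them, can be illustrated with a minimal sketch. It assumes a small dialect lexicon and forward maximum matching, neither of which is confirmed by the abstract as the experimental system's design; the lexicon entries and the function name are hypothetical.

```python
# Hypothetical toy lexicon: Cantonese dialect word -> Putonghua equivalent.
DIALECT_LEXICON = {
    "佢": "他",
    "唔": "不",
    "係": "是",
    "嘅": "的",
    "冇": "没有",
    "睇": "看",
    "乜嘢": "什么",
}


def translate_words(text, lexicon=DIALECT_LEXICON):
    """Replace dialect words found by forward maximum matching.

    At each position the longest lexicon entry that matches is taken as a
    recognized dialect word and replaced; any other character is copied as-is.
    """
    max_len = max(len(word) for word in lexicon)
    output = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in lexicon:
                output.append(lexicon[piece])
                i += size
                break
        else:
            # No dialect word starts here: keep the character unchanged.
            output.append(text[i])
            i += 1
    return "".join(output)
```

A call such as `translate_words("佢唔係工人")` replaces the three dialect words and leaves 工人 untouched. A real system would also need to handle words whose dialect reading depends on context, which a plain lexicon lookup cannot do.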



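A Self-Organizing Method for Chinese Word Sense Disambiguation (一种自组织的汉语词义排歧方法)
[Authors] Li Juanzi (李涓子); Huang Changning (黄昌宁); Yang Erhong (杨尔弘);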

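[Abstract, translated from the Chinese] Word sense disambiguation has long been regarded as one of the hard problems of natural language processing. This paper takes the collocation examples provided by the machine-readable dictionary 《现代汉语辞海》 as the initial collocation knowledge for polysemous words and uses suitable statistical and self-organizing methods to enlarge the collocation set automatically. To safeguard learning quality, the length of the context window is increased gradually during learning. The paper proposes a multivariate maximum log-likelihood-ratio word sense disambiguation algorithm that uses a collocation statistics table. Finally, the proposed method is tested experimentally, and the experiments show that the algorithm achieves fairly high accuracy.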

[Abstract] Word sense disambiguation has long been a difficult problem in natural language processing. This paper presents a method for automatically acquiring new collocations, starting from the collocations provided by a machine-readable dictionary, XianDaiHanYuCiHai. To assure learning quality, the size of the context is enlarged gradually. For both learning and word sense disambiguation, the authors give a multivariate maximum log-likelihood-ratio word sense disambiguation algorithm. At last, the method wa...
[Keywords] natural language processing; word sense disambiguation; self-organizing method; collocation;
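The abstract names a collocation statistics table and a maximum log-likelihood-ratio decision but gives no formula. The sketch below is one simple reading, not the authors' algorithm: each sense is scored by summing, over the context words, the log of the smoothed ratio between how often a word collocates with that sense and how often it collocates with the other senses, and the highest-scoring sense wins. The table contents, the smoothing constant, and the function names are assumptions.

```python
import math

# Hypothetical collocation statistics table for the polysemous word 杜鹃
# ("cuckoo" or "azalea"): sense -> {collocate: count}.
COLLOCATIONS = {
    "cuckoo": {"叫": 5, "飞": 4, "树": 1},
    "azalea": {"开": 6, "红": 3, "树": 1},
}


def context_window(tokens, position, size):
    """Words within `size` positions of the target word, excluding the target."""
    return tokens[max(0, position - size):position] + tokens[position + 1:position + 1 + size]


def llr_scores(context, table=COLLOCATIONS, smoothing=0.5):
    """Sum of smoothed log-likelihood ratios per sense over the context words."""
    vocab = {word for counts in table.values() for word in counts}
    scores = {}
    for sense, counts in table.items():
        sense_total = sum(counts.values())
        rest_total = sum(sum(c.values()) for s, c in table.items() if s != sense)
        score = 0.0
        for word in context:
            if word not in vocab:
                continue  # words never seen as collocates carry no evidence
            in_sense = counts.get(word, 0)
            in_rest = sum(c.get(word, 0) for s, c in table.items() if s != sense)
            p_sense = (in_sense + smoothing) / (sense_total + smoothing * len(vocab))
            p_rest = (in_rest + smoothing) / (rest_total + smoothing * len(vocab))
            score += math.log(p_sense / p_rest)
        scores[sense] = score
    return scores


def disambiguate(context, table=COLLOCATIONS):
    """Pick the sense with the largest log-likelihood-ratio score."""
    scores = llr_scores(context, table)
    return max(scores, key=scores.get)
```

Calling `context_window` with a growing `size` mirrors the gradual widening of the context described in the abstract. The step that adds newly learned collocations back into the table is left out of this sketch.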



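Types and Distribution Statistics of Boundary Ambiguity in Chinese Phrase Structure (汉语短语结构定界歧义类型分析及分布统计)
[Authors] Zhan Weidong (詹卫东); Chang Baobao (常宝宝); Yu Shiwen (俞士汶);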

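[Abstract, translated from the Chinese] This paper surveys boundary ambiguity in Chinese phrase structure comprehensively. It examines the different types of phrase-structure boundary ambiguity from three angles: the components of the ambiguous pattern, the effect the ambiguity has outside the phrase, and the correspondence between pattern ambiguity and instance ambiguity. It also gives preliminary statistics on the different types. The aim is to clarify further the phrase-boundary ambiguity problems that computers encounter when processing Chinese, for the reference of theoretical researchers and application system developers.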

[Abstract] This paper analyzes the ambiguity that arises in determining the boundaries of Chinese phrases during automatic parsing by computer. The types of ambiguity can be classified from three different perspectives. Viewed by the components of ambiguous structures, ambiguous phrases fall into two kinds: those that include terminal symbols, and those that include only non-terminal symbols. Viewed by the influence of the ambiguity, ambiguous phrases can also be classified into two k...
[Keywords] phrase; phrase boundary ambiguity; natural language processing;



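A Transformation-Based Model for Recognizing Chinese Base Noun Phrases (基于转换的汉语基本名词短语识别模型)
[Authors] Zhao Jun (赵军); Huang Changning (黄昌宁);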

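[Abstract, translated from the Chinese] Recognizing base noun phrases plays an important role in natural language information processing. This paper first proposes a concept of the Chinese base noun phrase from a linguistic point of view. Then, from the standpoint of language information processing, it divides the knowledge used for base noun phrase recognition into two parts: basic structure templates that describe the syntactic composition of base noun phrases (static knowledge), and transformation rules that describe the features of the contexts in which base noun phrases occur (dynamic knowledge). On this basis it designs a transformation-based model for base noun phrase recognition that can combine both kinds of knowledge. Experimental results show fairly high recognition accuracy.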

[Abstract] Recognizing baseNPs is important in natural language processing. First, the paper defines the Chinese baseNP from a linguistic standpoint. Then the knowledge essential for baseNP recognition is analyzed from the standpoint of automatic language information processing. The recognition knowledge includes the basic construction templates, which specify the syntactic composition of baseNPs (static knowledge), and the context-sensitive transformation rules (dynamic knowledge), which ref...
[Keywords] natural language processing; knowledge acquisition; corpus; noun phrase;
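The split between static templates and dynamic transformation rules can be sketched as a two-pass labeler. This is a minimal illustration under assumptions: the part-of-speech tags (n noun, v verb, a adjective, r pronoun), the templates, and the single rule are hypothetical examples, not the templates or rules of the paper. The first pass marks every longest template match as a baseNP using B/I/O labels; the second pass applies context-sensitive rules that correct labels.

```python
# Hypothetical structure templates: POS sequences that may form a baseNP.
TEMPLATES = [("a", "n"), ("v", "n"), ("n", "n"), ("n",)]

# Hypothetical transformation rules:
# (POS of current word, POS of previous word, old label, new label).
RULES = [
    # A verb right after a pronoun is taken as the predicate, not a modifier.
    ("v", "r", "B", "O"),
]


def apply_templates(pos_tags, templates=TEMPLATES):
    """Static pass: label the longest template match at each position as B/I."""
    labels = ["O"] * len(pos_tags)
    ordered = sorted(templates, key=len, reverse=True)
    i = 0
    while i < len(pos_tags):
        for template in ordered:
            if tuple(pos_tags[i:i + len(template)]) == template:
                labels[i] = "B"
                for j in range(i + 1, i + len(template)):
                    labels[j] = "I"
                i += len(template)
                break
        else:
            i += 1
    return labels


def apply_rules(pos_tags, labels, rules=RULES):
    """Dynamic pass: relabel words whose left context triggers a rule."""
    labels = list(labels)
    for pos, prev_pos, old, new in rules:
        for i in range(1, len(labels)):
            if pos_tags[i] == pos and pos_tags[i - 1] == prev_pos and labels[i] == old:
                labels[i] = new
                # A word that continued the removed chunk now starts its own.
                if new == "O" and i + 1 < len(labels) and labels[i + 1] == "I":
                    labels[i + 1] = "B"
    return labels


def recognize_base_nps(pos_tags):
    return apply_rules(pos_tags, apply_templates(pos_tags))
```

For the tag sequence of 学习/v 方法/n ("learning methods") the template pass alone yields one baseNP. For 他们/r 学习/v 方法/n ("they study methods") the rule strips the verb, so only 方法 remains a baseNP.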



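The Frequency-Rank Relationship of Language Units in Computational Language Models of Modern Chinese (现代汉语计算语言模型中语言单位的频度—频级关系)
[Authors] Guan Yi (关毅); Wang Xiaolong (王晓龙); Zhang Kai (张凯);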

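[Abstract, translated from the Chinese] Zipf's law is a universal statistical regularity that describes the frequency distribution of English words. We find experimentally that for language units of modern Chinese such as characters, words, and bigrams, the relationship between frequency and rank also approximately follows Zipf's law, which indicates that Zipf's law applies generally to Chinese language units at different levels. This paper confirms experimentally the frequency-rank relationship of Chinese language units that Zipf's law describes, and goes on to discuss in depth its important guiding significance for the techniques of Chinese natural language processing, especially for building statistical computational language models of modern Chinese.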

[Abstract] Zipf's law has been widely studied by linguists and statisticians. The frequency of English words is the most famous example of Zipf's law. In this paper, by means of experiments, we show that Zipf's law also holds for many language structures of Chinese (Chinese characters, Chinese words, Chinese word bigrams, etc.). Zipf's law has a great effect on many technologies of Chinese language processing, especially the construction of Chinese computational language models.
[Keywords] Zipf's law; character frequency; word frequency; bigram frequency;
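The frequency-rank check described above can be sketched as follows. In its simplest form Zipf's law says that frequency is roughly proportional to 1/rank, so a least-squares line fitted to log(frequency) against log(rank) should have a slope near -1. The helper names are hypothetical, and the abstract does not say that the authors measured the fit this way.

```python
import math
from collections import Counter


def ngrams(sequence, n=1):
    """Units of length n: single characters or words for n=1, adjacent pairs for n=2."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def rank_frequency(units):
    """(rank, frequency) pairs, with rank 1 for the most frequent unit."""
    ordered = sorted(Counter(units).values(), reverse=True)
    return list(enumerate(ordered, start=1))


def zipf_slope(pairs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance
```

Applied to a segmented corpus, `rank_frequency(ngrams(words, 1))` covers the word level and `rank_frequency(ngrams(words, 2))` the word-bigram level; passing a raw string instead gives the character level.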



© Chinese Information Processing Society of China 1981-2007