中国中文信息学会
Chinese Information Processing Society of China
Journal of Chinese Information Processing (中文信息学报)
Browse by year and issue (latest data: 1995, No. 4)
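Rational Selection and Assignment of Brevity-Code Characters and Words in Chinese Input Coding (汉语输入编码中简码字、词的合理选配)
[Authors] 韩布新; 任雪松

[Abstract, translated from Chinese] This paper analyzes the brevity-code character tables of several commonly used Chinese character coding schemes and finds many inconsistencies among them. Taking into account a reasonable number of brevity-code characters, memory load, and key assignment, it proposes that the ratio of a character's usage frequency to its word-forming ability grade is a more reasonable criterion than usage frequency alone. Using this criterion and the "Chinese Character Attribute Information Database", 78 brevity-code characters and 120 brevity-code two-character words were identified, and key assignments were made for them for practical use.

[Abstract] The brevity-code character lists of several coding systems in common use are quite different. With respect to number, memory span, and key-position arrangement, a new parameter is proposed in place of character usage frequency: the ratio of a character's usage frequency to its word-forming grade. 78 brevity-code characters and 120 brevity-code two-character words were selected according to this parameter.
[Keywords] brevity code; two-character words; memory load; Chinese character attributes; input coding; word-forming ability; information database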
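
The selection criterion in the abstract above (usage frequency divided by word-forming ability grade) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `CharStats` type, the sample frequencies and grades, and the choice to prefer a higher ratio are hypothetical, since the abstract does not give the data or the direction of the ranking.

```python
from typing import List, NamedTuple


class CharStats(NamedTuple):
    char: str
    usage_freq: float  # hypothetical usage frequency
    word_grade: int    # hypothetical word-forming ability grade (1 = lowest)


def rank_by_ratio(stats: List[CharStats], top_n: int) -> List[str]:
    """Return top_n characters ordered by usage_freq / word_grade, descending.

    Assumption: a higher ratio marks a better brevity-code candidate.
    """
    ranked = sorted(stats, key=lambda s: s.usage_freq / s.word_grade, reverse=True)
    return [s.char for s in ranked[:top_n]]


# Made-up sample values, not taken from the paper or its database.
sample = [
    CharStats("的", 0.040, 2),
    CharStats("一", 0.015, 5),
    CharStats("是", 0.012, 1),
    CharStats("人", 0.008, 4),
]

print(rank_by_ratio(sample, 2))
```

With these invented numbers, a character of moderate frequency but low word-forming grade can outrank a more frequent character that forms many words, which is the effect a pure frequency ranking cannot produce.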



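Design and Implementation of the Chinese Tutoring System CTS (汉语教学系统CTS的设计与实现)
[Authors] 王雍; 陈增武; 王泽兵

[Abstract, translated from Chinese] CTS is a Chinese tutoring system built on a teaching strategy that takes "grammar-structure" as the syllabus and combines it with the functional approach to teaching. This paper describes in detail the organization of the CTS courseware base and the management model of its object management mechanism, and on that basis introduces the design and implementation of CTS.

[Abstract] CTS is a Chinese Tutoring System based mainly on a strategy of forming the courseware around Grammar-Structure, combined with functional instruction. This paper describes in detail the organization of the courseware base and the model of the object management organization, on the basis of which the design and implementation of CTS are introduced.
[Keywords] courseware; hypertext; single-medium object; link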



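Automatic Classification of Chinese Corpora (汉语语料的自动分类)
[Authors] 吴军; 王作英; 禹锋; 王侠

[Abstract, translated from Chinese] The development of corpus linguistics demands ever larger corpora. With the rapid growth of electronic publishing, it has become possible to obtain large amounts of machine-readable text for building large-scale corpora. However, the raw collected text is disorganized and must be classified before further processing, and manual classification is very labor-intensive. This paper presents an automatic corpus classification method. It uses the concept of a corpus correlation coefficient proposed in the paper and exploits the fact that corpora of different classes have different correlation coefficients, achieving 93% accuracy in top-level category classification.

[Abstract] Corpus linguistics requires a very large corpus. A huge amount of computer-readable text for building very large corpora has become available with the rapid development of electronic publishing in recent years. However, the raw texts, which are miscellaneous, must be classified by domain before further processing. Classifying texts is a tedious job for humans. Therefore, we present in this paper a new approach to classifying raw texts automatically. The related coefficient of texts, which is defined in this paper, co...
[Keywords] corpus; corpus classification; correlation coefficient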



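Interlingua-Based Analysis and Generation Mechanisms in the MMT (ODA) Project (MMT(ODA)项目中基于中间语言的分析和生成的机制)
[Authors] 董亦农; 郭锐

[Abstract, translated from Chinese] The multilingual machine translation (MMT) system, developed jointly by Japan together with China, Thailand, Malaysia, and Indonesia, adopts an interlingua approach. This paper briefly outlines the project and describes in some detail the interlingua of the MMT system and its analysis and generation mechanisms. The "concept" is the most basic vocabulary item of the interlingua; this paper attempts to define it rigorously and offers a new formulation of completeness, necessity, independence, and consistency for an ideal concept dictionary. So that machine translation experts can gain an accurate and complete understanding of the system's core from a paper of modest length, the paper attempts to define formally the description language for the system's analysis rules and generation rules. The paper presents the authors' reflections on and summary of MMT, in the hope of benefiting future work in this area.

[Abstract]
[Keywords] machine translation; MMT; interlingua; analysis; generation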



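A Chinese Part-of-Speech Tagging Method Combining Rules and Statistics (规则和统计相结合的汉语词类标注方法)
[Authors] 周强

[Abstract, translated from Chinese] This paper analyzes the phenomenon of multi-category words in Chinese and the difficulties of Chinese part-of-speech tagging. It introduces the strategies of rule-based disambiguation and statistical disambiguation in Chinese part-of-speech tagging, as well as an approach that combines rules with statistics. A software system designed along these lines achieves tagging accuracy of 96.06% on a closed-test corpus and 95.82% on an open-test corpus.

[Abstract] In this paper, we analyze the category ambiguities of Chinese words and introduce the schemes of rule-based disambiguation and statistics-based disambiguation in Chinese corpus tagging. We also propose a method blending rule-based processing with statistics-based processing. Using this method to tag Chinese corpora, we obtain tagging accuracy of 96.06% (closed test) and 95.82% (open test).
[Keywords] Chinese corpus; manual proofreading; automatic tagging; disambiguation; part-of-speech tagging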



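A Cognitive Model of Chinese Characters and the Design of a Shape-Based Coding Scheme (汉字认知模型与形码方案设计)
[Authors] 何克杭

[Abstract, translated from Chinese] Building on an in-depth analysis of the cognitive model by which humans recognize Chinese characters, this paper systematically applies the theory and methods of cognitive psychology to the entire design process of a shape-based Chinese character coding scheme. First, three similarity principles are derived from the cognitive model and used to merge terminal-level character components reasonably. Second, components are classified according to the characteristics of human associative memory. Third, character decomposition rules are formulated using the chunking theory of short-term memory and the theory of Chinese character formation. Fourth, character coding rules are formulated using schema theory. The whole scheme suits the cognitive structure and cognitive characteristics of primary and secondary school students and achieves good regularity, learnability, and speed. Trials in a group of pilot schools show that a coding scheme developed according to cognitive theory is not only easy to learn and remember but can also promote reform of Chinese language teaching in primary and secondary schools and effectively improve its quality and efficiency.

[Abstract] Based on an in-depth analysis of the human cognitive model for recognizing Chinese characters, this paper systematically applies the theory and methods of cognitive psychology to the whole design process of a Chinese character componential coding scheme. First, on the basis of the cognitive model of Chinese characters, "three similarity principles" for the reasonable merging of Chinese character components are presented. Second, according to the characteristics of human associative memory the Chinese character ...
[Keywords] cognitive structure; cognitive model; decomposition; learnability; classification system; cognitive process; basic components; Chinese character coding rules; cognitive psychology; Chinese character recognition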



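Component Combinations: A Latent Structural Level of Chinese Characters (部件组合──潜在的汉字结构层次)
[Authors] 韩布新

[Abstract, translated from Chinese] This paper proposes a latent level in the structure of Chinese characters, the component combination, and statistically analyzes its distribution in the Chinese character coded character set (basic set). The great majority of combinations have low character-forming counts and frequencies, and high-frequency combinations are few. The paper lists 60 high-frequency components found in low-frequency combinations for reference in Chinese character coding and input. Finally, it discusses the significance of component combinations for applications in Chinese teaching and cognitive psychology research.

[Abstract] A latent structural unit of Chinese characters, the combination of Chinese character constituents (CCCC), is introduced. CCCC was found to have an uneven distribution in the Basic Chinese Character Set (GB2312-80). The cognitive effects of CCCC on human cognitive psychology, and their application in Chinese information processing and education, are also discussed.
[Keywords] component combination; Chinese character input; character-forming count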



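Research on Automatic Abstracting Based on Discourse Understanding (基于篇章理解的自动文摘研究)
[Authors] 王建波; 杜春玲; 王开铸

[Abstract, translated from Chinese] Building on natural language understanding, this paper studies automatic abstracting systems. The study proceeds at two levels. The first is based on the relations between head nouns, head verbs, and their modifiers; it filters sentence constituents and extracts the sentence skeleton. The second is the context level: based on the genre structure of the discourse and the semantic links between sentences, it implements an algorithm for producing the base abstract set, an algorithm for expanding the base set, and an algorithm for generating the abstract set. Producing the base abstract set depends on selecting the central paragraph and central sentences; expanding the base set and generating the abstract set depend on tests of fuzzy semantic distance.

[Abstract] The study of automatic abstracting based on natural language understanding is a practical but difficult branch of information processing. It consists of the study of natural language understanding and of automatic abstracting. A text meaning representation (TMR) is designed to meet the requirements of the automatic abstracting system. TMR is an attributed tree structure. A rule-based technique and a semantics-directed ATN technique are employed in sentence meaning analysis. A conjunction matching algorithm has been implemente...
[Keywords] natural language understanding; automatic abstracting; information processing; natural language semantics; fuzzy theory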



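A New Structure for Chinese Electronic Dictionaries (一种汉语电子词典的新结构)
[Authors] 刘东立; 滕永林; 姚天顺

[Abstract, translated from Chinese] The Chinese electronic dictionary is the most basic component of a Chinese machine translation system, and the quality of its organization directly affects the efficiency of the whole system. This paper proposes a storage structure for Chinese dictionaries that saves storage space and supports efficient lookup: a one-level index structure keyed on the leading character. Theoretical derivation and worked examples demonstrate the efficiency and practicality of the structure. The paper also briefly introduces the general organization of dictionaries and compares it with the new structure through examples.

[Abstract] The Chinese electronic dictionary is a basic component of a Chinese machine translation system. Whether the structure of the dictionary is good or bad directly affects the efficiency of the system. Here a dictionary structure for Chinese that saves storage space and looks up words efficiently is given. The structure is called a one-level index dictionary structure using the headword as a key. We have shown, both theoretically and with real examples, that the structure is efficient and practical. We hav...
[Keywords] machine translation; electronic dictionary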
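
The idea of a one-level index keyed on a word's leading character can be sketched in Python as follows. This is an illustration under stated assumptions, not the structure from the paper: the function names `build_index` and `longest_match`, the choice to store only the remainder of each word after its first character, and the longest-first ordering are all hypothetical, because the abstract does not describe the layout in detail.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def build_index(words: Iterable[str]) -> Dict[str, List[str]]:
    """Group words by leading character, storing only the remaining characters."""
    index: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        if word:
            # Assumption: dropping the shared first character saves space.
            index[word[0]].append(word[1:])
    for tails in index.values():
        tails.sort(key=len, reverse=True)  # longest candidates are tried first
    return dict(index)


def longest_match(index: Dict[str, List[str]], text: str, pos: int) -> Optional[str]:
    """Return the longest dictionary word starting at text[pos], or None."""
    for tail in index.get(text[pos], []):
        if text.startswith(tail, pos + 1):
            return text[pos] + tail
    return None


# Hypothetical toy vocabulary, not taken from the paper.
index = build_index(["机器", "机器翻译", "翻译", "词典"])
print(longest_match(index, "机器翻译词典", 0))
print(longest_match(index, "机器翻译词典", 4))
```

A single lookup on the leading character narrows the search to the few words that share it, which is the kind of saving the abstract attributes to the one-level index.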



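Development of an Automatic Postal Code Recognition System (邮政编码自动识别系统的研制)
[Authors] 盛立东; 师春礼

[Abstract, translated from Chinese] Starting from practical application, and after analyzing and resolving problems in preprocessing, this paper develops an automatic postal code recognition system that combines neural networks with the syntactic structure method. The system software is written in C and runs on a 486 microcomputer. In tests on 13,970 scanned samples, the recognition rate was 98.43%, the error rate 0.136%, and the rejection rate 1.43%.

[Abstract] From the standpoint of application, and based on an analysis of the problems, this paper introduces a zip code recognition system that combines neural networks with the syntactic structure method. The software is written in C and runs on a 486 PC. Testing on 13970 digits shows that the recognition rate is 98.43%, the rejection rate is 1.43%, and the error rate is 0.136%.
[Keywords] binarization; pattern recognition; neural network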
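
As a quick consistency check on the figures reported in the abstract above, the three rates should account for essentially all 13,970 test samples. The short Python sketch below does that arithmetic. The sample total and percentages come from the abstract; the per-category counts it prints are estimates obtained by rounding, not figures reported by the authors, and the variable names are illustrative.

```python
total_samples = 13970  # from the abstract
rates_percent = {      # from the abstract
    "recognized": 98.43,
    "misrecognized": 0.136,
    "rejected": 1.43,
}

# The three outcomes are mutually exclusive, so the rates should sum to about 100%.
rate_sum = sum(rates_percent.values())

# Estimated counts implied by the percentages (rounded; an inference, not reported data).
implied_counts = {
    name: round(total_samples * rate / 100) for name, rate in rates_percent.items()
}

print(f"sum of rates: {rate_sum:.3f}%")
print(implied_counts)
```

The rates sum to within rounding error of 100%, and the rounded counts add back up to the 13,970 samples, so the reported figures are mutually consistent.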


