Co-occurrence-based Thesaurus

The general idea underlying the use of term co-occurrence data for thesaurus construction is that words that tend to occur together in documents are likely to have similar, or related, meanings. Co-occurrence data thus provides a statistical method for automatically identifying the semantic relationships that are normally contained in a hand-made thesaurus.

Predicate-Argument-based Thesaurus

This method attempts to construct a thesaurus according to predicate-argument structures. It is based on the idea that there are restrictions on what words can appear in certain environments and, in particular, on what words can be arguments of a certain predicate. For example, a cat may walk or bite, but cannot fly. Each noun may therefore be characterized according to the verbs or adjectives that it occurs with. Nouns may then be grouped according to the extent to which they appear in similar constructions.

Reference: Ad Hoc Retrieval Experiments Using WordNet and Automatically Constructed Thesauri.
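To make the two ideas concrete, here is a minimal Python sketch. It is not the method from the referenced paper: the function names, the toy data, and the choice of the Dice coefficient (for co-occurrence) and cosine similarity (for predicate-argument profiles) are all assumptions made for illustration. It also assumes the documents are already tokenized and that the (predicate, noun) pairs have already been extracted by some parser.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def cooccurrence_counts(documents):
    """Count document frequency per term and per unordered term pair.

    `documents` is an iterable of token lists (assumed pre-tokenized).
    """
    term_counts = Counter()
    pair_counts = Counter()
    for tokens in documents:
        terms = set(tokens)  # count each term once per document
        term_counts.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair_counts[(a, b)] += 1
    return term_counts, pair_counts


def dice(term_a, term_b, term_counts, pair_counts):
    """Dice coefficient: 2 * df(a, b) / (df(a) + df(b))."""
    a, b = sorted((term_a, term_b))
    total = term_counts[a] + term_counts[b]
    return 2 * pair_counts[(a, b)] / total if total else 0.0


def noun_profiles(predicate_noun_pairs):
    """Describe each noun by the predicates it was seen with."""
    profiles = defaultdict(Counter)
    for predicate, noun in predicate_noun_pairs:
        profiles[noun][predicate] += 1
    return profiles


def cosine(p, q):
    """Cosine similarity between two predicate-count profiles."""
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Toy data, invented purely for illustration.
docs = [
    ["cat", "mouse", "pet"],
    ["cat", "mouse"],
    ["plane", "airport"],
    ["cat", "pet"],
]
pairs = [
    ("walk", "cat"), ("bite", "cat"),
    ("walk", "dog"), ("bite", "dog"),
    ("walk", "bird"), ("fly", "bird"),
    ("fly", "plane"),
]

term_counts, pair_counts = cooccurrence_counts(docs)
print("dice(cat, mouse):", dice("cat", "mouse", term_counts, pair_counts))
print("dice(cat, plane):", dice("cat", "plane", term_counts, pair_counts))

profiles = noun_profiles(pairs)
print("cosine(cat, dog):", cosine(profiles["cat"], profiles["dog"]))
print("cosine(cat, plane):", cosine(profiles["cat"], profiles["plane"]))
```

In both cases a thesaurus entry for a word could then be formed from its highest-scoring neighbours. Real systems would normally also filter stop words and use larger corpora, which this sketch leaves out.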
Two methods for Thesaurus Construction
Posted by jeffy
Posted on 7:41 PM