
[1] Wu Chen, Zhang Quan, Jia Ning, et al. Concept-based approach for information retrieval [J]. Journal of Southeast University (English Edition), 2006, 22 (3): 324-329. [doi:10.3969/j.issn.1003-7985.2006.03.007]

Concept-based approach for information retrieval
一种基于概念的信息检索方法

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
22
Issue:
3 (2006)
Page:
324-329
Research Field:
Computer Science and Engineering
Publishing date:
2006-09-30

Info

Title:
Concept-based approach for information retrieval
一种基于概念的信息检索方法
Author(s):
Wu Chen (1, 2), Zhang Quan (1), Jia Ning (1, 2)
1. Graduate School, Chinese Academy of Sciences, Beijing 100039, China
2. Institute of Acoustics, Chinese Academy of Sciences, Beijing 100080, China
吴晨 (1, 2), 张全 (1), 贾宁 (1, 2)
1. 中国科学院研究生院, 北京 100039; 2. 中国科学院声学研究所, 北京 100080
Keywords:
information retrieval; concept; semantic knowledge; content representation
信息检索; 概念; 语义知识; 内容表示
CLC number:
TP393
DOI:
10.3969/j.issn.1003-7985.2006.03.007
Abstract:
A concept-based approach is proposed to resolve word sense ambiguity in information retrieval and to represent the contents of a document by the semantic importance of its concepts rather than by term frequency. To this end, a formalized document framework is proposed. The framework expresses the meaning of a document through its concepts of high semantic importance. It consists of two parts: the “domain” information and the “situation & background” information of a document. An algorithm for extracting the document framework and a two-stage smoothing method are also proposed. The similarity between a query and a document framework is quantified using the smoothing method. Experiments on the TREC6 collection demonstrate the feasibility and effectiveness of the proposed approach in information retrieval tasks. The average recall-level precision of the model using the proposed approach is about 10% higher than that of traditional models.
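(Translated from the Chinese abstract.) To obtain the semantic weight of words in a document, resolve the ambiguity caused by synonymy and polysemy, and improve the efficiency of information retrieval, a concept-based retrieval model is proposed. The model defines a formalized framework for representing text content. The framework consists of two parts, the “domain” information and the “situation and background” information of a document, and is expressed by concepts (formalized semantics). A method for extracting this concept framework is also proposed, together with a two-stage smoothing algorithm for matching the framework against the retrieval request. Experiments on a small-scale corpus provided by TREC6 show that, compared with traditional models, the information retrieval model using the proposed method improves the average recall-level precision by about 10%, which demonstrates the effectiveness and feasibility of an information retrieval system that is built on the described method and uses concepts as its processing intermediary.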
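
The abstract does not give the formulas of the two-stage smoothing method, so the following is only a minimal sketch of the general idea, assuming the standard two-stage language-model smoothing (Dirichlet prior followed by linear interpolation) applied to concept weights instead of term counts. The function names, the toy concept weights, the collection probabilities, and the values of `mu` and `lam` are all hypothetical and are not taken from the paper.

```python
import math

def two_stage_prob(concept, doc_weights, collection_probs, mu=10.0, lam=0.3):
    """Smoothed probability of a concept given a document framework.

    Assumption: doc_weights maps each concept to a non-negative
    semantic-importance weight, used here in place of a term count.
    """
    doc_mass = sum(doc_weights.values())
    p_coll = collection_probs[concept]
    # Stage 1: Dirichlet-prior smoothing of the document model.
    dirichlet = (doc_weights.get(concept, 0.0) + mu * p_coll) / (doc_mass + mu)
    # Stage 2: interpolation with a background model
    # (approximated here by the collection model).
    return (1.0 - lam) * dirichlet + lam * p_coll

def score(query_concepts, doc_weights, collection_probs, mu=10.0, lam=0.3):
    """Log-likelihood of the query concepts under the smoothed document model."""
    return sum(
        math.log(two_stage_prob(c, doc_weights, collection_probs, mu, lam))
        for c in query_concepts
    )

# Hypothetical toy data: a three-concept vocabulary and two documents.
collection_probs = {"retrieval": 0.2, "acoustics": 0.3, "finance": 0.5}
doc_a = {"retrieval": 3.0, "acoustics": 1.0}
doc_b = {"finance": 4.0}
query = ["retrieval", "acoustics"]

score_a = score(query, doc_a, collection_probs)
score_b = score(query, doc_b, collection_probs)
```

In this toy setting the document whose framework contains the query concepts (`doc_a`) receives the higher score, while `doc_b` still receives a finite score because smoothing assigns non-zero probability to concepts it does not contain.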


Memo

Memo:
Biographies: Wu Chen (1979-), male, graduate; Zhang Quan (corresponding author), male, doctor, professor, zhq@mail.ioa.ac.cn.
Last Update: 2006-09-20