Information Technology and Systems

Research on the Realization of Theme Resource Monitoring and Collection Function Based on Deep Learning*

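  • Liu Yanmin (1), Zhang Wangqiang (2), Zhu Zhongming (2), Chen Hongdong (1)
  • 1. Lanzhou University Library
    2. Lanzhou Literature and Information Center, Chinese Academy of Sciences
Liu Yanmin, female, librarian at Lanzhou University Library; research interest: information technology. Zhang Wangqiang, male, librarian at the Lanzhou Literature and Information Center, Chinese Academy of Sciences; research interest: construction of knowledge management systems. Zhu Zhongming, male, research fellow at the Lanzhou Literature and Information Center, Chinese Academy of Sciences; research interest: development of knowledge management systems. Chen Hongdong, male, associate research librarian at Lanzhou University Library; research interest: information management.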

Received date: 2019-04-19

  Online published: 2019-06-12

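Funding

*This article is one of the research outcomes of the Intelligence Innovation Capacity Building Project of the Lanzhou Literature and Information Center, Chinese Academy of Sciences, "Research on the Construction of a Theme Resource Detection Platform Based on Deep Learning with Word Vector Models" (project no. Y7AJ012007).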

Research on the Realization of Theme Resource Monitoring and Collection Function Based on Deep Learning

  • Liu Yanmin, Zhang Wangqiang, Zhu Zhongming, Chen Hongdong

Received date: 2019-04-19

  Online published: 2019-06-12

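Abstract

This article constructs a theme resource monitoring and collection model based on deep learning. It uses the deep learning word vector tool word2vec to train on the collected corpus, matches the collected resources against the theme model by similarity, and sets an appropriate threshold to monitor theme resources automatically. Practice shows that, when applied in the information monitoring system of the Institute of Ocean Strategy, the deep-learning-based fixed-theme monitoring method monitors theme resources automatically with better accuracy than traditional monitoring algorithms based on the vector space model, and can lay a solid foundation for building thematic knowledge bases and domain intelligence monitoring systems.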

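Cite this article

Liu Yanmin, Zhang Wangqiang, Zhu Zhongming, Chen Hongdong. Research on the Realization of Theme Resource Monitoring and Collection Function Based on Deep Learning* [J]. 图书与情报 (Library and Information), 2019, 39(02): 133-140. DOI: 10.11968/tsyqb.1003-6938.2019035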

Abstract

Open knowledge resources on a given theme are usually acquired by intelligence personnel through fixed-source, fixed-point data collection. In the age of big data, however, the number of open access information resources has increased dramatically. To improve the accuracy and recall of automatic monitoring and collection of theme-related resources, and to reduce the workload of intelligence personnel, this paper introduces recent achievements of deep learning from the field of artificial intelligence. A theme resource monitoring and collection model based on deep learning is proposed. The word vector tool word2vec is used to train on the collected corpus, and the resources collected by the theme crawler are matched against the theme model by similarity. Applied in the information monitoring system of the Institute of Ocean Strategy, the proposed deep-learning-based thematic monitoring method monitors theme resources automatically with better accuracy than traditional detection algorithms.
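The similarity-and-threshold step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how documents are represented, which similarity measure is used, or what threshold was chosen, so everything here is an assumption: documents and the theme are represented by the mean of their word vectors, similarity is cosine similarity, and the threshold of 0.8 is arbitrary. The vectors in `TOY_VECTORS` are hand-written hypothetical stand-ins for the output of word2vec training, and the function names are illustrative only.

```python
import math

# Hypothetical toy vectors standing in for trained word2vec embeddings.
# Real embeddings would come from training on the collected corpus.
TOY_VECTORS = {
    "ocean": [0.9, 0.1, 0.0],
    "marine": [0.8, 0.2, 0.1],
    "strategy": [0.7, 0.3, 0.0],
    "policy": [0.6, 0.4, 0.1],
    "football": [0.0, 0.1, 0.9],
    "match": [0.1, 0.0, 0.8],
}


def mean_vector(words, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_on_theme(doc_words, theme_words, vectors, threshold=0.8):
    """Keep a document when its similarity to the theme reaches the threshold.

    The threshold value is an assumed placeholder, not a figure from the paper.
    """
    doc_vec = mean_vector(doc_words, vectors)
    theme_vec = mean_vector(theme_words, vectors)
    if doc_vec is None or theme_vec is None:
        return False
    return cosine(doc_vec, theme_vec) >= threshold


if __name__ == "__main__":
    theme = ["ocean", "strategy"]
    print(is_on_theme(["marine", "policy"], theme, TOY_VECTORS))
    print(is_on_theme(["football", "match"], theme, TOY_VECTORS))
```

In a real pipeline the toy dictionary would be replaced by embeddings trained on the domain corpus, and the threshold would be tuned against labeled samples to balance accuracy and recall.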