Special Topic: Research on Artificial Intelligence Evaluation Systems

Research on Global Artificial Intelligence Evaluation Indicator Systems Based on Ranking Results*

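(1. Department of Information Management, Peking University, Beijing 100871)
(2. Peking University Archives, Beijing 100871)
Wang Chuhan, female, doctoral student, Department of Information Management, Peking University; Li Guangjian, male, professor and doctoral supervisor, Department of Information Management, Peking University; Chen Mo, female, assistant research fellow, Peking University Archives.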

Received date: 2025-08-08

Online published: 2025-09-08

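Funding

*This paper is one of the research outcomes of the National Social Science Fund of China Major Project "Research on the Application of Key Technologies for Intelligent Information Analysis in the Context of Digital-Intelligent Transformation" (Project No. 23&ZD228).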

Research on Global Artificial Intelligence Evaluation Indicator Systems Based on Ranking Performance

Received date: 2025-08-08

  Online published: 2025-09-08

Abstract

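As global competition in artificial intelligence intensifies, countries' rankings fluctuate frequently across evaluation indices. These fluctuations reflect not only changes in national strength but also the way index design logic influences and shapes the results. To examine this influence, the article first selects nine mainstream artificial intelligence evaluation indices as research samples according to five principles: representativeness, professionalism, relevance, transparency, and timeliness. It then classifies the original indicators using the TF-IDF algorithm, obtaining five common dimensions: data, talent, research and development, investment, and strategy. Within this framework, the article compares the scores of 15 major countries on each dimension and tracks each country's rank changes across years and across indices. The study finds that changes in national rankings are a response to adjustments in indicator weights and scoring granularity, and that index design embeds clear technological concerns and value orientations; the strategy dimension is highly sensitive to policy signals, while the other dimensions change more gradually. The article proposes that a shared five-dimensional structure, combined with weight adjustment and an annual rolling mechanism, can achieve a dynamic balance between international comparability and local adaptation, offering a methodological reference for building a general yet flexible artificial intelligence assessment framework.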

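Cite this article

Wang Chuhan, Li Guangjian, Chen Mo. Research on Global Artificial Intelligence Evaluation Indicator Systems Based on Ranking Results*[J]. Library and Information (图书与情报), 2025, 45(04): 23-35. DOI: 10.11968/tsyqb.1003-6938.2025043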

Abstract

Against the backdrop of intensifying global competition in artificial intelligence (AI), national rankings fluctuate frequently across evaluation indices. These fluctuations reflect not only shifts in national capabilities but also the way index design logic influences and shapes assessment outcomes. To examine this effect, the paper first selects nine mainstream AI evaluation indices as research samples according to five principles: representativeness, professionalism, relevance, transparency, and timeliness. The original indicators are then categorized using the TF-IDF algorithm, yielding five common dimensions: Data Infrastructure, Talent Pool, R&D Capacity, Investment & Deployment, and Strategy & Governance. Within this framework, the paper compares the scores of 15 leading countries across the dimensions and tracks their rank changes across indices and years. The findings are twofold. First, changes in national rankings respond to adjustments in indicator weights and scoring granularity, and index designs embed distinct technological priorities and value orientations. Second, the Strategy & Governance dimension is highly sensitive to policy signals, while the other dimensions change more gradually. The paper proposes a shared five-dimensional structure combined with weight adjustment and an annual rolling revision mechanism. This approach can strike a dynamic balance between international comparability and local adaptation, and it offers a methodological reference for building general yet flexible AI assessment frameworks.
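
The claim that rankings respond to indicator weights can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three placeholder countries, their scores, the two weighting schemes, and the weighted-average aggregation are hypothetical and are not taken from the paper or from any of the nine indices. Only the five dimension names follow the abstract.

```python
# Illustrative sketch only: all scores and weights below are invented.
DIMENSIONS = ["data", "talent", "r_and_d", "investment", "strategy"]

# Hypothetical dimension scores on a 0-100 scale.
scores = {
    "Country A": {"data": 80, "talent": 70, "r_and_d": 90, "investment": 85, "strategy": 50},
    "Country B": {"data": 70, "talent": 75, "r_and_d": 70, "investment": 60, "strategy": 95},
    "Country C": {"data": 60, "talent": 65, "r_and_d": 60, "investment": 55, "strategy": 70},
}


def composite(country_scores, weights):
    """Weighted average of the five dimension scores (weights are normalized)."""
    total_weight = sum(weights[d] for d in DIMENSIONS)
    return sum(country_scores[d] * weights[d] for d in DIMENSIONS) / total_weight


def rank(all_scores, weights):
    """Return country names sorted from highest to lowest composite score."""
    return sorted(all_scores, key=lambda c: composite(all_scores[c], weights), reverse=True)


# Scheme 1 (assumed): every dimension counts equally.
equal_weights = {d: 1.0 for d in DIMENSIONS}
# Scheme 2 (assumed): the strategy dimension is weighted four times as heavily.
strategy_heavy = {**equal_weights, "strategy": 4.0}

ranking_equal = rank(scores, equal_weights)
ranking_strategy = rank(scores, strategy_heavy)

print("equal weights: ", ranking_equal)
print("strategy-heavy:", ranking_strategy)
```

With the same underlying scores, Country A leads under equal weights while Country B leads once the strategy dimension is weighted more heavily, so the order changes without any change in measured capability.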