Study on Topic Modeling of Artificial Intelligence Application Scenario Based on BERTopic

Received date: 2025-06-13

Online published: 2025-07-23

Abstract

Against the backdrop of China's vigorous promotion of artificial intelligence (AI) application scenarios, this study uses BERTopic to examine topic patterns in AI deployment contexts. First, 3,524 news articles were collected from The Paper (Pengpai News) and preprocessed. For topic modeling, the Conan-embedding-v1 pre-trained large model was used for text embedding, followed by dimensionality reduction with UMAP, clustering with HDBSCAN, and topic representation with c-TF-IDF. Topic keywords were then refined with KeyBERT-based optimization. In the topic analysis, keyword distributions were examined across 11 domains: technological R&D, cultural digitization, regional economic collaboration, economic development, financial innovation, capital markets, healthcare and elderly care, policy coordination, low-altitude economy, urban development, and news dissemination. Similarity analysis revealed strong inter-topic correlations: urban development was highly similar to economic development, financial innovation, and policy coordination, while economic development aligned closely with financial innovation and policy coordination. Hierarchical clustering and document distribution analysis indicated that technological R&D, cultural digitization, and policy coordination integrate with other topic areas to varying degrees. To some extent, this research clarifies the current landscape, latent demands, and interconnected elements of disruptive AI applications.
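
The topic representation and similarity steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not call BERTopic itself: it assumes the published c-TF-IDF weighting, W(t, c) = tf(t, c) * log(1 + A / f(t)), with raw (unnormalized) term counts, and it compares topics by cosine similarity of their c-TF-IDF vectors. The toy corpus, the topic labels, and the helper names (`c_tf_idf`, `top_terms`, `cosine`) are all hypothetical.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Return {topic: {term: weight}} using W = tf * log(1 + A / f_t).

    docs_by_topic maps a topic label to a list of tokenized documents.
    All documents of a topic are treated as one concatenated class document.
    """
    # tf(t, c): raw count of term t in the class document of topic c
    tf = {
        topic: Counter(tok for doc in docs for tok in doc)
        for topic, docs in docs_by_topic.items()
    }
    # f(t): count of term t across all classes
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # A: average number of words per class
    avg_words = sum(total.values()) / len(tf)
    return {
        topic: {t: n * math.log(1 + avg_words / total[t]) for t, n in counts.items()}
        for topic, counts in tf.items()
    }


def top_terms(term_weights, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(term_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: two topics, two tokenized documents each.
toy_corpus = {
    "finance": [["ai", "bank", "credit"], ["bank", "risk", "ai"]],
    "health": [["ai", "hospital", "diagnosis"], ["hospital", "elderly", "care"]],
}

weights = c_tf_idf(toy_corpus)
for topic, term_weights in weights.items():
    print(topic, top_terms(term_weights))
print("similarity:", cosine(weights["finance"], weights["health"]))
```

In this sketch a term shared by both topics ("ai") is down-weighted relative to topic-specific terms, which is the behavior c-TF-IDF is meant to provide. In the study itself, the clusters would come from HDBSCAN applied to UMAP-reduced embeddings rather than from hand-assigned labels.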

Cite this article

Liu Xiangbei, Yan Yalan, Zha Xianjin. Study on Topic Modeling of Artificial Intelligence Application Scenario Based on BERTopic[J]. Library & Information, 2025, 45(03): 46-55. DOI: 10.11968/tsyqb.1003-6938.2025032
