Metadata Management System for Data Lakes: Requirements Analysis, Functional Architecture, and Future Directions

Received date: 2025-01-03

Published online: 2025-02-25

Abstract

As global data production grows exponentially, traditional data management systems struggle to handle massive, diverse, and real-time data. Data lakes, which serve as large repositories for raw data, have emerged as essential tools for managing data of varying types and scales. Effective metadata management is crucial to prevent data lakes from deteriorating into data swamps. Focusing on the data lifecycle within data lakes, this paper examines metadata management requirements, categorizes the types of metadata found in data lakes, and provides a comprehensive analysis of metadata architectures across various fields. It then synthesizes current data lake metadata architectures and outlines the core functions of metadata management systems, highlighting their critical role in data lake ecosystems. By clarifying how data lakes operate and the logic of metadata management, the discussion aims to help address growing data management challenges.

Cite this article

Zhang Guixiang, Jia Junzhi, Xue Pengzhen. Metadata Management System for Data Lakes: Requirements Analysis, Functional Architecture, and Future Directions[J]. Library & Information, 2025, 45(01): 106-116. DOI: 10.11968/tsyqb.1003-6938.2025011
