Introduction

Big Data has significantly advanced research across many fields, and healthcare research has benefited greatly. By combining Big Data with predictive analytics, researchers can perform sophisticated analyses and generate actionable insights. The three Vs of Big Data (volume, velocity, and variety) have raised healthcare research to a new level [1]. Massive amounts of data are generated at high frequency and contain a wide array of attributes. Sophisticated data science algorithms can now be implemented on open-source platforms, with many free training resources available to the research community [2].

All research projects begin with a research question, which lays the groundwork for developing and implementing a detailed and conclusive analysis. In the conception phase, the viability of the research project is determined and the dataset is chosen. This phase also reveals the study's limitations, which generally stem from the availability of data in terms of timeline, predictors, geography, and other factors. The conception phase has many implications for the project’s final result, including but not limited to selecting and organizing a suitable dataset, developing a data analysis strategy that encompasses data cleaning, preprocessing, and modeling, and ultimately presenting the research findings persuasively [3]. This study introduces the OnetoMap meta-data repository, a centralized inventory developed to enhance healthcare research by providing detailed descriptions of various datasets, enabling efficient dataset linkage, and promoting collaborative research.