Leveraging hybrid database models for enhanced gene-disease association analysis
DOI: https://doi.org/10.47831/mjpas.v3i2.325
Keywords: GDA, graph database, semi-structured data, TBGA
Abstract
Many diseases are driven by genetic variations. Gene-disease association (GDA) datasets, structured as networks, capture the relationships between genes and diseases. A GDA dataset typically consists of semi-structured data that does not conform to a tabular format. In this work, we propose a hybrid approach for processing, storing, and analyzing TBGA, a GDA dataset comprising over 200,000 JSON instances and 100,000 gene-disease pairs. We introduce two procedures that import the TBGA dataset into a relational model and a graph model, respectively. SQL Server is employed for the relational model to support analytical and reporting tasks, while Neo4j is used for the graph model to enable visualization and the application of graph algorithms tailored for GDA analysis. Experimental results demonstrate the effectiveness of each model, with SQL Server excelling in analytical tasks and Neo4j in visualization and graph analysis.
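The idea of loading the same JSON instances into two models can be illustrated with a minimal sketch. It is not the authors' import procedure: the field names (`gene`, `disease`, `relation`) and the sample records are hypothetical placeholders rather than the TBGA schema, an in-memory SQLite table stands in for SQL Server, and a plain adjacency map stands in for Neo4j.

```python
import json
import sqlite3
from collections import defaultdict

# Hypothetical JSON-lines input. Field names and values are placeholders,
# not the real TBGA schema or real gene-disease associations.
SAMPLE_JSONL = """\
{"gene": "GENE_A", "disease": "DISEASE_X", "relation": "REL_1"}
{"gene": "GENE_A", "disease": "DISEASE_Y", "relation": "REL_2"}
{"gene": "GENE_B", "disease": "DISEASE_X", "relation": "REL_1"}
{"gene": "GENE_A", "disease": "DISEASE_X", "relation": "REL_2"}
"""


def parse_instances(jsonl_text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def load_relational(instances):
    """Flatten instances into one table (SQLite as a stand-in for SQL Server)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gda (gene TEXT, disease TEXT, relation TEXT)")
    conn.executemany(
        "INSERT INTO gda VALUES (:gene, :disease, :relation)", instances
    )
    return conn


def load_graph(instances):
    """Build an undirected adjacency map (a stand-in for a graph database)."""
    graph = defaultdict(set)
    for inst in instances:
        graph[inst["gene"]].add(inst["disease"])
        graph[inst["disease"]].add(inst["gene"])
    return graph


instances = parse_instances(SAMPLE_JSONL)

# Relational side: an aggregate report, counting distinct diseases per gene.
conn = load_relational(instances)
pairs_per_gene = conn.execute(
    "SELECT gene, COUNT(DISTINCT disease) FROM gda GROUP BY gene ORDER BY gene"
).fetchall()

# Graph side: a neighborhood lookup, listing the genes linked to one disease.
graph = load_graph(instances)
genes_for_disease_x = sorted(graph["DISEASE_X"])
```

The relational table suits aggregate queries such as the per-gene count, while the adjacency map answers neighborhood questions directly, which mirrors the division of labor the abstract describes.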
License
Copyright (c) 2025 Sama Salam Samaan, Saja Dheyaa Khudhur, Omar Nowfal Mohammed Tahe

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.