Leveraging hybrid database models for enhanced gene-disease association analysis

Sama Salam Samaan; Saja Dheyaa Khudhur; Omar Nowfal Mohammed Tahe

doi:10.47831/mjpas.v3i2.325

Published: 2025-03-30

Issue: Vol. 3 No. 2 (2025)

Section: Articles

License: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Computer Engineering Department, University of Technology

Keywords: GDA, graph database, semi-structured data, TBGA

Abstract

Many diseases are driven by genetic variations. Gene-Disease Association (GDA) datasets, which are structured as networks, capture the relationships between genes and diseases. GDA data is typically semi-structured and does not conform to a tabular format. In this work, we propose a hybrid approach for processing, storing, and analyzing TBGA, a GDA dataset comprising over 200,000 JSON instances and 100,000 gene-disease pairs. We introduce two procedures that import the TBGA dataset into a relational model and a graph model, respectively. SQL Server hosts the relational model to support analytical and reporting tasks, while Neo4j hosts the graph model to enable visualization and the application of graph algorithms tailored for GDA analysis. Experimental results demonstrate the effectiveness of each model, with SQL Server excelling in analytical tasks and Neo4j in visualization and graph analysis.
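The dual-import idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: Python's built-in `sqlite3` stands in for SQL Server, a plain adjacency map stands in for Neo4j, and the record layout (`text`, `relation`, `h`, `t`), the identifiers, and the function names `load_relational` and `build_graph` are all assumptions made for illustration.

```python
import json
import sqlite3
from collections import defaultdict

# Hypothetical records: the field names and IDs are assumptions,
# not the confirmed TBGA schema.
SAMPLE_LINES = [json.dumps(r) for r in [
    {"text": "GENE_A is associated with DISEASE_X.", "relation": "biomarker",
     "h": {"id": "G1", "name": "GENE_A"}, "t": {"id": "D1", "name": "DISEASE_X"}},
    {"text": "GENE_A is altered in DISEASE_Y.", "relation": "genomic_alterations",
     "h": {"id": "G1", "name": "GENE_A"}, "t": {"id": "D2", "name": "DISEASE_Y"}},
    {"text": "GENE_B is a target in DISEASE_X.", "relation": "therapeutic",
     "h": {"id": "G2", "name": "GENE_B"}, "t": {"id": "D1", "name": "DISEASE_X"}},
]]


def load_relational(lines, conn):
    """Normalize JSON instances into gene, disease, and association tables."""
    conn.executescript("""
        CREATE TABLE gene (gene_id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE disease (disease_id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE association (
            gene_id TEXT, disease_id TEXT, relation TEXT, sentence TEXT
        );
    """)
    for line in lines:
        rec = json.loads(line)
        gene, disease = rec["h"], rec["t"]
        # Entities repeat across instances, so duplicates are skipped.
        conn.execute("INSERT OR IGNORE INTO gene VALUES (?, ?)",
                     (gene["id"], gene["name"]))
        conn.execute("INSERT OR IGNORE INTO disease VALUES (?, ?)",
                     (disease["id"], disease["name"]))
        conn.execute("INSERT INTO association VALUES (?, ?, ?, ?)",
                     (gene["id"], disease["id"], rec["relation"], rec["text"]))
    conn.commit()


def build_graph(lines):
    """Build a gene -> {disease} adjacency map from the same instances."""
    graph = defaultdict(set)
    for line in lines:
        rec = json.loads(line)
        graph[rec["h"]["id"]].add(rec["t"]["id"])
    return graph


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_relational(SAMPLE_LINES, conn)
    # Relational side: a reporting-style aggregate per gene.
    query = ("SELECT gene_id, COUNT(DISTINCT disease_id) "
             "FROM association GROUP BY gene_id ORDER BY gene_id")
    for row in conn.execute(query):
        print(row)
    # Graph side: neighbourhood lookup for one gene.
    print(sorted(build_graph(SAMPLE_LINES)["G1"]))
```

The relational tables suit aggregate reporting queries, while the adjacency map suits neighbourhood and traversal questions, which mirrors the division of labor the abstract attributes to SQL Server and Neo4j.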

Mustansiriyah Journal of Pure and Applied Sciences
