Development of a Local Data-Warehouse for database integration and analysis: Uncovering the sequence-activity relations

Author

Novoa Rojas, Yago 

Abstract

Databases have existed since the 1960s. With the information boom of the genome project, many specialised databases came to light; around that time relational databases appeared, and they have been a staple of data management ever since. This thesis aims to utilise a non-relational data warehouse and a relational database server management system to store and visualise information coming from different heterogeneous databases. The main purpose of the study was to develop and implement a working pipeline to extract and integrate the data, and to perform screening and analysis routines on the retrieved data in order to facilitate access to it for the learning community. The integration of heterogeneous databases required the design of an easily maintainable database and a user-friendly visualisation application. After studying different approaches, PHP and MySQL were chosen for visualisation and database management, as each offers advantages in its respective field of use. The data warehouse, SQL server and visualisation application were successfully built, and the resulting system is accessible to users.
The work described in this thesis is a step towards the integration of multiple biological databases into a single unified database, making use of in silico programming together with data mining and data integration routines.

 

Director

Biarnés Fontal, Xavier

Degree

IQS SE - Master’s Degree in Bioengineering

Date

2020-07-14