Author: Laura Masa Martínez

This repository documents the work conducted for my master’s thesis, titled "Analyzing Gene Expression Datasets for Disease Annotation Modeling".

A visual summary of the entire workflow and main steps of the project is shown below.


**Objectives**
The principal objective of this thesis was to design and implement an automated system for the extraction, processing, and analysis of gene-disease association data. This system was developed to generate a detailed gene-disease annotation model, including gene identifiers, gene expression profiles, and comprehensive metadata. The automation of these processes was intended to provide a structured and efficient platform to support biomedical research efforts and the identification of novel therapeutic targets.

In addition to the main objective, the thesis aimed to achieve several secondary goals:

1.  Differential Gene Expression Analysis: To create a Personalized Perturbation Profile (PEEP) that captures gene expression variations for individual- and group-level comparisons.
2.  Advancing Personalized Medicine: To leverage PEEPs to discover tailored therapeutic interventions and to explore opportunities for drug repositioning.
3.  Modeling Disease-Gene Associations: To enhance the semantic understanding and predictive accuracy of gene-disease relationships through advanced modeling techniques.
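
To make the idea of an individual-level perturbation profile more concrete, here is a minimal sketch. This README does not describe how the profiles are actually computed, so everything in it is an assumption: it uses a z-score of one individual's expression against a control group, and the function name `perturbation_profile`, the input dictionaries, the gene names, and the 2.5 threshold are all hypothetical and do not come from the repository.

```python
from statistics import mean, stdev


def perturbation_profile(case, controls, threshold=2.5):
    """Return {gene: z} for genes whose |z-score| against controls exceeds threshold.

    case:      {gene: expression value} for one individual (hypothetical format)
    controls:  {gene: [expression values across control samples]}
    threshold: assumed cutoff on |z|; not taken from the repository
    """
    profile = {}
    for gene, value in case.items():
        ref = controls.get(gene)
        if not ref or len(ref) < 2:
            continue  # need at least two control samples to estimate spread
        sd = stdev(ref)
        if sd == 0:
            continue  # z-score is undefined when the controls do not vary
        z = (value - mean(ref)) / sd
        if abs(z) > threshold:
            profile[gene] = z
    return profile


# Toy data with made-up gene names and values, for illustration only.
controls = {
    "GENE_A": [5.0, 5.2, 4.8, 5.1, 4.9],
    "GENE_B": [7.0, 7.1, 6.9, 7.0, 7.2],
}
case = {"GENE_A": 8.0, "GENE_B": 7.05}

peep = perturbation_profile(case, controls)
print(sorted(peep))  # genes flagged as perturbed in this individual
```

In this toy input only `GENE_A` is flagged, because the individual's value lies far outside the control distribution, while `GENE_B` stays within it. A group-level comparison could then count how often each gene is flagged across individuals.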


**Folder structure**

| Folder | Content |
| ------ | ------ |
| data_processing | Code and datasets associated with the initial phase of the project. It includes resources for **collecting gene expression data**, **preprocessing the data**, and **selecting relevant subsets** for subsequent analysis. |
| data_analysis | Code, data, and figures used for the **analysis and visualization of gene expression data**. This includes performing **differential gene expression analysis**, generating descriptive statistics, and visualizing the results of both group-wise and individual-level gene regulation studies. | 
| analysis_drug_repurposing | Code and data used for analyzing drug-target relationships and visualizing potential treatments based on the findings from the gene-disease association data. |