Commit 06d20565 authored by Andrea

Merge remote-tracking branch 'origin/master'

parents ec93de06 be40d150
# Decoding cell-type-specific alterations in Alzheimer's disease through scRNA-seq and network analysis
**Authors**
Andrea Álvarez-Pérez, Lucía Prieto-Santamaría, Alejandro Rodríguez-González
## Objective
This study examines whether gene expression changes in Alzheimer's disease, relative to healthy individuals, vary by cell type. Using differentially expressed genes (DEGs) from single-cell RNA sequencing (scRNA-seq), we built cell-type-specific protein-protein interaction (PPI) networks for eight major brain cell types.
## Structure of the repository
### Code
Scripts and Jupyter notebooks used to carry out this study.
### Data
Directory that contains the initial data used as the starting point for this study, as well as the intermediate files and the results generated during the analysis.
### Figures
Images obtained as a result of the research.
# CODE AND ANALYSIS
## Repository content
| Script name | Usage |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------|
| [scrna_ppi_analysis.ipynb](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/code/scrna_ppi_analysis.ipynb)                   | Jupyter notebook that loads the h5ad data, preprocesses it, runs the differential expression analysis, and builds the cell-type-specific PPI subnetworks that overlap the main Alzheimer's disease module. It also runs the subsequent analyses and generates the plots. |
### Functions folder
| Script name | Used in: | Usage |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|
| [functions.py](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/code/functions/functions.py)                             | [scrna_ppi_analysis.ipynb](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/code/scrna_ppi_analysis.ipynb) | Functions used to obtain the Alzheimer's disease module, compute the random modules for the statistical validation of each cell-type-specific PPI network, and plot the results. |
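
As a rough illustration of how a disease module can be obtained, the sketch below extracts the largest connected component (LCC) of a set of seed proteins within an interactome. It assumes that the "main" module is the LCC of the disease-associated proteins, which the `is_in_LCC` flag in the results suggests but this README does not state outright. The function name, the adjacency-dictionary representation and the toy protein identifiers are all hypothetical and are not taken from `functions.py`.

```python
from collections import deque


def largest_connected_component(adjacency, seeds):
    """Return the largest connected component of the subgraph induced by `seeds`.

    `adjacency` maps each protein to the set of its interaction partners.
    Seeds that are absent from the interactome are ignored.
    """
    unvisited = set(seeds) & set(adjacency)
    best = set()
    while unvisited:
        start = unvisited.pop()
        component = {start}
        queue = deque([start])
        # Breadth-first search restricted to seed proteins.
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(best):
            best = component
    return best


# Toy interactome with made-up protein identifiers.
toy_interactome = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3"},
    "P5": {"P6"},
    "P6": {"P5"},
}

# "P9" is not in the interactome and "P4" is not a seed, so neither is included.
module = largest_connected_component(
    toy_interactome, ["P1", "P2", "P3", "P5", "P6", "P9"]
)
print(sorted(module))  # ['P1', 'P2', 'P3']
```

On a real interactome the same idea is usually expressed with a graph library; the plain-dictionary version is used here only to keep the example dependency-free.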
# DATA
## Structure of the folders
| Subfolder name | Description |
|---------------------------|------------------------------------------------------------------|
| disnet                    | Data from the DISNET database, which stores relationships between biological entities (disease-gene, gene-protein...), used to compute disease modules and gene-protein mappings. |
| complete                  | Data obtained from the differential expression analysis and the construction of cell-type-specific PPI networks using all DEGs identified in the original dataset. |
| filtered                  | Data derived from the complete datasets, keeping only those DEGs whose encoded proteins belong to the main Alzheimer's disease module. |
| results                   | Results obtained from the analysis. |
# COMPLETE DATA
This folder stores the data obtained from the original h5ad dataset after the differential expression analysis.
## Structure of the folders
| File name | Description |
|---------------------------------------|------------------------------------------------------------------|
| degs_{cell type}_total.csv            | Stores all differentially expressed genes (DEGs) found for each cell type, identified by ENSEMBL ID. Includes the logfoldchange, p-value, adjusted p-value and score of each DEG. |
| degs_{cell type}_mapped.csv           | Stores all the differentially expressed genes (DEGs) for each cell type, identified by Protein Accession Number, Gene Entrez ID, gene symbol, and ENSEMBL ID. Only DEGs that encode proteins were kept after this mapping step. Includes the logfoldchange, p-value, adjusted p-value and score of each DEG. |
| graphs/{cell_type}_network.graphml    | Stores a cell-type-specific PPI network built from all the DEGs that were mapped to proteins, regardless of their overlap with the main Alzheimer's disease module. Each node is identified by its protein_id and annotated with the corresponding DEG's logfoldchange, p-value and adjusted p-value. |
# FILTERED DATA
This folder stores the data obtained by intersecting the complete, protein-mapped DEG dataset with the main Alzheimer's disease module.
## Structure of the folders
| File name | Description |
|---------------------------------------|------------------------------------------------------------------|
| degs_{cell type}_mapped_filt.csv      | Stores all the differentially expressed genes (DEGs) for each cell type whose encoded proteins belong to the main Alzheimer's disease module, identified by Protein Accession Number, Gene Entrez ID, gene symbol, and ENSEMBL ID. Only DEGs that encode proteins were kept after the mapping step. Includes the logfoldchange, p-value, adjusted p-value and score of each DEG. |
| graphs/{cell_type}_network.graphml    | Stores a cell-type-specific PPI network built from the DEGs that were mapped to proteins belonging to the main Alzheimer's disease module. Each node is identified by its protein_id and annotated with the corresponding DEG's logfoldchange, p-value and adjusted p-value. |
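
The filtering step described above amounts to an intersection between the mapped DEG table and the set of module proteins. The sketch below shows one way this could look using only the standard library. The column names (`protein_id`, `gene_symbol`, `logfoldchange`, `pvals_adj`), the helper `filter_degs_to_module` and every value in the toy table are assumptions made for illustration; the actual files may use different headers.

```python
import csv
import io


def filter_degs_to_module(rows, module_proteins, protein_column="protein_id"):
    """Keep only the DEG rows whose protein belongs to the disease module."""
    module_proteins = set(module_proteins)
    return [row for row in rows if row[protein_column] in module_proteins]


# Toy stand-in for a degs_{cell type}_mapped.csv file (all values are made up).
mapped_csv = io.StringIO(
    "protein_id,gene_symbol,logfoldchange,pvals_adj\n"
    "PROT_A,GENE_A,1.20,0.001\n"
    "PROT_B,GENE_B,-0.85,0.030\n"
    "PROT_C,GENE_C,0.40,0.045\n"
)
mapped_rows = list(csv.DictReader(mapped_csv))

# Hypothetical membership of the main disease module.
module_proteins = {"PROT_A", "PROT_C", "PROT_Z"}

filtered_rows = filter_degs_to_module(mapped_rows, module_proteins)

# Write the filtered table, mirroring a degs_{cell type}_mapped_filt.csv file.
filtered_csv = io.StringIO()
writer = csv.DictWriter(filtered_csv, fieldnames=list(mapped_rows[0].keys()))
writer.writeheader()
writer.writerows(filtered_rows)

print([row["gene_symbol"] for row in filtered_rows])  # ['GENE_A', 'GENE_C']
```

In practice the in-memory `io.StringIO` objects would be replaced by the real file paths for each cell type.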
# ANALYSIS RESULTS
## Data summary
| File | Nº columns | Nº rows | Description |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| [cell_type_summary_stats.tsv](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/data/results/cell_type_summary_stats.tsv) | 10         | 9       | Contains the total number of DEGs, how many of them encode known proteins, the number of these proteins included in the main Alzheimer's disease module, and the size of the cell-type-specific PPI subnetwork module together with its statistical significance. Significance was assessed by selecting random node sets with the same degree distribution as the seed genes, repeating this process 1,000 times. From the resulting distribution, a z-score was calculated, followed by a p-value and an FDR-corrected adjusted p-value. |
| [network_analysis.csv](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/data/results/network_analysis.csv) | 12 | 3,240 | Contains all proteins belonging to at least one cell-type-specific PPI subnetwork, the corresponding Protein Accession Number, Gene Entrez ID, Gene Symbol, Cell type, logfoldchange of the corresponding DEGs, p-value, adjusted p-value, and the value of several network metrics (degree, betweenness centrality, closeness centrality, clustering coefficient, is hub) measured for the full interactome. |
| [G_ppi_analysis.csv](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/data/results/G_ppi_analysis.csv) | 12 | 16,928 | Contains all the unique proteins present in the interactome and the calculated values of several network metrics (degree, betweenness centrality, closeness centrality, clustering coefficient, is hub) and a flag is_in_LCC which is True when the protein is present in the main Alzheimer's disease module. |
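
The statistical validation summarised for `cell_type_summary_stats.tsv` can be sketched as follows: the observed module size is compared with the sizes obtained from random node sets that have the same degrees as the seeds, and a z-score is derived from that null distribution. This is a minimal sketch under several assumptions: module size is taken to be the size of the largest connected component, degrees are matched exactly (real interactomes often require degree binning), and the function names and toy ring network are hypothetical rather than taken from the repository code.

```python
import random
import statistics
from collections import defaultdict, deque


def lcc_size(adjacency, nodes):
    """Size of the largest connected component induced by `nodes`."""
    unvisited = set(nodes)
    largest = 0
    while unvisited:
        start = unvisited.pop()
        size = 1
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    size += 1
                    queue.append(neighbour)
        largest = max(largest, size)
    return largest


def sample_degree_matched(adjacency, seeds, rng):
    """Draw a random node set with exactly the same degrees as `seeds`."""
    by_degree = defaultdict(list)
    for node in sorted(adjacency):
        by_degree[len(adjacency[node])].append(node)
    needed = defaultdict(int)
    for seed in seeds:
        needed[len(adjacency[seed])] += 1
    sample = []
    for degree, count in sorted(needed.items()):
        sample.extend(rng.sample(by_degree[degree], count))
    return sample


def lcc_z_score(adjacency, seeds, n_random=1000, rng_seed=0):
    """Compare the observed LCC size with a degree-preserving null model."""
    rng = random.Random(rng_seed)
    observed = lcc_size(adjacency, seeds)
    random_sizes = [
        lcc_size(adjacency, sample_degree_matched(adjacency, seeds, rng))
        for _ in range(n_random)
    ]
    mean = statistics.mean(random_sizes)
    sd = statistics.pstdev(random_sizes)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    return observed, mean, sd, z


# Toy network: a ring of 20 nodes, so every node has degree 2.
names = [f"N{i:02d}" for i in range(20)]
ring = {
    name: {names[(i - 1) % 20], names[(i + 1) % 20]}
    for i, name in enumerate(names)
}

# Five consecutive nodes form a fully connected seed set.
seeds = names[:5]
observed, mean, sd, z = lcc_z_score(ring, seeds, n_random=1000)
print(f"observed LCC = {observed}, random mean = {mean:.2f}, z = {z:.2f}")
```

A p-value and an FDR-adjusted p-value across cell types would then be derived from these z-scores; that step is omitted here because this README does not specify which test or correction procedure was used.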