From 592548cca68eb4983dfb31472d3215bd44fcbb8c Mon Sep 17 00:00:00 2001
From: Andrea Alvarez Perez
Date: Thu, 3 Apr 2025 11:45:33 +0000
Subject: [PATCH] Add README.md

---
 data/raw/README.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 data/raw/README.md

diff --git a/data/raw/README.md b/data/raw/README.md
new file mode 100644
index 0000000..60e99f1
--- /dev/null
+++ b/data/raw/README.md
@@ -0,0 +1,19 @@
+# RAW DATA
+
+This directory contains the data for the nodes and links used in this research.
+
+## Single-cell RNA-seq original dataset
+
+| Data | Description |
+|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| cross-dementia.h5ad | Single-cell RNA-seq dataset of 432,555 cells in h5ad format, used for subsequent differential expression analysis. Publicly available from [CellXGene](https://cellxgene.cziscience.com/collections/c53573b2-eff4-4c5e-9ad0-b24d422dfd9b) and Synapse (syn52074156). It includes samples from Alzheimer's disease, frontotemporal dementia, progressive supranuclear palsy, and normal controls, with 29,968 genes measured per cell, collected from three brain regions and classified into 9 cell types. |
+
+
+## Biological entities and associations
+
+| Data | Description | Identifier | Total | Source | Accessed date |
+|-----------------------------------|------------------------------------------------------------------------------------|----------------------------------------------------------------|--------|-----------------------------------------------------------------------------------|----------------------------|
+| [Genes (gen.tsv)](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/data/raw/gen.tsv) | Gene data, including symbol and identifier | National Center for Biotechnology Information (NCBI) identifier | 26,181 | [NCBI](https://www.ncbi.nlm.nih.gov/) | May 2024 |
+| Disease – Gene (dis_gen.tsv) | Associations between diseases and genes whose mutation triggers the disease. Due to its size, the file is stored at https://drive.upm.es/s/gd1Hw0PD3DXH8BQ | UMLS CUI – NCBI identifier | 1,045,745 | [DisGeNET](https://www.disgenet.org/) | May 2024 |
+| [Gene – Protein (gen_pro.tsv)](https://medal.ctb.upm.es/internal/gitlab/disnet/network-medicine/network-medicine-and-single-cell-for-alzheimer/new/master/data/raw/gen_pro.tsv) | Associations between genes and the proteins they encode | NCBI identifier – UniProt accession number | 16,460 | [DisGeNET](https://www.disgenet.org/) | May 2024 |
+| Protein – Protein (pro_pro.tsv) | Associations between proteins that physically interact with each other | UniProt accession number – UniProt accession number | 439,863 | [DisGeNET](https://www.disgenet.org/) | May 2020 |
-- 
2.24.1