diff --git a/data/README.md b/data/README.md
index b9c1967baac5db829cf3d487f6be51fdf2f3411c..f3ab5ac99c13a2f5f990b37e267a310f797628a1 100644
--- a/data/README.md
+++ b/data/README.md
@@ -3,23 +3,21 @@ This directory contains the data referring to the nodes and links used to genera
 
 ## Nodes
 
-| DATA | DESCRIPTION | IDENTIFIER | TOTAL | SOURCE |
-|------------------|--------------------------- ----------------------------------|-------------- -------------------------------------------|------- -|-----------|
-| Diseases (dis.tsv) | Data regarding diseases, including your name and identifier | Unified Medical Language System (UMLS) Concept Unique Identifiers (CUI) | 30,731 | UMLS |
-| Genes (gen.tsv) | Data relating to genes, including their symbol and identifier | National Center of Biotechnology Information (NCBI) Identifiers | 20,610 | NCBI |
-| Proteins (prot.tsv) | Data relating to proteins, including their identifier | Accession number in UniProt | 18,521 | UniProt |
-| Drugs (dru.tsv) | Data relating to drugs, including your name and identifier | ChEMBL Identifier | 3,944 | ChEMBL |
+| DATA | DESCRIPTION | IDENTIFIER | TOTAL | SOURCE |
+|-------------------------|-----------------------------------------------------------------|-------------------------------------------------|--------|---------|
+| Diseases (dis.tsv) | Data regarding diseases, including their name and identifier | Unified Medical Language System (UMLS) Concept Unique Identifiers (CUI) | 30,731 | UMLS |
+| Genes (gen.tsv) | Data relating to genes, including their symbol and identifier | National Center of Biotechnology Information (NCBI) Identifiers | 20,610 | NCBI |
+| Proteins (prot.tsv) | Data relating to proteins, including their identifier | Accession number in UniProt | 18,521 | UniProt |
+| Drugs (dru.tsv) | Data relating to drugs, including their name and identifier | ChEMBL Identifier | 3,944 | ChEMBL |
 
 ## Links
 
-| DATA | DESCRIPTION | IDENTIFIER | TOTAL | SOURCE |
-|-----------------------|------------------------ -------------------------------------------------- ---------------|---------------------------------- --------------------|---------|------------------- -------------------------------------------------- ----|
-| Disease – Drug (dis_dru_the.tsv) | Associations between diseases and drugs used for their treatment | UMLS CUI – ChEMBL Identifier | 52,179 | Comparative Toxicogenomics Database (CTD) |
-| Disease – Gene (dis_gen.tsv) | Associations between diseases and genes whose mutation triggers the disease | UMLS CUI – NCBI Identifier | 358,209 | DisGeNET |
-| Disease – Protein (dis_prot.tsv) | Associations between diseases and proteins produced from their pathological genes | UMLS CUI – Accession number in UniProt | 361,325 | DisGeNET |
-| Gene – Protein (gen_pro.tsv) | Associations between genes and proteins produced from the gene | NCBI Identifier – Accession Number in UniProt | 15,770 | DisGeNET |
-| Protein – Protein (pro_pro.tsv) | Associations between proteins that physically interact with each other | Accession number in UniProt – Accession number in UniProt | 439,863 | DisGeNET |
-| Drug – Protein (dru_pro.tsv) | Associations between drugs and the target proteins they affect | ChEMBL identifier – Accession number in UniProt | 5,946 | ChEMBL and DrugBank |
-| Disease – Symptom (dse_sym.tsv) | Associations between diseases and the symptoms they develop | UMLS CUI – UMLS Concept Unique Identifiers | 318,550 | ChEMBL and (Side Effect Resource) SIDER |
-
+| DATA | DESCRIPTION | IDENTIFIER | TOTAL | SOURCE |
+|--------------------------------|------------------------------------------------------------------------------------|----------------------------------------------------------------|--------|----------------------------------------------------|
+| Disease – Drug (dis_dru_the.tsv) | Associations between diseases and drugs used for their treatment | UMLS CUI – ChEMBL Identifier | 52,179 | Comparative Toxicogenomics Database (CTD) |
+| Disease – Gene (dis_gen.tsv) | Associations between diseases and genes whose mutation triggers the disease | UMLS CUI – NCBI Identifier | 358,209| DisGeNET |
+| Disease – Protein (dis_prot.tsv) | Associations between diseases and proteins produced from their pathological genes | UMLS CUI – Accession number in UniProt | 361,325| DisGeNET |
+| Gene – Protein (gen_pro.tsv) | Associations between genes and proteins produced from the gene | NCBI Identifier – Accession Number in UniProt | 15,770 | DisGeNET |
+| Protein – Protein (pro_pro.tsv) | Associations between proteins that physically interact with each other | Accession number in UniProt – Accession number in UniProt | 439,863| DisGeNET |
+| Drug – Protein (dru_pro.tsv) | Associations between drugs and the target proteins they affect | ChEMBL identifier – Accession number in UniProt | 5,946 | ChEMBL and DrugBank |