# Data
This directory contains the node and link data used in this research.

## Nodes

| DATA                    | DESCRIPTION                                                     | IDENTIFIER                                      | TOTAL  | SOURCE  |
|-------------------------|-----------------------------------------------------------------|-------------------------------------------------|--------|---------|
| Diseases (dis.tsv)     | Data regarding diseases, including their name and identifier     | Unified Medical Language System (UMLS) Concept Unique Identifiers (CUI) | 30,731 | UMLS    |
| Genes (gen.tsv)        | Data relating to genes, including their symbol and identifier   | National Center for Biotechnology Information (NCBI) Identifiers | 20,610 | NCBI    |
| Proteins (prot.tsv)    | Data relating to proteins, including their identifier           | Accession number in UniProt                   | 18,521 | UniProt |
| Drugs (dru.tsv)        | Data relating to drugs, including their name and identifier     | ChEMBL Identifier                             | 3,944  | ChEMBL  |
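
The totals in the table can be checked against the files with a short script. The following is a minimal sketch, not part of the original repository: it assumes each node file is tab-separated, has a header row, and stores the identifier in the first column. `count_unique_ids`, `id_column`, and `has_header` are hypothetical names introduced here, and the sample rows are made up.

```python
import csv
import io

# Totals copied from the Nodes table above.
EXPECTED_NODE_TOTALS = {
    "dis.tsv": 30731,
    "gen.tsv": 20610,
    "prot.tsv": 18521,
    "dru.tsv": 3944,
}


def count_unique_ids(stream, id_column=0, has_header=True):
    """Count distinct identifiers in one column of a tab-separated stream.

    ASSUMPTION: the identifier column index and the presence of a header
    row are guesses about the file layout; adjust them to the real files.
    """
    reader = csv.reader(stream, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if there is one
    return len({row[id_column] for row in reader if len(row) > id_column})


# Tiny in-memory stand-in for a node file (made-up rows, not real data).
sample = io.StringIO("id\tname\nX1\tfoo\nX2\tbar\nX1\tfoo\n")
print(count_unique_ids(sample))  # prints 2
```

For a real file, open it with `open("dis.tsv", newline="", encoding="utf-8")` and compare the result with `EXPECTED_NODE_TOTALS["dis.tsv"]`. If the documented total counts rows rather than distinct identifiers, the two numbers may differ.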


## Links

| DATA                           | DESCRIPTION                                                                        | IDENTIFIER                                                     | TOTAL  | SOURCE                                             |
|--------------------------------|------------------------------------------------------------------------------------|----------------------------------------------------------------|--------|----------------------------------------------------|
| Disease – Drug (dis_dru_the.tsv)  | Associations between diseases and drugs used for their treatment                 | UMLS CUI – ChEMBL Identifier                                  | 52,179 | Comparative Toxicogenomics Database (CTD)           |
| Disease – Gene (dis_gen.tsv)      | Associations between diseases and genes whose mutation triggers the disease      | UMLS CUI – NCBI Identifier                                    | 358,209| DisGeNET                                           |
| Disease – Protein (dis_prot.tsv)  | Associations between diseases and proteins produced from their pathological genes | UMLS CUI – Accession number in UniProt                        | 361,325| DisGeNET                                           |
| Gene – Protein (gen_pro.tsv)      | Associations between genes and proteins produced from the gene                     | NCBI Identifier – Accession number in UniProt                | 15,770 | DisGeNET                                           |
| Protein – Protein (pro_pro.tsv)   | Associations between proteins that physically interact with each other              | Accession number in UniProt – Accession number in UniProt    | 439,863| DisGeNET                                           |
| Drug – Protein (dru_pro.tsv)      | Associations between drugs and the target proteins they affect                    | ChEMBL Identifier – Accession number in UniProt              | 5,946  | ChEMBL and DrugBank                               |
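
Each link file pairs two identifiers, so it can be read as an edge list. The sketch below is one possible way to load such a file into an adjacency map. It assumes a tab-separated layout with a header row and the two identifiers in the first two columns, which this README does not confirm. `load_links` and its parameters are hypothetical names, and the sample identifiers are placeholders rather than real UMLS or ChEMBL values.

```python
import csv
import io
from collections import defaultdict


def load_links(stream, source_column=0, target_column=1, has_header=True):
    """Read a tab-separated link file into {source_id: {target_id, ...}}.

    ASSUMPTION: column positions and the header row are guesses about the
    file layout; adjust them to the real files.
    """
    reader = csv.reader(stream, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if there is one
    adjacency = defaultdict(set)
    for row in reader:
        if len(row) <= max(source_column, target_column):
            continue  # ignore blank or truncated rows
        adjacency[row[source_column]].add(row[target_column])
    return adjacency


# Made-up rows standing in for a disease-drug link file.
sample = io.StringIO(
    "source\ttarget\n"
    "DISEASE_A\tDRUG_1\n"
    "DISEASE_A\tDRUG_2\n"
    "DISEASE_B\tDRUG_1\n"
)
links = load_links(sample)
print(sorted(links["DISEASE_A"]))  # prints ['DRUG_1', 'DRUG_2']
```

The map is directed from the first column to the second. For symmetric links such as protein–protein interactions, adding each pair in both directions may be more convenient.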