Commit a6f7364c authored by Belen Otero Carrasco

code and data

parent a198770c
disease_id drug_id
0 C0023418 CHEMBL109
1 C0346629 CHEMBL118
2 C0006142 CHEMBL118
3 C0002395 CHEMBL1201581
4 C0024623 CHEMBL1201585
5 C0019693 CHEMBL129
6 C0009319 CHEMBL1370
7 C0028754 CHEMBL1419
8 C0006142 CHEMBL1431
9 C0376358 CHEMBL1431
10 C0919267 CHEMBL1434
11 C1140680 CHEMBL1434
12 C0020538 CHEMBL1509
13 C0003873 CHEMBL1535
14 C0002395 CHEMBL1581
15 C0007134 CHEMBL1908360
16 C0279702 CHEMBL1908360
17 C0006142 CHEMBL34259
18 C0003873 CHEMBL34259
19 C0006826 CHEMBL364713
20 C0041671 CHEMBL405
21 C0030297 CHEMBL535
22 C0006826 CHEMBL584
23 C0030567 CHEMBL589
24 C0002395 CHEMBL659
25 C0020538 CHEMBL779
26 C0376358 CHEMBL779
27 C0238198 CHEMBL941
28 C0376358 CHEMBL960
29 C0011570 CHEMBL1175
30 C0003873 CHEMBL118
31 C0003873 CHEMBL1201572
32 C0006142 CHEMBL125
33 C0006826 CHEMBL129
34 C0003873 CHEMBL1366
35 C0011570 CHEMBL1419
36 C0011849 CHEMBL1431
37 C0020538 CHEMBL1581
38 C0020538 CHEMBL1737
39 C0011570 CHEMBL2110900
40 C0014544 CHEMBL220492
41 C0011570 CHEMBL259209
42 C0020538 CHEMBL267936
43 C0020538 CHEMBL317094
44 C0006826 CHEMBL34259
45 C0011570 CHEMBL41
46 C0006826 CHEMBL428647
47 C0030567 CHEMBL53
48 C0238198 CHEMBL535
49 C0007134 CHEMBL535
50 C0279702 CHEMBL535
51 C0020538 CHEMBL589
52 C0020538 CHEMBL597
53 C0030567 CHEMBL641
54 C0007222 CHEMBL779
55 C0020538 CHEMBL802
56 C0006142 CHEMBL81
57 C0376358 CHEMBL81
58 C0011570 CHEMBL894
59 C0023473 CHEMBL941
60 C0003873 CHEMBL960
disease_id drug_id pathway_id
C0002395 CHEMBL502 1
C0002892 CHEMBL2103737 1
C0002892 CHEMBL2110563 1
C0020538 CHEMBL1014 2
C0020538 CHEMBL1017 2
C0020538 CHEMBL1069 2
C0020538 CHEMBL1165 2
C0020538 CHEMBL1168 2
C0020538 CHEMBL1200692 2
C0020538 CHEMBL1237 2
C0020538 CHEMBL1393 1
C0020538 CHEMBL1480 1
C0020538 CHEMBL1513 2
C0020538 CHEMBL1519 2
C0020538 CHEMBL1560 2
C0020538 CHEMBL1581 2
C0020538 CHEMBL1592 2
C0020538 CHEMBL1639 2
C0020538 CHEMBL191 2
C0020538 CHEMBL3039598 2
C0020538 CHEMBL431 2
C0020538 CHEMBL515606 2
C0020538 CHEMBL577 2
C0020538 CHEMBL578 2
C0020538 CHEMBL611 1
C0020538 CHEMBL813 2
C0020538 CHEMBL838 2
C0025202 CHEMBL1229517 1
C0026769 CHEMBL1434 1
C0026769 CHEMBL25 1
C0030567 CHEMBL1175 1
C0030567 CHEMBL1201203 1
C0030567 CHEMBL1201236 1
C0030567 CHEMBL1373 1
C0030567 CHEMBL502 1
C0030567 CHEMBL59 1
C0030567 CHEMBL887 1
C0033578 CHEMBL806 1
C0035579 CHEMBL1040 1
C0035579 CHEMBL1042 1
C0035579 CHEMBL1536 1
C0035579 CHEMBL2356023 1
C0035579 CHEMBL846 1
C0376358 CHEMBL1082407 1
C0376358 CHEMBL254328 1
C0376358 CHEMBL3183409 1
disease_id drug_id pathway_id gene_id
0 C0020538 CHEMBL1581 WP554 1636
1 C0020538 CHEMBL1581 WP4756 1636
disease_id drug_id pathway_id gene_id
0 C0020538 CHEMBL578 WP554 1636
1 C0020538 CHEMBL578 WP4756 1636
2 C0020538 CHEMBL577 WP554 1636
3 C0020538 CHEMBL577 WP4756 1636
4 C0020538 CHEMBL813 WP554 185
5 C0020538 CHEMBL813 WP4756 185
6 C0020538 CHEMBL1480 WP554 4306
7 C0020538 CHEMBL1639 WP554 5972
8 C0020538 CHEMBL1639 WP4756 5972
9 C0033578 CHEMBL806 WP3981 367
10 C0020538 CHEMBL3039598 WP554 1636
11 C0020538 CHEMBL3039598 WP4756 1636
12 C0020538 CHEMBL1513 WP554 185
13 C0020538 CHEMBL1513 WP4756 185
14 C0030567 CHEMBL59 WP2371 6531
15 C0020538 CHEMBL1237 WP554 1636
16 C0020538 CHEMBL1237 WP554 5972
17 C0020538 CHEMBL1237 WP4756 1636
18 C0020538 CHEMBL1237 WP4756 5972
19 C0020538 CHEMBL191 WP554 185
20 C0020538 CHEMBL191 WP4756 185
21 C0020538 CHEMBL1165 WP554 1636
22 C0020538 CHEMBL1165 WP4756 1636
23 C0020538 CHEMBL1200692 WP554 185
24 C0020538 CHEMBL1200692 WP4756 185
25 C0020538 CHEMBL1581 WP554 1636
26 C0020538 CHEMBL1581 WP4756 1636
27 C0020538 CHEMBL1592 WP554 1636
28 C0020538 CHEMBL1592 WP4756 1636
29 C0020538 CHEMBL1168 WP554 1636
30 C0020538 CHEMBL1168 WP4756 1636
31 C0020538 CHEMBL431 WP554 1636
32 C0020538 CHEMBL431 WP4756 1636
33 C0020538 CHEMBL1393 WP554 1585
34 C0020538 CHEMBL1393 WP554 4306
35 C0020538 CHEMBL1017 WP554 185
36 C0020538 CHEMBL1017 WP4756 185
37 C0020538 CHEMBL1519 WP554 1636
38 C0020538 CHEMBL1519 WP4756 1636
39 C0020538 CHEMBL1069 WP554 185
40 C0020538 CHEMBL1069 WP4756 185
41 C0035579 CHEMBL1042 WP1531 7421
42 C0020538 CHEMBL838 WP554 1636
43 C0020538 CHEMBL838 WP4756 1636
44 C0030567 CHEMBL1201203 WP2371 6531
45 C0030567 CHEMBL887 WP2355 596
46 C0020538 CHEMBL611 WP554 7040
47 C0376358 CHEMBL254328 WP3981 367
48 C0002892 CHEMBL2110563 WP1533 4548
49 C0002892 CHEMBL2103737 WP1533 4548
50 C0376358 CHEMBL1082407 WP3981 367
51 C0035579 CHEMBL1040 WP1531 7421
52 C0020538 CHEMBL1014 WP554 185
53 C0020538 CHEMBL1014 WP4756 185
54 C0020538 CHEMBL1560 WP554 1636
55 C0020538 CHEMBL1560 WP4756 1636
56 C0030567 CHEMBL1201236 WP2371 1644
57 C0376358 CHEMBL3183409 WP3981 367
58 C0020538 CHEMBL515606 WP554 1636
59 C0020538 CHEMBL515606 WP4756 1636
60 C0002395 CHEMBL502 WP2355 4790
61 C0030567 CHEMBL502 WP2355 4790
62 C0035579 CHEMBL1536 WP1531 7421
63 C0035579 CHEMBL2356023 WP1531 7421
64 C0035579 CHEMBL846 WP1531 7421
65 C0026769 CHEMBL25 WP673 4609
66 C0026769 CHEMBL25 WP673 595
67 C0026769 CHEMBL25 WP673 7157
68 C0026769 CHEMBL1434 WP673 5601
69 C0025202 CHEMBL1229517 WP4685 673
70 C0030567 CHEMBL1373 WP2371 6531
71 C0030567 CHEMBL1175 WP2371 6531
disease_id gene_id score drug_id
575 C0002395 1636 0.6 CHEMBL1581
drug_id disease_id gene_id score
0 CHEMBL1513 C0004238 185 0.03
1 CHEMBL1513 C0004238 185 0.03
2 CHEMBL1513 C0011881 185 0.1
3 CHEMBL1513 C0011881 185 0.1
4 CHEMBL1168 C0004238 1636 0.4
5 CHEMBL1168 C0004238 1636 0.4
6 CHEMBL191 C0011881 185 0.1
7 CHEMBL191 C0011881 185 0.1
8 CHEMBL1560 C0011881 1636 0.4
9 CHEMBL1560 C0011881 1636 0.4
10 CHEMBL1560 C0011881 4313 0.03
11 CHEMBL1560 C0011881 4313 0.03
12 CHEMBL1560 C0011881 4318 0.08
13 CHEMBL1560 C0011881 4318 0.08
14 CHEMBL59 C0020649 1812 0.3
15 CHEMBL59 C0020649 1813 0.3
16 CHEMBL1434 C0010674 836 0.02
17 CHEMBL1434 C0010674 3553 0.1
18 CHEMBL1434 C0011581 3553 0.4
19 CHEMBL1434 C0010674 4318 0.03
20 CHEMBL1434 C0010674 4843 0.1
21 CHEMBL1434 C0011581 4843 0.33
22 CHEMBL1434 C0010674 7422 0.03
23 CHEMBL1434 C0011581 7422 0.37
24 CHEMBL1393 C0020428 1585 0.4
25 CHEMBL1393 C0020428 1585 0.4
26 CHEMBL1393 C0020428 1586 0.13
27 CHEMBL1393 C0020428 1586 0.13
28 CHEMBL1393 C0020428 2908 0.2
29 CHEMBL1393 C0020428 2908 0.2
30 CHEMBL1393 C0020428 4306 0.12
31 CHEMBL1393 C0020428 4306 0.12
32 CHEMBL1017 C0020473 5468 0.05
33 CHEMBL1017 C0020473 5468 0.05
34 CHEMBL1042 C0004096 7421 0.1
35 CHEMBL1042 C0009324 7421 0.03
36 CHEMBL1042 C0010346 7421 0.08
37 CHEMBL1042 C0011849 7421 0.1
38 CHEMBL1042 C0020598 7421 0.16
39 CHEMBL1042 C0024141 7421 0.06
40 CHEMBL1042 C0026769 7421 0.4
41 CHEMBL1042 C0028754 7421 0.1
42 CHEMBL1042 C0029456 7421 0.7
43 CHEMBL1042 C0029458 7421 0.1
44 CHEMBL1040 C0020598 7421 0.16
45 CHEMBL1536 C0020598 7421 0.16
46 CHEMBL1536 C0029456 7421 0.7
47 CHEMBL1536 C0029458 7421 0.1
48 CHEMBL2356023 C0020598 7421 0.16
49 CHEMBL846 C0020598 7421 0.16
50 CHEMBL502 C0026769 590 0.3
51 CHEMBL502 C0002395 590 0.4
52 CHEMBL502 C0026769 3553 0.4
53 CHEMBL502 C0026769 3553 0.4
54 CHEMBL502 C0002395 3553 0.6
55 CHEMBL502 C0002395 3553 0.6
56 CHEMBL502 C0026769 4790 0.03
57 CHEMBL502 C0002395 4790 0.09
58 CHEMBL502 C0002395 43 0.4
59 CHEMBL502 C0002395 3356 0.1
60 CHEMBL502 C0002395 4842 0.06
61 CHEMBL502 C0002395 4842 0.06
62 CHEMBL25 C0003873 595 0.02
63 CHEMBL25 C0003873 595 0.02
64 CHEMBL25 C0003873 595 0.02
65 CHEMBL25 C0026764 595 0.7
66 CHEMBL25 C0026764 595 0.7
67 CHEMBL25 C0026764 595 0.7
68 CHEMBL25 C0003873 834 0.03
69 CHEMBL25 C0003873 834 0.03
70 CHEMBL25 C0003873 834 0.03
71 CHEMBL25 C0003873 834 0.03
72 CHEMBL25 C0003873 834 0.03
73 CHEMBL25 C0003873 834 0.03
74 CHEMBL25 C0004153 834 0.03
75 CHEMBL25 C0004153 834 0.03
76 CHEMBL25 C0004153 834 0.03
77 CHEMBL25 C0004153 834 0.03
78 CHEMBL25 C0004153 834 0.03
79 CHEMBL25 C0004153 834 0.03
80 CHEMBL25 C0003873 836 0.05
81 CHEMBL25 C0003873 836 0.05
82 CHEMBL25 C0003873 836 0.05
83 CHEMBL25 C0003873 836 0.05
84 CHEMBL25 C0003873 836 0.05
85 CHEMBL25 C0003873 836 0.05
86 CHEMBL25 C0004153 836 0.03
87 CHEMBL25 C0004153 836 0.03
88 CHEMBL25 C0004153 836 0.03
89 CHEMBL25 C0004153 836 0.03
90 CHEMBL25 C0004153 836 0.03
91 CHEMBL25 C0004153 836 0.03
92 CHEMBL25 C0026764 836 0.1
93 CHEMBL25 C0026764 836 0.1
94 CHEMBL25 C0026764 836 0.1
95 CHEMBL25 C0026764 836 0.1
96 CHEMBL25 C0026764 836 0.1
97 CHEMBL25 C0026764 836 0.1
98 CHEMBL25 C0003873 1909 0.03
99 CHEMBL25 C0003873 1909 0.03
100 CHEMBL25 C0003873 1909 0.03
101 CHEMBL25 C0004153 1909 0.03
102 CHEMBL25 C0004153 1909 0.03
103 CHEMBL25 C0004153 1909 0.03
104 CHEMBL25 C0026764 1909 0.02
105 CHEMBL25 C0026764 1909 0.02
106 CHEMBL25 C0026764 1909 0.02
107 CHEMBL25 C0003873 4792 0.02
108 CHEMBL25 C0003873 4792 0.02
109 CHEMBL25 C0003873 4792 0.02
110 CHEMBL25 C0026764 4792 0.04
111 CHEMBL25 C0026764 4792 0.04
112 CHEMBL25 C0026764 4792 0.04
113 CHEMBL25 C0003873 5111 0.03
114 CHEMBL25 C0003873 5111 0.03
115 CHEMBL25 C0003873 5111 0.03
116 CHEMBL25 C0026764 5111 0.08
117 CHEMBL25 C0026764 5111 0.08
118 CHEMBL25 C0026764 5111 0.08
119 CHEMBL25 C0003873 5742 0.34
120 CHEMBL25 C0003873 5742 0.34
121 CHEMBL25 C0003873 5742 0.34
122 CHEMBL25 C0004153 5742 0.02
123 CHEMBL25 C0004153 5742 0.02
124 CHEMBL25 C0004153 5742 0.02
125 CHEMBL25 C0026764 5742 0.02
126 CHEMBL25 C0026764 5742 0.02
127 CHEMBL25 C0026764 5742 0.02
128 CHEMBL25 C0003873 5743 0.4
129 CHEMBL25 C0003873 5743 0.4
130 CHEMBL25 C0003873 5743 0.4
131 CHEMBL25 C0004153 5743 0.4
132 CHEMBL25 C0004153 5743 0.4
133 CHEMBL25 C0004153 5743 0.4
134 CHEMBL25 C0026764 5743 0.06
135 CHEMBL25 C0026764 5743 0.06
136 CHEMBL25 C0026764 5743 0.06
137 CHEMBL25 C0003873 7157 0.1
138 CHEMBL25 C0003873 7157 0.1
139 CHEMBL25 C0003873 7157 0.1
140 CHEMBL25 C0003873 7157 0.1
141 CHEMBL25 C0003873 7157 0.1
142 CHEMBL25 C0003873 7157 0.1
143 CHEMBL25 C0004153 7157 0.1
144 CHEMBL25 C0004153 7157 0.1
145 CHEMBL25 C0004153 7157 0.1
146 CHEMBL25 C0004153 7157 0.1
147 CHEMBL25 C0004153 7157 0.1
148 CHEMBL25 C0004153 7157 0.1
149 CHEMBL25 C0004604 7157 0.1
150 CHEMBL25 C0004604 7157 0.1
151 CHEMBL25 C0004604 7157 0.1
152 CHEMBL25 C0004604 7157 0.1
153 CHEMBL25 C0004604 7157 0.1
154 CHEMBL25 C0004604 7157 0.1
155 CHEMBL25 C0026764 7157 0.2
156 CHEMBL25 C0026764 7157 0.2
157 CHEMBL25 C0026764 7157 0.2
158 CHEMBL25 C0026764 7157 0.2
159 CHEMBL25 C0026764 7157 0.2
160 CHEMBL25 C0026764 7157 0.2
161 CHEMBL25 C0004153 4609 0.02
162 CHEMBL25 C0004153 4609 0.02
163 CHEMBL25 C0004153 4609 0.02
164 CHEMBL25 C0026764 4609 0.4
165 CHEMBL25 C0026764 4609 0.4
166 CHEMBL25 C0026764 4609 0.4
167 CHEMBL25 C0026764 3309 0.03
168 CHEMBL25 C0026764 3309 0.03
169 CHEMBL25 C0026764 3309 0.03
170 CHEMBL25 C0026764 6197 0.02
171 CHEMBL25 C0026764 6197 0.02
172 CHEMBL25 C0026764 6197 0.02
173 CHEMBL1373 C0030193 6531 0.02
174 CHEMBL1175 C0016053 6532 0.04
drug_id disease_id gene_id score
0 CHEMBL59 C0242422 6531 0.2
1 CHEMBL1201203 C0242422 6531 0.2
2 CHEMBL1201236 C0242422 1644 0.31
3 CHEMBL1237 C0027051 5972 0.4
4 CHEMBL1237 C0027051 5972 0.4
5 CHEMBL1237 C0027051 5972 0.4
6 CHEMBL1237 C0027051 5972 0.4
7 CHEMBL25 C0027051 834 0.02
8 CHEMBL25 C0027051 834 0.02
9 CHEMBL25 C0027051 834 0.02
10 CHEMBL25 C0027051 834 0.02
11 CHEMBL25 C0027051 834 0.02
12 CHEMBL25 C0027051 834 0.02
13 CHEMBL25 C0029408 834 0.02
14 CHEMBL25 C0029408 834 0.02
15 CHEMBL25 C0029408 834 0.02
16 CHEMBL25 C0029408 834 0.02
17 CHEMBL25 C0029408 834 0.02
18 CHEMBL25 C0029408 834 0.02
19 CHEMBL25 C0027051 836 0.32
20 CHEMBL25 C0027051 836 0.32
21 CHEMBL25 C0027051 836 0.32
22 CHEMBL25 C0027051 836 0.32
23 CHEMBL25 C0027051 836 0.32
24 CHEMBL25 C0027051 836 0.32
25 CHEMBL25 C0029408 836 0.05
26 CHEMBL25 C0029408 836 0.05
27 CHEMBL25 C0029408 836 0.05
28 CHEMBL25 C0029408 836 0.05
29 CHEMBL25 C0029408 836 0.05
30 CHEMBL25 C0029408 836 0.05
31 CHEMBL25 C0027051 1909 0.21
32 CHEMBL25 C0027051 1909 0.21
33 CHEMBL25 C0027051 1909 0.21
34 CHEMBL25 C0030193 1909 0.04
35 CHEMBL25 C0030193 1909 0.04
36 CHEMBL25 C0030193 1909 0.04
37 CHEMBL25 C0149931 1909 0.34
38 CHEMBL25 C0149931 1909 0.34
39 CHEMBL25 C0149931 1909 0.34
40 CHEMBL25 C0027051 3309 0.3
41 CHEMBL25 C0027051 3309 0.3
42 CHEMBL25 C0027051 3309 0.3
43 CHEMBL25 C0029408 3309 0.02
44 CHEMBL25 C0029408 3309 0.02
45 CHEMBL25 C0029408 3309 0.02
46 CHEMBL25 C0027051 5742 0.02
47 CHEMBL25 C0027051 5742 0.02
48 CHEMBL25 C0027051 5742 0.02
49 CHEMBL25 C0030193 5742 0.03
50 CHEMBL25 C0030193 5742 0.03
51 CHEMBL25 C0030193 5742 0.03
52 CHEMBL25 C0027051 5743 0.1
53 CHEMBL25 C0027051 5743 0.1
54 CHEMBL25 C0027051 5743 0.1
55 CHEMBL25 C0029408 5743 0.1
56 CHEMBL25 C0029408 5743 0.1
57 CHEMBL25 C0029408 5743 0.1
58 CHEMBL25 C0030193 5743 0.1
59 CHEMBL25 C0030193 5743 0.1
60 CHEMBL25 C0030193 5743 0.1
61 CHEMBL25 C0948089 5743 0.02
62 CHEMBL25 C0948089 5743 0.02
63 CHEMBL25 C0948089 5743 0.02
64 CHEMBL25 C0027051 7157 0.33
65 CHEMBL25 C0027051 7157 0.33
66 CHEMBL25 C0027051 7157 0.33
67 CHEMBL25 C0027051 7157 0.33
68 CHEMBL25 C0027051 7157 0.33
69 CHEMBL25 C0027051 7157 0.33
70 CHEMBL25 C0029408 7157 0.03
71 CHEMBL25 C0029408 7157 0.03
72 CHEMBL25 C0029408 7157 0.03
73 CHEMBL25 C0029408 7157 0.03
74 CHEMBL25 C0029408 7157 0.03
75 CHEMBL25 C0029408 7157 0.03
76 CHEMBL25 C0030193 7157 0.1
77 CHEMBL25 C0030193 7157 0.1
78 CHEMBL25 C0030193 7157 0.1
79 CHEMBL25 C0030193 7157 0.1
80 CHEMBL25 C0030193 7157 0.1
81 CHEMBL25 C0030193 7157 0.1
82 CHEMBL25 C0032463 7157 0.03
83 CHEMBL25 C0032463 7157 0.03
84 CHEMBL25 C0032463 7157 0.03
85 CHEMBL25 C0032463 7157 0.03
86 CHEMBL25 C0032463 7157 0.03
87 CHEMBL25 C0032463 7157 0.03
88 CHEMBL1581 C0028754 1636 0.3
89 CHEMBL1581 C0028754 1636 0.3
90 CHEMBL1168 C0038454 1636 0.4
91 CHEMBL1168 C0038454 1636 0.4
92 CHEMBL1017 C0038454 185 0.08
93 CHEMBL1017 C0038454 185 0.08
94 CHEMBL1017 C0038454 5468 0.04
95 CHEMBL1017 C0038454 5468 0.04
96 CHEMBL1042 C0036341 7421 0.03
97 CHEMBL1042 C0042870 7421 0.4
98 CHEMBL502 C0036341 43 0.3
99 CHEMBL502 C0036341 43 0.3
100 CHEMBL502 C0030567 43 0.03
101 CHEMBL502 C0497327 43 0.05
102 CHEMBL502 C0497327 43 0.05
103 CHEMBL502 C0036341 3356 0.4
104 CHEMBL502 C0036341 3356 0.4
105 CHEMBL502 C0036341 3553 0.4
106 CHEMBL502 C0036341 3553 0.4
107 CHEMBL502 C0036341 3553 0.4
108 CHEMBL502 C0036341 3553 0.4
109 CHEMBL502 C0026769 3553 0.4
110 CHEMBL502 C0026769 3553 0.4
111 CHEMBL502 C0030567 3553 0.28
112 CHEMBL502 C0030567 3553 0.28
113 CHEMBL502 C0497327 3553 0.09
114 CHEMBL502 C0497327 3553 0.09
115 CHEMBL502 C0497327 3553 0.09
116 CHEMBL502 C0497327 3553 0.09
117 CHEMBL502 C0036341 4790 0.1
118 CHEMBL502 C0036341 4790 0.1
119 CHEMBL502 C0026769 4790 0.03
120 CHEMBL502 C0030567 4790 0.03
121 CHEMBL502 C0036341 4842 0.5
122 CHEMBL502 C0036341 4842 0.5
123 CHEMBL502 C0036341 4842 0.5
124 CHEMBL502 C0036341 4842 0.5
125 CHEMBL502 C0030567 4842 0.38
126 CHEMBL502 C0030567 4842 0.38
127 CHEMBL1373 C0036341 6531 0.4
128 CHEMBL1373 C1269683 6531 0.04
129 CHEMBL1175 C0036341 6530 0.34
130 CHEMBL1175 C1269683 6530 0.4
131 CHEMBL1175 C0036341 6531 0.4
132 CHEMBL1175 C0497327 6531 0.33
133 CHEMBL1175 C1269683 6531 0.04
134 CHEMBL1175 C0036341 6532 0.4
135 CHEMBL1175 C0497327 6532 0.02
136 CHEMBL1175 C1269683 6532 0.6
137 CHEMBL1536 C0042870 7421 0.4
138 CHEMBL846 C0042870 7421 0.4
139 CHEMBL846 C0085682 7421 0.13
140 CHEMBL2110563 C0042847 4548 0.22
141 CHEMBL2103737 C0042847 4548 0.22
142 CHEMBL1040 C0085682 7421 0.13
143 CHEMBL2356023 C0085682 7421 0.13
144 CHEMBL502 C0026769 590 0.3
145 CHEMBL502 C0030567 590 0.02
146 CHEMBL502 C0497327 590 0.06
147 CHEMBL502 C0497327 590 0.06
148 CHEMBL25 C0029408 1645 0.3
149 CHEMBL25 C0029408 1645 0.3
150 CHEMBL25 C0029408 1645 0.3
151 CHEMBL25 C0030193 4792 0.02
152 CHEMBL25 C0030193 4792 0.02
153 CHEMBL25 C0030193 4792 0.02
154 CHEMBL1434 C0031099 4318 0.35
155 CHEMBL1434 C0032285 4318 0.06
156 CHEMBL1434 C0031099 7422 0.02
157 CHEMBL1434 C0032285 7422 0.04
158 CHEMBL1434 C0042029 7422 0.03
159 CHEMBL1434 C0032285 240 0.02
160 CHEMBL1434 C0032285 834 0.32
161 CHEMBL1434 C0032285 3553 0.6
162 CHEMBL1434 C0042029 4843 0.3
163 CHEMBL1229517 C0026764 673 0.49
disease_PwB;drug_id;disease_no_PwB
C0020538;CHEMBL578;C0018802
C0020538;CHEMBL578;C0018802
C0020538;CHEMBL577;C0018802
C0020538;CHEMBL577;C0018802
C0020538;CHEMBL1513;C0004238
C0020538;CHEMBL1513;C0011881
C0020538;CHEMBL1513;C0004238
C0020538;CHEMBL1513;C0011881
C0030567;CHEMBL59;C0001206
C0030567;CHEMBL59;C0020649
C0030567;CHEMBL59;C0024586
C0020538;CHEMBL191;C0010674
C0020538;CHEMBL191;C0011881
C0020538;CHEMBL191;C0010674
C0020538;CHEMBL191;C0011881
C0020538;CHEMBL1168;C0004238
C0020538;CHEMBL1168;C0004238
C0020538;CHEMBL1393;C0003962
C0020538;CHEMBL1393;C0013604
C0020538;CHEMBL1393;C0020428
C0020538;CHEMBL1393;C0003962
C0020538;CHEMBL1393;C0013604
C0020538;CHEMBL1393;C0020428
C0020538;CHEMBL1017;C0020473
C0020538;CHEMBL1017;C0020473
C0020538;CHEMBL1069;C0018801
C0020538;CHEMBL1069;C0018801
C0035579;CHEMBL1042;C0004096
C0035579;CHEMBL1042;C0009324
C0035579;CHEMBL1042;C0010346
C0035579;CHEMBL1042;C0011849
C0035579;CHEMBL1042;C0020598
C0035579;CHEMBL1042;C0020626
C0035579;CHEMBL1042;C0024141
C0035579;CHEMBL1042;C0026769
C0035579;CHEMBL1042;C0028754
C0035579;CHEMBL1042;C0029456
C0035579;CHEMBL1042;C0029458
C0030567;CHEMBL1201203;C0015371
C0035579;CHEMBL1040;C0020598
C0035579;CHEMBL1040;C0035086
C0020538;CHEMBL1560;C0011881
C0020538;CHEMBL1560;C0011881
C0030567;CHEMBL502;C0002395
C0030567;CHEMBL502;C0026769
C0035579;CHEMBL1536;C0020598
C0035579;CHEMBL1536;C0020626
C0035579;CHEMBL1536;C0029456
C0035579;CHEMBL1536;C0029458
C0035579;CHEMBL2356023;C0020598
C0035579;CHEMBL2356023;C0020626
C0035579;CHEMBL846;C0020598
C0035579;CHEMBL846;C0020626
C0035579;CHEMBL846;C0035086
C0026769;CHEMBL25;C0003862
C0026769;CHEMBL25;C0003873
C0026769;CHEMBL25;C0004153
C0026769;CHEMBL25;C0004604
C0026769;CHEMBL25;C0009443
C0026769;CHEMBL25;C0026764
C0026769;CHEMBL25;C0003862
C0026769;CHEMBL25;C0003873
C0026769;CHEMBL25;C0004153
C0026769;CHEMBL25;C0004604
C0026769;CHEMBL25;C0009443
C0026769;CHEMBL25;C0026764
C0026769;CHEMBL25;C0003862
C0026769;CHEMBL25;C0003873
C0026769;CHEMBL25;C0004153
C0026769;CHEMBL25;C0004604
C0026769;CHEMBL25;C0009443
C0026769;CHEMBL25;C0026764
C0026769;CHEMBL1434;C0001144
C0026769;CHEMBL1434;C0001261
C0026769;CHEMBL1434;C0003175
C0026769;CHEMBL1434;C0006277
C0026769;CHEMBL1434;C0006309
C0026769;CHEMBL1434;C0010674
C0026769;CHEMBL1434;C0011581
C0026769;CHEMBL1434;C0018081
C0026769;CHEMBL1434;C0023860
C0030567;CHEMBL1373;C0027404
C0030567;CHEMBL1373;C0030193
C0030567;CHEMBL1175;C0016053
C0030567;CHEMBL59;C0184567
C0030567;CHEMBL59;C0242422
C0030567;CHEMBL59;C0600177
C0030567;CHEMBL59;C1621958
C0020538;CHEMBL1237;C0027051
C0020538;CHEMBL1237;C0027051
C0020538;CHEMBL1237;C0027051
C0020538;CHEMBL1237;C0027051
C0020538;CHEMBL1581;C0028754
C0020538;CHEMBL1581;C0028754
C0020538;CHEMBL1168;C0038454
C0020538;CHEMBL1168;C0038454
C0020538;CHEMBL1017;C0038454
C0020538;CHEMBL1017;C0038454
C0035579;CHEMBL1042;C0036337
C0035579;CHEMBL1042;C0036341
C0035579;CHEMBL1042;C0042870
C0030567;CHEMBL1201203;C0242422
C0020538;CHEMBL611;C1739363
C0002892;CHEMBL2110563;C0042847
C0002892;CHEMBL2110563;C0162316
C0002892;CHEMBL2103737;C0006114
C0002892;CHEMBL2103737;C0033860
C0002892;CHEMBL2103737;C0042847
C0035579;CHEMBL1040;C0085682
C0030567;CHEMBL1201236;C0184567
C0030567;CHEMBL1201236;C0242422
C0002395;CHEMBL502;C0026769
C0002395;CHEMBL502;C0030567
C0002395;CHEMBL502;C0036341
C0002395;CHEMBL502;C0497327
C0030567;CHEMBL502;C0036341
C0030567;CHEMBL502;C0497327
C0035579;CHEMBL1536;C0042870
C0035579;CHEMBL1536;C3536984
C0035579;CHEMBL2356023;C0039621
C0035579;CHEMBL2356023;C0085682
C0035579;CHEMBL846;C0042870
C0035579;CHEMBL846;C0085682
C0035579;CHEMBL846;C1527383
C0026769;CHEMBL25;C0027051
C0026769;CHEMBL25;C0029408
C0026769;CHEMBL25;C0030193
C0026769;CHEMBL25;C0032463
C0026769;CHEMBL25;C0040460
C0026769;CHEMBL25;C0149931
C0026769;CHEMBL25;C0393735
C0026769;CHEMBL25;C0948089
C0026769;CHEMBL25;C0027051
C0026769;CHEMBL25;C0029408
C0026769;CHEMBL25;C0030193
C0026769;CHEMBL25;C0032463
C0026769;CHEMBL25;C0040460
C0026769;CHEMBL25;C0149931
C0026769;CHEMBL25;C0393735
C0026769;CHEMBL25;C0948089
C0026769;CHEMBL25;C0027051
C0026769;CHEMBL25;C0029408
C0026769;CHEMBL25;C0030193
C0026769;CHEMBL25;C0032463
C0026769;CHEMBL25;C0040460
C0026769;CHEMBL25;C0149931
C0026769;CHEMBL25;C0393735
C0026769;CHEMBL25;C0948089
C0026769;CHEMBL1434;C0031099
C0026769;CHEMBL1434;C0031350
C0026769;CHEMBL1434;C0032064
C0026769;CHEMBL1434;C0032285
C0026769;CHEMBL1434;C0034362
C0026769;CHEMBL1434;C0035854
C0026769;CHEMBL1434;C0037199
C0026769;CHEMBL1434;C0039128
C0026769;CHEMBL1434;C0042029
C0025202;CHEMBL1229517;C0026764
C0030567;CHEMBL1373;C0036341
C0030567;CHEMBL1373;C1269683
C0030567;CHEMBL1175;C0036341
C0030567;CHEMBL1175;C0497327
C0030567;CHEMBL1175;C1269683
C0020538;CHEMBL1581;C0002395
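The semicolon-separated table above contains many verbatim repeated rows (for example, the first two data rows are identical). A minimal sketch of loading such a table and collapsing the repeats is shown below. It reads a few rows copied from the table out of an in-memory string, because the file name is not visible in this commit view; the variable names are illustrative assumptions.

```python
import io

import pandas as pd

# A few rows copied from the table above. The real file name is not shown
# in the commit view, so an in-memory string stands in for it here.
raw = """disease_PwB;drug_id;disease_no_PwB
C0020538;CHEMBL578;C0018802
C0020538;CHEMBL578;C0018802
C0020538;CHEMBL1513;C0004238
C0020538;CHEMBL1513;C0011881
C0020538;CHEMBL1513;C0004238
"""

# The table uses ';' as the field separator, unlike the tab-separated inputs.
pairs = pd.read_csv(io.StringIO(raw), sep=";")

# Collapse exact duplicate rows, as the notebooks do with drop_duplicates().
unique_pairs = pairs.drop_duplicates().reset_index(drop=True)

print(len(pairs), "rows,", len(unique_pairs), "unique")
```

Whether the repeats are meaningful (for instance, one row per supporting gene or pathway) cannot be told from the table alone, so deduplication is only appropriate when the triples themselves are the unit of interest.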
{
"cells": [
{
"cell_type": "code",
"execution_count": 3,
"id": "4d974299",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"from sqlalchemy import create_engine\n",
"import mysql.connector"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d7450e2",
"metadata": {},
"outputs": [],
"source": [
"triples_repodb = pd.read_csv(\"./Data/Input/Drug Repurposing/triples_filter_repodb_final.tsv\", sep=',')\n",
"triples_repodb = triples_repodb.drop(columns=['Unnamed: 0'])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "8f4f9844",
"metadata": {},
"outputs": [],
"source": [
"drug_gen = pd.read_csv('./Data/Input/DISNET/drug_gen.tsv', sep='\\t')\n",
"drug_gen = drug_gen.drop([\"Unnamed: 0\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "899450d6",
"metadata": {},
"outputs": [],
"source": [
"tri_gen_target= triples_repodb.merge(drug_gen, left_on = \"drug\",right_on = \"drug_id\",how= \"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b1edb7eb",
"metadata": {},
"outputs": [],
"source": [
"tri_gen_target = tri_gen_target.drop([\"drug\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "d17dd8b4",
"metadata": {},
"outputs": [],
"source": [
"tri_gen_target = tri_gen_target.rename(columns={\"disease1\": \"disease_id\"})"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "2c5af632",
"metadata": {},
"outputs": [],
"source": [
"# No rename here: the disease2 column must keep its name, because the second dis_gen merge below joins on it."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f8f0a4e7",
"metadata": {},
"outputs": [],
"source": [
"dis_gen = pd.read_csv('./Data/Input/DISNET/dis_gen.tsv', sep='\\t')\n",
"dis_gen = dis_gen.drop([\"Unnamed: 0\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "31a63d57",
"metadata": {},
"outputs": [],
"source": [
"dis_gen_new = dis_gen.rename(columns={\"disease_id\": \"disease2\"})"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "9db0ad6c",
"metadata": {},
"outputs": [],
"source": [
"disease_gen_one = tri_gen_target.merge(dis_gen, on = [\"gene_id\",\"disease_id\"],how=\"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "0cdd6380",
"metadata": {},
"outputs": [],
"source": [
"disease_gen_one = disease_gen_one.merge(dis_gen_new, on = [\"gene_id\",\"disease2\"],how=\"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "3e3fb7f9",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege = disease_gen_one.drop([\"score_x\",\"score_y\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "12980000",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege = triples_final_drege.drop_duplicates()"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "95d6a637",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege = triples_final_drege.drop([\"gene_id\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "55455401",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege = triples_final_drege.drop_duplicates()"
]
},
{
"cell_type": "markdown",
"id": "6e7cd640",
"metadata": {},
"source": [
"# CSBJ"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "32cdd921",
"metadata": {},
"outputs": [],
"source": [
"triplets_csbj = pd.read_excel(\"./Data/Input/DISNET/triplets_chembl_disnet.xlsx\",engine='openpyxl')\n",
"triplets_csbj =triplets_csbj.drop([\"Unnamed: 0\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "868f7e2c",
"metadata": {},
"outputs": [],
"source": [
"tri_gen_target_csbj= triplets_csbj.merge(drug_gen, on = \"drug_id\",how= \"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "c6bad999",
"metadata": {},
"outputs": [],
"source": [
"tri_gen_target_csbj = tri_gen_target_csbj.rename(columns={\"Original Condition CUI\": \"disease_id\"})"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "d86d80dd",
"metadata": {},
"outputs": [],
"source": [
"disease_gen_one_csbj = tri_gen_target_csbj.merge(dis_gen, on = [\"gene_id\",\"disease_id\"],how=\"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "54fed905",
"metadata": {},
"outputs": [],
"source": [
"dis_gen_new = dis_gen.rename(columns={\"disease_id\": \"New Condition CUI\"})"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "174647b5",
"metadata": {},
"outputs": [],
"source": [
"disease_gen_one_csbj_twounion = disease_gen_one_csbj.merge(dis_gen_new, on = [\"gene_id\",\"New Condition CUI\"],how=\"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "b8ee708d",
"metadata": {},
"outputs": [],
"source": [
"triplets_csbj_drege = disease_gen_one_csbj_twounion[[\"disease_id\",\"New Condition CUI\",\"drug_id\"]]"
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "8453abae",
"metadata": {},
"outputs": [],
"source": [
"triplets_csbj_drege =triplets_csbj_drege.drop_duplicates()"
]
},
{
"cell_type": "markdown",
"id": "ad972954",
"metadata": {},
"source": [
"# FINAL TRIPLES DREGE"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "aad768b8",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege = triples_final_drege.rename(columns={\"disease2\": \"New Condition CUI\"})"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "e6837711",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege_repo_csbj = pd.concat([triplets_csbj_drege,triples_final_drege])"
]
},
{
"cell_type": "code",
"execution_count": 70,
"id": "3e675cd1",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege_repo_csbj = triples_final_drege_repo_csbj.drop_duplicates()"
]
},
{
"cell_type": "code",
"execution_count": 72,
"id": "75d2b323",
"metadata": {},
"outputs": [],
"source": [
"triples_final_drege_repo_csbj.to_excel(\"triples_final_drege_repo_csbj.xlsx\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
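The notebook above narrows repurposing triples to those in which a gene targeted by the drug is associated with both the original and the new disease. The following is a minimal sketch of that two-step join on toy data, not the author's exact code: the helper name `filter_triples_by_shared_gene` and all toy rows are assumptions made for illustration, while the column names (`disease1`, `disease2`, `drug`, `drug_id`, `gene_id`, `disease_id`) mirror the ones used in the notebook.

```python
import pandas as pd


def filter_triples_by_shared_gene(triples, drug_gen, dis_gen):
    """Keep (disease1, disease2, drug) triples where a gene targeted by the
    drug is associated with both diseases. Hypothetical helper for illustration."""
    # Attach the target genes of each drug.
    with_targets = triples.merge(
        drug_gen, left_on="drug", right_on="drug_id", how="inner"
    ).drop(columns=["drug"])

    # The gene must be associated with the original disease.
    first = dis_gen.rename(columns={"disease_id": "disease1"})
    step_one = with_targets.merge(
        first[["disease1", "gene_id"]], on=["gene_id", "disease1"], how="inner"
    )

    # The same gene must also be associated with the new disease.
    second = dis_gen.rename(columns={"disease_id": "disease2"})
    step_two = step_one.merge(
        second[["disease2", "gene_id"]], on=["gene_id", "disease2"], how="inner"
    )

    # Drop the gene column and collapse triples supported by several genes.
    return (
        step_two[["disease1", "disease2", "drug_id"]]
        .drop_duplicates()
        .reset_index(drop=True)
    )


# Toy inputs, made up for the example.
triples = pd.DataFrame({
    "disease1": ["D1", "D1"],
    "disease2": ["D2", "D3"],
    "drug": ["DRUG_A", "DRUG_A"],
})
drug_gen = pd.DataFrame({"drug_id": ["DRUG_A", "DRUG_A"], "gene_id": [10, 11]})
dis_gen = pd.DataFrame({
    "disease_id": ["D1", "D2", "D1", "D3"],
    "gene_id": [10, 10, 11, 12],
    "score": [0.4, 0.3, 0.1, 0.2],
})

result = filter_triples_by_shared_gene(triples, drug_gen, dis_gen)
print(result)
```

Selecting only the key columns from `dis_gen` before each merge avoids the `score_x` / `score_y` suffix columns that the notebook drops afterwards.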
@@ -22,7 +22,7 @@
"metadata": {},
"outputs": [],
"source": [
-"triples_repodb = pd.read_csv(\"triples_filter_repodb_final.tsv\", sep='\\,')\n",
+"triples_repodb = pd.read_csv(\"./Data/Input/Drug Repurposing/triples_filter_repodb_final.tsv\", sep='\\,')\n",
"triples_repodb = triples_repodb.drop(columns=['Unnamed: 0'])"
]
},
@@ -33,7 +33,7 @@
"metadata": {},
"outputs": [],
"source": [
-"dis_path_direct = pd.read_csv('disease_pathway.tsv', sep='\\t')"
+"dis_path_direct = pd.read_csv('./Data/Input/DISNET/disease_pathway.tsv', sep='\\t')"
]
},
{
@@ -63,7 +63,7 @@
"metadata": {},
"outputs": [],
"source": [
-"drug_gen = pd.read_csv('drug_gen.tsv', sep='\\t')\n",
+"drug_gen = pd.read_csv('./Data/Input/DISNET/drug_gen.tsv', sep='\\t')\n",
"drug_gen = drug_gen.drop([\"Unnamed: 0\"],axis=1)"
]
},
@@ -114,7 +114,7 @@
"metadata": {},
"outputs": [],
"source": [
-"dis_gen = pd.read_csv('dis_gen.tsv', sep='\\t')\n",
+"dis_gen = pd.read_csv('./Data/Input/DISNET/dis_gen.tsv', sep='\\t')\n",
"dis_gen = dis_gen.drop([\"Unnamed: 0\"],axis=1)"
]
},
@@ -185,7 +185,7 @@
"metadata": {},
"outputs": [],
"source": [
-"drug_gen_pw = pd.read_csv(\"drug_gen_pw.tsv\", sep='\\t')\n",
+"drug_gen_pw = pd.read_csv(\"./Data/Input/DISNET/drug_gen_pw.tsv\", sep='\\t')\n",
"drug_gen_pw = drug_gen_pw.drop([\"Unnamed: 0\"],axis=1)"
]
},
@@ -325,7 +325,7 @@
"metadata": {},
"outputs": [],
"source": [
-"triplets_csbj = pd.read_excel(\"triplets_chembl_disnet.xlsx\",engine='openpyxl')\n",
+"triplets_csbj = pd.read_excel(\"./Data/Input/DISNET/triplets_chembl_disnet.xlsx\",engine='openpyxl')\n",
"triplets_csbj =triplets_csbj.drop([\"Unnamed: 0\"],axis=1)"
]
},
@@ -22,7 +22,7 @@
"metadata": {},
"outputs": [],
"source": [
-"triples_repodb = pd.read_csv(\"triples_filter_repodb_final.tsv\", sep='\\,')\n",
+"triples_repodb = pd.read_csv(\"./Data/Input/Drug Repurposing/triples_filter_repodb_final.tsv\", sep='\\,')\n",
"triples_repodb = triples_repodb.drop(columns=['Unnamed: 0'])"
]
},
@@ -33,7 +33,7 @@
"metadata": {},
"outputs": [],
"source": [
-"drug_gen = pd.read_csv('drug_gen.tsv', sep='\\t')\n",
+"drug_gen = pd.read_csv('./Data/Input/DISNET/drug_gen.tsv', sep='\\t')\n",
"drug_gen = drug_gen.drop([\"Unnamed: 0\"],axis=1)"
]
},
@@ -84,7 +84,7 @@
"metadata": {},
"outputs": [],
"source": [
-"dis_gen = pd.read_csv('dis_gen.tsv', sep='\\t')\n",
+"dis_gen = pd.read_csv('./Data/Input/DISNET/dis_gen.tsv', sep='\\t')\n",
"dis_gen = dis_gen.drop([\"Unnamed: 0\"],axis=1)"
]
},
@@ -173,7 +173,7 @@
"metadata": {},
"outputs": [],
"source": [
-"triplets_csbj = pd.read_excel(\"triplets_chembl_disnet.xlsx\",engine='openpyxl')\n",
+"triplets_csbj = pd.read_excel(\"./Data/Input/DISNET/triplets_chembl_disnet.xlsx\",engine='openpyxl')\n",
"triplets_csbj =triplets_csbj.drop([\"Unnamed: 0\"],axis=1)"
]
},
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"from sqlalchemy import create_engine\n",
"from sklearn import preprocessing\n",
"import mysql.connector\n",
"from pandas import DataFrame"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"triplets_total = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total = triplets_total.drop(columns=['Unnamed: 0'])"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"triplets_total = triplets_total.drop_duplicates()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"disease_one = triplets_total[[\"Original Condition CUI\"]]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"disease_one = disease_one.rename(columns={\"Original Condition CUI\": \"disease_id\"})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# genes"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"genes = pd.read_excel((\"./Data/Input/DISNET/dis_gen.xlsx\"),engine='openpyxl')\n",
"genes = genes.drop([\"Unnamed: 0\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dis_gene = disease_one.merge(genes, how= \"inner\", on= \"disease_id\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"dis_gen_num_gen = dis_gene.groupby(['disease_id']).count()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"dis_gen_num_gen = dis_gen_num_gen.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 23.000000\n",
"mean 105.130435\n",
"std 149.308194\n",
"min 2.000000\n",
"25% 17.500000\n",
"50% 37.000000\n",
"75% 91.500000\n",
"max 526.000000\n",
"Name: gene_id, dtype: float64"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dis_gen_num_gen[\"gene_id\"].describe()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"#### pw via gene"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pw_gene = pd.read_csv('./Data/Input/DISNET/pathways_genes.tsv', sep='\\t')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dis_pw_gen = disease_one.merge(pw_gene, how= \"inner\", on= \"disease_id\")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"dis_pw_gen = dis_pw_gen.groupby(['disease_id']).count()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"dis_pw_gen = dis_pw_gen.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"391.04347826086956"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dis_pw_gen[\"pathway_id\"].mean()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"#### pw direct"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"dis_path_direct = pd.read_csv('./Data/Input/DISNET/disease_pathway.tsv', sep='\\t')"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"dr_pw = disease_one.merge(dis_path_direct,on=\"disease_id\",how= \"inner\")"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"dr_pw = dr_pw.groupby(['disease_id']).count()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"dr_pw = dr_pw.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"nan"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dr_pw[\"pathway_id\"].mean()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"### drug"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"drug = pd.read_csv('./Data/Input/DISNET/drugs.tsv', sep='\\t')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dis_drug = disease_one.merge(drug, how= \"inner\", on= \"disease_id\")"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"dis_drug = dis_drug.groupby(['disease_id']).count()"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"dis_drug = dis_drug.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"384.6"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dis_drug[\"drug_id\"].mean()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"### symptom"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sint = pd.read_csv('./Data/Input/DISNET/sint_all.tsv', sep='\\t')\n",
"sint = sint.drop([\"Unnamed: 0\"],axis=1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dis_sint = disease_one.merge(sint, how= \"inner\", on= \"disease_id\")"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"dis_sint = dis_sint.groupby(['disease_id']).count()"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"dis_sint = dis_sint.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"68.6086956521739"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dis_sint[\"symptom\"].mean()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
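The notebook above repeats the same steps for genes, pathways, drugs, and symptoms: an inner merge on `disease_id`, `groupby(...).count()`, `reset_index()`, then a summary statistic. A minimal sketch of how those steps might be folded into one helper follows. The function name `links_per_disease` and the toy frames are hypothetical and do not appear in the commit. The sketch also drops duplicate disease IDs before merging, which is an assumption about the intended result (the notebook merges `disease_one` as-is, so a disease that occurs in several triples has its rows multiplied). Finally, it shows that an inner merge with no shared IDs produces an empty frame whose mean is `NaN`, which is one possible explanation for the `nan` output of the direct-pathway cell.

```python
import math

import pandas as pd


def links_per_disease(diseases: pd.DataFrame, links: pd.DataFrame, value_col: str) -> pd.Series:
    """Count rows of `links` per disease, restricted to the diseases of interest.

    Hypothetical helper: mirrors the merge -> groupby -> count steps of the notebook,
    but drops duplicate disease IDs first (an assumption about the intended result).
    """
    unique_diseases = diseases[["disease_id"]].drop_duplicates()
    merged = unique_diseases.merge(links, how="inner", on="disease_id")
    return merged.groupby("disease_id")[value_col].count()


# Toy data, not taken from the DISNET files.
disease_one = pd.DataFrame({"disease_id": ["C1", "C1", "C2"]})
genes = pd.DataFrame({"disease_id": ["C1", "C1", "C2", "C3"], "gene_id": [10, 11, 12, 13]})
no_overlap = pd.DataFrame({"disease_id": ["C9"], "pathway_id": ["P1"]})

gene_counts = links_per_disease(disease_one, genes, "gene_id")
mean_genes = gene_counts.mean()

# No shared disease IDs: the inner merge is empty, so the mean is NaN.
empty_counts = links_per_disease(disease_one, no_overlap, "pathway_id")
mean_pathways = empty_counts.mean()

print(gene_counts.to_dict(), mean_genes, math.isnan(mean_pathways))
```

Checking `len(empty_counts)` before calling `.mean()` is a cheap way to tell "no overlap between the two ID sets" apart from a genuine numeric result.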
......@@ -20,7 +20,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drebiop = pd.read_excel('triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = triplets_total_drebiop.drop(columns=['Unnamed: 0'])"
]
},
......@@ -68,7 +68,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drege = pd.read_excel((\"triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = pd.read_excel((\"./Data/Input/DISNET/triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = triplets_total_drege.drop(columns=['Unnamed: 0'])"
]
},
......@@ -96,7 +96,7 @@
"metadata": {},
"outputs": [],
"source": [
"type_drug =pd.read_excel(\"Drug_Categories.xlsx\", engine='openpyxl')"
"type_drug =pd.read_excel(\"./Data/Input/DISNET/Drug_Categories.xlsx\", engine='openpyxl')"
]
},
{
......
......@@ -20,7 +20,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drege = pd.read_excel((\"triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = pd.read_excel((\"./Data/Input/DISNET/triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = triplets_total_drege.drop(columns=['Unnamed: 0'])"
]
},
......@@ -48,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drebiop = pd.read_excel('triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = triplets_total_drebiop.drop(columns=['Unnamed: 0'])"
]
},
......@@ -85,7 +85,7 @@
"metadata": {},
"outputs": [],
"source": [
"drug_atc = pd.read_csv('drug_atc.tsv', sep='\\t')\n",
"drug_atc = pd.read_csv('./Data/Input/DISNET/drug_atc.tsv', sep='\\t')\n",
"drug_atc = drug_atc.drop([\"Unnamed: 0\"],axis=1)"
]
},
......@@ -246,7 +246,7 @@
"metadata": {},
"outputs": [],
"source": [
"atc_name = pd.read_excel(\"ATC_desc_name.xlsx\")\n",
"atc_name = pd.read_excel(\"./Data/Input/DISNET/ATC_desc_name.xlsx\")\n",
"atc_name['index'] = atc_name['index'].str.strip()"
]
},
......
......@@ -69,7 +69,7 @@
"metadata": {},
"outputs": [],
"source": [
"sint = pd.read_csv('sint_all.tsv', sep='\\t')\n",
"sint = pd.read_csv('./Data/Input/DISNET/sint_all.tsv', sep='\\t')\n",
"sint = sint.drop([\"Unnamed: 0\"],axis=1)"
]
},
......@@ -275,7 +275,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total = pd.read_excel('triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total = triplets_total.drop(columns=['Unnamed: 0'])\n",
"triplets_total = triplets_total.drop_duplicates()"
]
......@@ -475,7 +475,7 @@
"metadata": {},
"outputs": [],
"source": [
"Triples_target_final = pd.read_excel(\"triples_final_drege_repo_csbj.xlsx\",engine='openpyxl')\n",
"Triples_target_final = pd.read_excel(\"./Data/Input/DISNET/triples_final_drege_repo_csbj.xlsx\",engine='openpyxl')\n",
"Triples_target_final = Triples_target_final.drop(columns=['Unnamed: 0'])\n",
"Triples_target_final = Triples_target_final.drop_duplicates()"
]
......
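Every hunk above makes the same change: a bare file name becomes `./Data/Input/DISNET/<file>`. One way to avoid repeating that prefix in each notebook would be to build the paths from a single constant. This is only a sketch: `DATA_DIR` and `data_path` are hypothetical names that do not appear in the commit, and the directory is taken from the new paths shown in the diff.

```python
from pathlib import Path

# Directory used by the updated read_excel/read_csv calls in the diff.
DATA_DIR = Path("./Data/Input/DISNET")


def data_path(filename: str) -> Path:
    """Return the location of an input file inside DATA_DIR (hypothetical helper)."""
    return DATA_DIR / filename


triples_path = data_path("triples_drebiop_final_dos.xlsx")
atc_path = data_path("drug_atc.tsv")
print(triples_path.as_posix())
print(atc_path.as_posix())

# The notebooks could then call, for example:
# pd.read_excel(data_path("triples_drebiop_final_dos.xlsx"), engine="openpyxl")
```

With this layout, moving the input files again would mean editing one line per notebook instead of every `read_excel` and `read_csv` call.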
......@@ -20,7 +20,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drebiop = pd.read_excel('triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = triplets_total_drebiop.drop(columns=['Unnamed: 0'])"
]
},
......@@ -68,7 +68,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drege = pd.read_excel((\"triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = pd.read_excel((\"./Data/Input/DISNET/triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = triplets_total_drege.drop(columns=['Unnamed: 0'])"
]
},
......@@ -96,7 +96,7 @@
"metadata": {},
"outputs": [],
"source": [
"type_drug =pd.read_excel(\"Drug_Categories.xlsx\", engine='openpyxl')"
"type_drug =pd.read_excel(\"./Data/Input/DISNET/Drug_Categories.xlsx\", engine='openpyxl')"
]
},
{
......
......@@ -20,7 +20,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drege = pd.read_excel((\"triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = pd.read_excel((\"./Data/Input/DISNET/triples_final_drege_repo_csbj.xlsx\"),engine='openpyxl')\n",
"triplets_total_drege = triplets_total_drege.drop(columns=['Unnamed: 0'])"
]
},
......@@ -48,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total_drebiop = pd.read_excel('triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total_drebiop = triplets_total_drebiop.drop(columns=['Unnamed: 0'])"
]
},
......@@ -85,7 +85,7 @@
"metadata": {},
"outputs": [],
"source": [
"drug_atc = pd.read_csv('drug_atc.tsv', sep='\\t')\n",
"drug_atc = pd.read_csv('./Data/Input/DISNET/drug_atc.tsv', sep='\\t')\n",
"drug_atc = drug_atc.drop([\"Unnamed: 0\"],axis=1)"
]
},
......@@ -246,7 +246,7 @@
"metadata": {},
"outputs": [],
"source": [
"atc_name = pd.read_excel(\"ATC_desc_name.xlsx\")\n",
"atc_name = pd.read_excel(\"./Data/Input/DISNET/ATC_desc_name.xlsx\")\n",
"atc_name['index'] = atc_name['index'].str.strip()"
]
},
......
......@@ -69,7 +69,7 @@
"metadata": {},
"outputs": [],
"source": [
"sint = pd.read_csv('sint_all.tsv', sep='\\t')\n",
"sint = pd.read_csv('./Data/Input/DISNET/sint_all.tsv', sep='\\t')\n",
"sint = sint.drop([\"Unnamed: 0\"],axis=1)"
]
},
......@@ -275,7 +275,7 @@
"metadata": {},
"outputs": [],
"source": [
"triplets_total = pd.read_excel('triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total = pd.read_excel('./Data/Input/DISNET/triples_drebiop_final_dos.xlsx',engine='openpyxl')\n",
"triplets_total = triplets_total.drop(columns=['Unnamed: 0'])\n",
"triplets_total = triplets_total.drop_duplicates()"
]
......@@ -475,7 +475,7 @@
"metadata": {},
"outputs": [],
"source": [
"Triples_target_final = pd.read_excel(\"triples_final_drege_repo_csbj.xlsx\",engine='openpyxl')\n",
"Triples_target_final = pd.read_excel(\"./Data/Input/DISNET/triples_final_drege_repo_csbj.xlsx\",engine='openpyxl')\n",
"Triples_target_final = Triples_target_final.drop(columns=['Unnamed: 0'])\n",
"Triples_target_final = Triples_target_final.drop_duplicates()"
]
......