Update README.md

Commit 666fe6e1 authored Jul 11, 2024 by Joaquin Torres

Showing with 31 additions and 22 deletions

README.md README.md +31 -22

--- a/README.md
+++ b/README.md
@@ -35,28 +35,37 @@ These approaches resulted in multiple training datasets. However, to ensure a fa

 This repository is organized as follows:

-* [EDA](./EDA):
-  * [results](./EDA/results): Plots about feature distributions, correlations and importance.
-  * [EDA.ipynb](./EDA/EDA.ipynb): Exploring and filtering data, handling missing values, encoding variables, building the final pre- and post- pandemic datasets, and generating plots for feature distributions, correlations and importance.
-* [gen_train_data](./gen_train_data):
-  * [gen_train_data.ipynb](./gen_train_data/gen_train_data.ipynb): Generating training and testing datasets for each of the pipelines.
-* [model_building](./model_building):
-  * [hyperparam_tuning.py](./model_building/hyperparam_tuning.py): Tuning models through a random search of hyperparameters.
-  * [cv_metric_gen.py](./model_building/cv_metric_gen.py): Generating cross-validation metrics and plots for each of the tuned models.
-  * [cv_metrics_distr.py](./model_building/cv_metrics_distr.py): Generating boxplots for each cross-validation metric and tuned model.
-  * [test_models.py](./model_building/test_models.py): Testing tuned models with test dataset.
-  * [fit_final_models.py](./model_building/fit_final_models.py): Saving fitted model for each selected final model.
-  * [results](./model_building/results):
-    * [hyperparam](./model_building/output/hyperparam): Excel file containing the optimal hyperparameters for each model in each pipeline.
-    * [cv_metrics](./model_building/output/cv_metrics): Material related to the results of cross-validation: scores, ROC and Precision-Recall curves and boxplots for each metric.
-    * [testing](./model_building/output/testing): Material related to the results of testing the tuned models: scores, ROC and Precision-Recall curves and confusion matrices.
-    * [fitted_models](./model_building/output/fitted_models): Final selected trained models.
-* [explainability](./explainability):
-  * [compute_shap_vals.py](./explainability/compute_shap_vals.py): Computing SHAP values for final models.
-  * [compute_shap_inter_vals.py](./explainability/compute_shap_inter_vals.py): Computing SHAP interaction values for final models.
-  * [shap_plots.py](./explainability/shap_plots.py): Generating SHAP summary plots for the SHAP and SHAP interaction values computed. Comparing major differences between pre- and post-pandemic groups.
-  * [results](./explainability/results): SHAP and SHAP interaction summary plots.
-
+* [01-EDA](./01-EDA):
+  * [EDA.ipynb](./01-EDA/EDA.ipynb): Exploring and filtering data, handling missing values, encoding variables, building the final pre- and post-pandemic datasets, and generating plots for feature distributions, correlations and importance.
+  * [results](./01-EDA/results): 
+    * [feature_names](./01-EDA/results/feature_names): Names of the selected individual and social variables.
+    * [plots](./01-EDA/results/plots):
+        * [correlations](./01-EDA/results/plots/correlations): Heatmaps to visualize the pairwise correlations between features.
+        * [distributions](./01-EDA/results/plots/distributions): Statistical plots to visualize the distribution of features.
+        * [feature_importance](./01-EDA/results/plots/feature_importance): Plots to show the importance of each feature in predicting outcomes.
+* [02-training_data_generation](./02-training_data_generation):
+  * [training_data_generation.ipynb](./02-training_data_generation/training_data_generation.ipynb): Generating training and testing datasets for each of the pipelines.
+* [03-model_building](./03-model_building):
+  * [hyperparameter_tuning.py](./03-model_building/hyperparameter_tuning.py): Tuning models through a random search of hyperparameters.
+  * [cv_metric_generation.py](./03-model_building/cv_metric_generation.py): Generating cross-validation metrics and plots for each of the tuned models.
+  * [cv_metric_distribution.py](./03-model_building/cv_metric_distribution.py): Generating boxplots for each cross-validation metric and tuned model.
+  * [models_testing.py](./03-model_building/models_testing.py): Testing tuned models on the test dataset.
+  * [models_final_fitting.py](./03-model_building/models_final_fitting.py): Saving the fitted model for each selected final model.
+  * [results](./03-model_building/results):
+    * [hyperparam](./03-model_building/output/hyperparam): Excel file containing the optimal hyperparameters for each model in each pipeline.
+    * [cv_metrics](./03-model_building/output/cv_metrics): Material related to the results of cross-validation: scores, ROC and Precision-Recall curves and boxplots for each metric.
+    * [testing](./03-model_building/output/testing): Material related to the results of testing the tuned models: scores, ROC and Precision-Recall curves and confusion matrices.
+    * [fitted_models](./03-model_building/output/fitted_models): Final selected trained models.
+* [04-explainability](./04-explainability):
+  * [shap_vals_computation.py](./04-explainability/shap_vals_computation.py): Computing SHAP values for final models.
+  * [shap_inter_vals_computation.py](./04-explainability/shap_inter_vals_computation.py): Computing SHAP interaction values for final models.
+  * [shap_plots.ipynb](./04-explainability/shap_plots.ipynb): Generating SHAP summary plots for the SHAP and SHAP interaction values computed. Comparing major differences between pre- and post-pandemic groups.
+  * [results](./04-explainability/results):
+    * [plots](./04-explainability/plots): SHAP summary and summary interaction plots, as well as heatmaps of interaction differences between groups.
+        * [shap_summary](./04-explainability/plots/shap_summary): SHAP summary plots.
+        * [shap_inter_summary](./04-explainability/plots/shap_inter_summary): SHAP summary interaction plots.
+        * [heatmaps_interactions](./04-explainability/plots/heatmaps_interactions): Heatmaps representing the differences in interactions between pre-pandemic and post-pandemic groups.
+        
 ## Contact

 For any inquiry you can contact: