Renamed model_selection to model_building

Commit d18d5981 authored Jul 10, 2024 by Joaquin Torres
43 changed files
--- a/README.md
+++ b/README.md
@@ -40,17 +40,17 @@ This repository is organized as follows:
  * [EDA.ipynb](./EDA/EDA.ipynb): Exploring and filtering data, handling missing values, encoding variables, building the final pre- and post- pandemic datasets, and generating plots for feature distributions, correlations and importance.
 * [gen_train_data](./gen_train_data):
  * [gen_train_data.ipynb](./gen_train_data/gen_train_data.ipynb): Generating training and testing datasets for each of the pipelines.
-* [model_selection](./model_selection):
-  * [hyperparam_tuning.py](./model_selection/hyperparam_tuning.py): Tuning models through a random search of hyperparameters.
-  * [cv_metric_gen.py](./model_selection/cv_metric_gen.py): Generating cross-validation metrics and plots for each of the tuned models.
-  * [cv_metrics_distr.py](./model_selection/cv_metrics_distr.py): Generating boxplots for each cross-validation metric and tuned model.
-  * [test_models.py](./model_selection/test_models.py): Testing tuned models with test dataset.
-  * [fit_final_models.py](./model_selection/fit_final_models.py): Saving fitted model for each selected final model.
-  * [results](./model_selection/results):
-    * [hyperparam](./model_selection/output/hyperparam): Excel file containing the optimal hyperparameters for each model in each pipeline.
-    * [cv_metrics](./model_selection/output/cv_metrics): Material related to the results of cross-validation: scores, ROC and Precision-Recall curves and boxplots for each metric.
-    * [testing](./model_selection/output/testing): Material related to the results of testing the tuned models: scores, ROC and Precision-Recall curves and confusion matrices.
-    * [fitted_models](./model_selection/output/fitted_models): Final selected trained models.
+* [model_building](./model_building):
+  * [hyperparam_tuning.py](./model_building/hyperparam_tuning.py): Tuning models through a random search of hyperparameters.
+  * [cv_metric_gen.py](./model_building/cv_metric_gen.py): Generating cross-validation metrics and plots for each of the tuned models.
+  * [cv_metrics_distr.py](./model_building/cv_metrics_distr.py): Generating boxplots for each cross-validation metric and tuned model.
+  * [test_models.py](./model_building/test_models.py): Testing tuned models with test dataset.
+  * [fit_final_models.py](./model_building/fit_final_models.py): Saving fitted model for each selected final model.
+  * [results](./model_building/results):
+    * [hyperparam](./model_building/output/hyperparam): Excel file containing the optimal hyperparameters for each model in each pipeline.
+    * [cv_metrics](./model_building/output/cv_metrics): Material related to the results of cross-validation: scores, ROC and Precision-Recall curves and boxplots for each metric.
+    * [testing](./model_building/output/testing): Material related to the results of testing the tuned models: scores, ROC and Precision-Recall curves and confusion matrices.
+    * [fitted_models](./model_building/output/fitted_models): Final selected trained models.
 * [explainability](./explainability):
  * [compute_shap_vals.py](./explainability/compute_shap_vals.py): Computing SHAP values for final models.
  * [compute_shap_inter_vals.py](./explainability/compute_shap_inter_vals.py): Computing SHAP interaction values for final models.

--- a/explainability/compute_shap_inter_vals.py
+++ b/explainability/compute_shap_inter_vals.py
@@ -70,7 +70,7 @@ if __name__ == "__main__":
            print(f"{group}-{method_names[j]}")
            method_name = method_names[j]
            model_name = model_choices[method_name]
-            model_path = f"../model_selection/results/fitted_models/{group}_{method_names[j]}_{model_name}.pkl"
+            model_path = f"../model_building/results/fitted_models/{group}_{method_names[j]}_{model_name}.pkl"
            # Load the fitted model from disk
            with open(model_path, 'rb') as file:
                fitted_model = pickle.load(file)

--- a/explainability/compute_shap_vals.py
+++ b/explainability/compute_shap_vals.py
@@ -70,7 +70,7 @@ if __name__ == "__main__":
            print(f"{group}-{method_names[j]}")
            method_name = method_names[j]
            model_name = model_choices[method_name]
-            model_path = f"../model_selection/results/fitted_models/{group}_{method_names[j]}_{model_name}.pkl"
+            model_path = f"../model_building/results/fitted_models/{group}_{method_names[j]}_{model_name}.pkl"
            # Load the fitted model from disk
            with open(model_path, 'rb') as file:
                fitted_model = pickle.load(file)
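
Both SHAP scripts hard-code the same relative path, which is why the rename had to touch each of them. One way to confine a future rename to a single line is sketched below. The constant `MODEL_DIR` and the helper `fitted_model_path` are hypothetical names introduced here for illustration and do not exist in the repository. The file name pattern mirrors the f-string in the diff.

```python
from pathlib import Path

# Hypothetical constant: the only place that names the results directory.
# Like the original f-string, it is relative to the current working directory.
MODEL_DIR = Path("..") / "model_building" / "results" / "fitted_models"


def fitted_model_path(group: str, method_name: str, model_name: str) -> Path:
    """Build the path of a pickled model, e.g. pre_ORIG_XGB.pkl."""
    return MODEL_DIR / f"{group}_{method_name}_{model_name}.pkl"
```

In the scripts above, the call would be `fitted_model_path(group, method_name, model_name)`, and the result can be passed to `open(..., 'rb')` unchanged.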

--- a/model_selection/cv_metric_distr.py
+++ b/model_building/cv_metric_distr.py
--- a/model_selection/cv_metric_gen.py
+++ b/model_building/cv_metric_gen.py
--- a/model_selection/fit_final_models.py
+++ b/model_building/fit_final_models.py
--- a/model_selection/hyperparam_tuning.py
+++ b/model_building/hyperparam_tuning.py
--- a/model_selection/results/cv_metrics/curves/post_ORIG.svg
+++ b/model_building/results/cv_metrics/curves/post_ORIG.svg
--- a/model_selection/results/cv_metrics/curves/post_ORIG_CW.svg
+++ b/model_building/results/cv_metrics/curves/post_ORIG_CW.svg
--- a/model_selection/results/cv_metrics/curves/post_OVER.svg
+++ b/model_building/results/cv_metrics/curves/post_OVER.svg
--- a/model_selection/results/cv_metrics/curves/post_UNDER.svg
+++ b/model_building/results/cv_metrics/curves/post_UNDER.svg
--- a/model_selection/results/cv_metrics/curves/pre_ORIG.svg
+++ b/model_building/results/cv_metrics/curves/pre_ORIG.svg
--- a/model_selection/results/cv_metrics/curves/pre_ORIG_CW.svg
+++ b/model_building/results/cv_metrics/curves/pre_ORIG_CW.svg
--- a/model_selection/results/cv_metrics/curves/pre_OVER.svg
+++ b/model_building/results/cv_metrics/curves/pre_OVER.svg
--- a/model_selection/results/cv_metrics/curves/pre_UNDER.svg
+++ b/model_building/results/cv_metrics/curves/pre_UNDER.svg
--- a/model_selection/results/cv_metrics/distributions/post_ORIG.svg
+++ b/model_building/results/cv_metrics/distributions/post_ORIG.svg
--- a/model_selection/results/cv_metrics/distributions/post_ORIG_CW.svg
+++ b/model_building/results/cv_metrics/distributions/post_ORIG_CW.svg
--- a/model_selection/results/cv_metrics/distributions/post_OVER.svg
+++ b/model_building/results/cv_metrics/distributions/post_OVER.svg
--- a/model_selection/results/cv_metrics/distributions/post_UNDER.svg
+++ b/model_building/results/cv_metrics/distributions/post_UNDER.svg
--- a/model_selection/results/cv_metrics/distributions/pre_ORIG.svg
+++ b/model_building/results/cv_metrics/distributions/pre_ORIG.svg
--- a/model_selection/results/cv_metrics/distributions/pre_ORIG_CW.svg
+++ b/model_building/results/cv_metrics/distributions/pre_ORIG_CW.svg
--- a/model_selection/results/cv_metrics/distributions/pre_OVER.svg
+++ b/model_building/results/cv_metrics/distributions/pre_OVER.svg
--- a/model_selection/results/cv_metrics/distributions/pre_UNDER.svg
+++ b/model_building/results/cv_metrics/distributions/pre_UNDER.svg
--- a/model_selection/results/cv_metrics/metrics.xlsx
+++ b/model_building/results/cv_metrics/metrics.xlsx
--- a/model_selection/results/fitted_models/post_ORIG_CW_RF.pkl
+++ b/model_building/results/fitted_models/post_ORIG_CW_RF.pkl
--- a/model_selection/results/fitted_models/post_ORIG_XGB.pkl
+++ b/model_building/results/fitted_models/post_ORIG_XGB.pkl
--- a/model_selection/results/fitted_models/post_OVER_XGB.pkl
+++ b/model_building/results/fitted_models/post_OVER_XGB.pkl
--- a/model_selection/results/fitted_models/post_UNDER_XGB.pkl
+++ b/model_building/results/fitted_models/post_UNDER_XGB.pkl
--- a/model_selection/results/fitted_models/pre_ORIG_CW_RF.pkl
+++ b/model_building/results/fitted_models/pre_ORIG_CW_RF.pkl
--- a/model_selection/results/fitted_models/pre_ORIG_XGB.pkl
+++ b/model_building/results/fitted_models/pre_ORIG_XGB.pkl
--- a/model_selection/results/fitted_models/pre_OVER_XGB.pkl
+++ b/model_building/results/fitted_models/pre_OVER_XGB.pkl
--- a/model_selection/results/fitted_models/pre_UNDER_XGB.pkl
+++ b/model_building/results/fitted_models/pre_UNDER_XGB.pkl
--- a/model_selection/results/hyperparam/hyperparamers.xlsx
+++ b/model_building/results/hyperparam/hyperparamers.xlsx
--- a/model_selection/results/testing/plots/post_ORIG.svg
+++ b/model_building/results/testing/plots/post_ORIG.svg
--- a/model_selection/results/testing/plots/post_ORIG_CW.svg
+++ b/model_building/results/testing/plots/post_ORIG_CW.svg
--- a/model_selection/results/testing/plots/post_OVER.svg
+++ b/model_building/results/testing/plots/post_OVER.svg
--- a/model_selection/results/testing/plots/post_UNDER.svg
+++ b/model_building/results/testing/plots/post_UNDER.svg
--- a/model_selection/results/testing/plots/pre_ORIG.svg
+++ b/model_building/results/testing/plots/pre_ORIG.svg
--- a/model_selection/results/testing/plots/pre_ORIG_CW.svg
+++ b/model_building/results/testing/plots/pre_ORIG_CW.svg
--- a/model_selection/results/testing/plots/pre_OVER.svg
+++ b/model_building/results/testing/plots/pre_OVER.svg
--- a/model_selection/results/testing/plots/pre_UNDER.svg
+++ b/model_building/results/testing/plots/pre_UNDER.svg
--- a/model_selection/results/testing/testing_tuned_models.xlsx
+++ b/model_building/results/testing/testing_tuned_models.xlsx
--- a/model_selection/test_models.py
+++ b/model_building/test_models.py
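
A rename that spans this many files can leave stale references to the old directory in scripts, notebooks, or documentation. A minimal sketch of a check for that follows. The function `find_stale_references` and its parameters are hypothetical and not part of this repository. It searches for the old name with a trailing slash, because a bare `model_selection` would also match unrelated identifiers such as scikit-learn's `sklearn.model_selection` module, should any script import it.

```python
from pathlib import Path


def find_stale_references(root, old_name="model_selection/",
                          suffixes=(".py", ".md", ".ipynb")):
    """Return (path, line_number, line) for each line that still mentions old_name."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular text files with the chosen extensions.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                hits.append((path, lineno, line.strip()))
    return hits
```

Running `find_stale_references(".")` from the repository root lists any remaining paths to update. An empty list suggests the scanned file types no longer point at the old directory.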