diff --git a/README.md b/README.md
index 6e8c8486e13b367c90143b57ff80a5aa87474c70..7b7bcf598858ed425558891c604d66a56cfd0c0e 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,9 @@ [Intro]

 This repository is organized as follows:

-* [EDA](./EDA): Exploring data, handling missing values, analyzing variable distribution, filtering data, feature selection, encoding variables, and performing correlation analysis.
+* [EDA](./EDA):
+    * [output](./EDA/output): Plots about feature distributions, correlations and importance.
+    * [EDA.ipynb](./EDA/EDA.ipynb): Notebook used for exploring and filtering data, handling missing values, encoding variables, building the final pre- and post- pandemic datasets, and generating plots for feature distributions, correlations and importance.
 * [gen_train_data](./gen_train_data): Generating training and testing datasets [wait for final approach TBC].
 * [model_selection](./model_selection): Tuning models, generating metrics from cross-validation and testing, and selecting final models.
     * [hyperparam_tuning.py](./model_selection/hyperparam_tuning.py): Tunes models through a random search of hyperparameters.