Case Study: Comparing ML and AI Models on Imbalanced Data
Updated: Jun 28, 2020
Comparing Machine Learning Models to AI Models Using the credit_card_fraud.csv Dataset
Early in my data science career, I worked on a publicly available dataset with imbalanced labels. My goal was to find the best approach to this problem, and now I want to share what I learned along the way. Among my initial questions were: Would ML or AI models give better results? Can a model trained on an imbalanced dataset still predict the minority class well? And how do I get the best results out of any given model?
I have made quite a few updates to this notebook over the years. As I learned new data science techniques, I applied them to this case study, so you will find examples of many techniques that are useful for any data scientist to know, including RF, LR, GBM, ANN, DNN, CNN, GridSearchCV, cross-validation, and plotly. This is the most current version of the notebook as of June 20, 2020. Please feel free to comment or share any ideas you may have.