Given the rapid increase in computing power and the prevalence of data-driven approaches to real-world problems, ML has become an integral part of many solutions. According to a 2019 Kaggle survey, boosting algorithms are among the three methods most preferred by data scientists. Their popularity stems from their robustness against overfitting, fast training times, and ability to handle multimodal data with a small memory footprint. In this paper, we compare four state-of-the-art gradient boosting algorithms, XGBoost, CatBoost, LightGBM, and SnapBoost, on four diverse datasets. We perform this competitive analysis on the IBM PowerAI AC922 server, a platform that offers end-users faster iteration and training than standard x86 systems. Finally, we present the accuracy and training times of all the algorithms across the four datasets. We perform the analysis using two approaches: one with only the baseline algorithms, and the other with systematic hyperparameter optimization using HyperOpt.
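To make the second approach concrete, the sketch below shows how a HyperOpt search over one of the compared boosters (XGBoost) might look. The search space, dataset, and evaluation budget are illustrative assumptions, not the configuration used in the paper:

```python
# A minimal sketch of HyperOpt-driven tuning for one booster (XGBoost).
# The search space, dataset, and max_evals are hypothetical placeholders;
# the paper does not specify its actual ranges or budget.
from hyperopt import fmin, tpe, hp, Trials
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hypothetical search space over a few common XGBoost hyperparameters.
space = {
    "max_depth": hp.choice("max_depth", list(range(3, 11))),
    "learning_rate": hp.loguniform("learning_rate", -5, 0),
    "n_estimators": hp.choice("n_estimators", [100, 200, 300, 400, 500]),
}

def objective(params):
    # HyperOpt minimizes the objective, so return negative CV accuracy.
    model = XGBClassifier(**params)
    return -cross_val_score(model, X, y, cv=5).mean()

# Tree-structured Parzen Estimator (TPE) search over the space.
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=Trials())
print(best)
```

The same pattern applies to the other three boosters by swapping in their respective estimators and parameter spaces.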