This project is part of the Full Stack ML in AWS course by AICamp, and is based on the data preparation steps outlined in the previous article on churn prediction. Let’s have a look at some definitions first.

How does the XGBoost algorithm work?

XGBoost is a variant of gradient-boosted decision trees. These algorithms build on a much simpler one, the decision tree. In a decision tree, each new data point is assigned to a predicted category by following a sequence of splits at the tree nodes. To decide where to split, the decision tree follows the path of maximum information gain, or least entropy, which means choosing the…
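To make the idea of splitting on maximum information gain concrete, here is a minimal sketch in plain Python. It illustrates the entropy-based criterion of a simple decision tree as described above, not XGBoost's own split scoring, which relies on gradient statistics. The function names and the toy tenure/churn data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


def best_threshold(values, labels):
    """Try each midpoint between sorted distinct values.

    Returns (threshold, gain) for the split with the highest gain.
    """
    best = (None, 0.0)
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        gain = information_gain(labels, left, right)
        if gain > best[1]:
            best = (t, gain)
    return best


# Hypothetical toy data: months of tenure and whether the customer churned.
tenure = [1, 2, 3, 10, 12, 15]
churned = [1, 1, 1, 0, 0, 0]

threshold, gain = best_threshold(tenure, churned)
print(threshold, gain)
```

In this toy case the best split separates the short-tenure customers from the long-tenure ones, leaving both children pure, so the gain equals the full entropy of the parent node.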

This article is a summary of one of the case studies in the Full Stack ML course by AICamp. The problem is that of predicting customer churn, which is the fraction of customers lost by a business.

Feature engineering

The first step in data preparation is to gather all your data in a single table and apply feature engineering (the set of techniques used to transform the raw data) to obtain features in a format the model can use.
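As a rough sketch of these two steps, the snippet below joins two small sources into one table and one-hot encodes a categorical column, using only plain Python. The table contents, column names, and helper functions are all hypothetical; the course's actual dataset and tooling may well differ.

```python
# Hypothetical raw sources, both keyed by customer id.
customers = {
    "c1": {"plan": "basic", "signup_year": 2019},
    "c2": {"plan": "premium", "signup_year": 2021},
}
usage = {
    "c1": {"monthly_minutes": 120.0},
    "c2": {"monthly_minutes": 340.0},
}


def build_table(customers, usage):
    """Join the two sources on customer id into one list of flat rows."""
    rows = []
    for cid, info in customers.items():
        rows.append({"customer_id": cid, **info, **usage.get(cid, {})})
    return rows


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded


table = one_hot(build_table(customers, usage), "plan")
for row in table:
    print(row)
```

One-hot encoding is only one of many possible transformations; it is used here because tree-based models need categorical values expressed as numbers.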

AWS SageMaker, in particular, expects the data in a specific format: the first column should contain the…

Despite the increasing adoption of AI and growing investment in AI software across industries, recent reports have highlighted how difficult it is for AI projects to move from proof of concept to production, with nearly 50% of projects failing to be fully operationalized. MLOps methodologies are being developed to better structure ML projects and help them succeed. There is a growing number of online resources for learning more about MLOps, and I found the Full Stack ML course, offered by AICamp, to be a great starting point. …

