Top Spender – Optimove

PREDICTIVE

🎯 Purpose & Use-Cases

This model predicts the probability that a customer will become a top spender, defined as a customer who ranks in the top percentile for lifetime purchase/deposit amount over the next 3 months.

VIP Program Development
Identify potential high-value customers for exclusive tier programs

Strategic Customer Investment
Enable proactive targeting to:

Nurture promising customers before they reach top-spender status
Allocate premium support and personalized experiences
Invest in relationship-building with future high-value prospects

🔍 Model Scope

Population Lifecycle Stages: Customers in all lifecycle stages except "Dormant"

🔄 Update Frequency

Training: Runs every iteration (unless data drift/changes require retraining)
Inference: Runs daily as part of the daily process

📊 Output

The following outputs are stored in the customer profiles (as internal fields):

Becoming Top Spender Score: Probability that the customer will become a top spender within the next 3 months
Percentile rank in Becoming Top Spender: Percentile rank among all customers by Becoming Top Spender Score.
On a scale of 1 (lowest) to 100 (highest)
Permille rank in Becoming Top Spender: Permille rank among all customers by Becoming Top Spender Score.
On a scale of 1 (lowest) to 1000 (highest), providing higher-resolution segmentation than the percentile ranking
Percentile rank by LCS Becoming Top Spender: Percentile rank among customers in the same lifecycle stage (LCS) by Becoming Top Spender Score.
On a scale of 1 (lowest) to 100 (highest)
Permille rank by LCS Becoming Top Spender: Permille rank among customers in the same lifecycle stage (LCS) by Becoming Top Spender Score.
On a scale of 1 (lowest) to 1000 (highest), providing higher-resolution segmentation than the percentile ranking
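
To make the rank fields above concrete, here is a minimal sketch of how a score could be mapped to a 1-100 or 1-1000 rank, either across all customers or within a lifecycle stage. The function name `rank_buckets`, the tie handling (tied scores share a rank), and the rounding rule are assumptions for illustration; the article does not describe how Optimove computes these ranks internally.

```python
from bisect import bisect_right
from collections import defaultdict

def rank_buckets(scores, buckets, groups=None):
    """Map each score to a rank in 1..buckets (hypothetical helper).

    buckets=100 gives a percentile rank, buckets=1000 a permille rank.
    If `groups` is given (e.g. lifecycle stages), ranks are computed
    separately within each group.
    """
    if groups is None:
        groups = [None] * len(scores)
    # Collect and sort the scores of each group once.
    by_group = defaultdict(list)
    for s, g in zip(scores, groups):
        by_group[g].append(s)
    ordered = {g: sorted(v) for g, v in by_group.items()}

    ranks = []
    for s, g in zip(scores, groups):
        n = len(ordered[g])
        k = bisect_right(ordered[g], s)  # number of scores <= s, so ties share a rank
        ranks.append((k * buckets + n - 1) // n)  # integer ceiling of k / n * buckets
    return ranks

scores = [0.1, 0.5, 0.9, 0.3]
stages = ["Active", "Active", "New", "New"]
percentile = rank_buckets(scores, 100)                 # rank among all customers
permille = rank_buckets(scores, 1000)                  # same ordering, finer resolution
percentile_by_lcs = rank_buckets(scores, 100, stages)  # rank within each lifecycle stage
```

With only four customers the two scales carry the same information; the permille scale only adds resolution when the population is large enough to fill more than 100 distinct ranks.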

⚙️ Model Specifications

Training Window: default = 20 iterations
Prediction Horizon: default = 6 iterations

🧠 How It Works

CatBoost classifier, a gradient-boosted decision tree model that performs strongly on tabular data
Individual models per lifecycle stage for optimal accuracy
Hyperparameter tuning during the first model training to find optimal model parameters
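
The per-lifecycle-stage setup described above can be sketched as a simple routing pattern: one model is fitted per stage, and each customer is scored by the model for their own stage. This is an illustration only. `BaseRateModel` is a deliberately trivial stand-in so the example runs without extra dependencies; it is not CatBoost, and `train_per_stage`, `score`, and the row layout are hypothetical names, not Optimove's implementation.

```python
from collections import defaultdict

EXCLUDED_STAGES = {"Dormant"}  # the model scope excludes Dormant customers

class BaseRateModel:
    """Stand-in classifier: predicts the positive rate seen during training."""

    def fit(self, X, y):
        self.rate = sum(y) / len(y)
        return self

    def predict_proba_one(self, x):
        return self.rate

def train_per_stage(rows, make_model=BaseRateModel):
    """Fit one model per lifecycle stage.

    rows: iterable of (stage, features, label) tuples, where label is 1
    if the customer became a top spender and 0 otherwise.
    """
    X_by_stage, y_by_stage = defaultdict(list), defaultdict(list)
    for stage, features, label in rows:
        if stage in EXCLUDED_STAGES:
            continue
        X_by_stage[stage].append(features)
        y_by_stage[stage].append(label)
    return {s: make_model().fit(X_by_stage[s], y_by_stage[s]) for s in X_by_stage}

def score(models, stage, features):
    """Route a customer to the model trained for their lifecycle stage."""
    return models[stage].predict_proba_one(features)

rows = [
    ("New", {"deposits": 3}, 1),
    ("New", {"deposits": 1}, 0),
    ("Active", {"deposits": 2}, 0),
    ("Active", {"deposits": 5}, 0),
    ("Dormant", {"deposits": 0}, 0),
]
models = train_per_stage(rows)
new_customer_score = score(models, "New", {"deposits": 4})
```

In a real pipeline, `make_model` would return an actual classifier with `fit` and probability-prediction methods; the stand-in only demonstrates the grouping and routing.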

✨ Data Quality & Automation

Data Cleaning & Imputation: Handles missing values, removes constant features, and groups rare categorical values

Feature Engineering & Encoding: Converts categorical variables to numerical format using target-based encoding and applies scaling transformations

Feature Selection & Optimization: Removes low-variance and highly correlated features, selecting the most predictive variables
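
As a rough illustration of three of the steps named above (removing constant features, grouping rare categorical values, and target-based encoding), here is a minimal sketch in plain Python. The helper names, the `min_count` threshold, the `"OTHER"` label, and the smoothing weight are all assumptions; the article does not state the thresholds or the exact encoding variant used.

```python
from collections import Counter, defaultdict

def drop_constant_columns(rows):
    """Remove columns whose value is identical in every row (list of dicts)."""
    keep = [c for c in rows[0] if len({r[c] for r in rows}) > 1]
    return [{c: r[c] for c in keep} for r in rows]

def group_rare_categories(values, min_count=2, other="OTHER"):
    """Replace categories seen fewer than min_count times with one label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

def target_encode(values, targets, prior_weight=1.0):
    """Encode each category as a smoothed mean of the binary target.

    prior_weight pulls small categories toward the global mean;
    prior_weight=0 gives the plain per-category mean.
    """
    global_mean = sum(targets) / len(targets)
    sums, counts = defaultdict(float), Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    encoding = {
        v: (sums[v] + prior_weight * global_mean) / (counts[v] + prior_weight)
        for v in counts
    }
    return [encoding[v] for v in values]

rows = [
    {"country": "US", "currency": "USD"},
    {"country": "UK", "currency": "USD"},
]
cleaned = drop_constant_columns(rows)                   # "currency" never varies
grouped = group_rare_categories(["a", "a", "b"])        # "b" appears only once
encoded = target_encode(["a", "a", "b", "b"], [1, 1, 0, 0])
```

Fitting the encoding on the training data only, and then applying it to new rows, avoids leaking the target into evaluation; the sketch omits that split for brevity.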

📁 Data Requirements

Historical customer profiles (customer activity data, demographic data, campaign history, etc.)
Minimum 7 iterations of historical data required for training (1 iteration for feature generation and 6 iterations for target definition, given the 6-iteration prediction horizon)
Minimum of 500 customers per lifecycle stage required for training
Minimum of 50 customers per lifecycle stage who became top spenders
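
The requirements above can be expressed as a simple pre-training check. The sketch below uses the thresholds stated in this article; the function name `check_training_requirements` and the input layout are hypothetical and only meant to show how the three minimums combine.

```python
MIN_ITERATIONS = 7                # 1 for feature generation + 6 for the target
MIN_CUSTOMERS_PER_STAGE = 500
MIN_TOP_SPENDERS_PER_STAGE = 50

def check_training_requirements(n_iterations, stage_counts):
    """Return a list of unmet requirements (empty if training can proceed).

    stage_counts: {lifecycle_stage: (n_customers, n_top_spenders)}
    """
    problems = []
    if n_iterations < MIN_ITERATIONS:
        problems.append(
            f"history: {n_iterations} iterations, need {MIN_ITERATIONS}"
        )
    for stage, (n_customers, n_top_spenders) in stage_counts.items():
        if n_customers < MIN_CUSTOMERS_PER_STAGE:
            problems.append(
                f"{stage}: {n_customers} customers, need {MIN_CUSTOMERS_PER_STAGE}"
            )
        if n_top_spenders < MIN_TOP_SPENDERS_PER_STAGE:
            problems.append(
                f"{stage}: {n_top_spenders} top spenders, need {MIN_TOP_SPENDERS_PER_STAGE}"
            )
    return problems

ok = check_training_requirements(7, {"Active": (500, 50)})
not_ok = check_training_requirements(6, {"New": (499, 50), "Active": (2000, 12)})
```

Here `ok` is empty, while `not_ok` lists three problems: too little history, too few customers in "New", and too few top spenders in "Active".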
