data & other dangerous things
Propensity Modelling

who will leave next?

a telco was firefighting churn after it happened. a propensity model moved the fight forward by 90 days.

12 min read
The Question

can we predict, with enough lead time and confidence, which customers are about to churn — and is intervening on them actually worth it?

The Hypothesis

churn is predictable 60–90 days out from usage decay patterns, but the real question is uplift: would the highest-risk customers have left regardless of intervention?

The Data

18 months of billing, usage, support and nps data for 1.2m customers. labelled churn events with a 90-day forward window. held out the final 3 months as a temporal test set.

The Analysis

trained gradient-boosted trees (lightgbm) on 140 features with shap for explanation. ran a follow-up uplift model (two-model approach) on a randomised retention-offer experiment to separate persuadables from sure things and lost causes.

The Decision

targeted only the persuadable decile with a tailored retention offer. stopped sending offers to high-risk customers whose outcome the uplift model said the offer would not change: those who would stay anyway and those who would leave anyway.

The Outcome

reduced retention-offer spend by 42% while increasing net saved customers by 11%. the surprise: the top churn-risk decile contained the worst offer-takers, who were leaving no matter what.

Further reading

the question

the retention team was sending offers to anyone who looked likely to leave. the offers were expensive. nobody had asked whether the people accepting them would have stayed anyway.

the hypothesis

churn is predictable 60–90 days out — but propensity is the wrong target. uplift is.

the data

18 months of billing, usage, support and nps for 1.2m customers. churn labelled on a 90-day forward window. the final three months held out as a temporal test set so the model could not cheat with future-leaking features.
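
a minimal sketch of the labelling and the temporal split described above. the field names (`snapshot`, `churn_date`, `churned_90d`), the example dates and the cutoff are all assumptions made for illustration, not the project's actual schema.

```python
from datetime import date, timedelta

WINDOW_DAYS = 90  # forward window from the text


def label_churn(snapshot, churn_date, window_days=WINDOW_DAYS):
    """Return 1 if the customer churns within window_days after the snapshot."""
    if churn_date is None:
        return 0
    return int(snapshot < churn_date <= snapshot + timedelta(days=window_days))


def temporal_split(rows, cutoff):
    """Train on snapshots before the cutoff, test on snapshots at or after it."""
    train = [r for r in rows if r["snapshot"] < cutoff]
    test = [r for r in rows if r["snapshot"] >= cutoff]
    return train, test


# hypothetical rows, invented for this example
rows = [
    {"id": "a", "snapshot": date(2023, 1, 1), "churn_date": date(2023, 2, 15)},
    {"id": "b", "snapshot": date(2023, 1, 1), "churn_date": date(2023, 6, 1)},
    {"id": "c", "snapshot": date(2023, 10, 1), "churn_date": None},
]
for r in rows:
    r["churned_90d"] = label_churn(r["snapshot"], r["churn_date"])

train, test = temporal_split(rows, cutoff=date(2023, 10, 1))
```

customer b churns, but outside the 90-day window, so the label is 0. in a real pipeline it would probably also be worth dropping training snapshots whose 90-day window runs past the cutoff, since those labels depend on events inside the test period.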

the analysis

lightgbm on 140 features, shap for explanation, calibrated probabilities. then a two-model uplift estimator trained on a randomised retention-offer experiment to separate persuadables from sure things and lost causes.
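
the two-model idea can be sketched in plain python: fit one response model on customers who got the offer, one on customers who did not, and take the difference in predicted retention as the uplift. the "model" here is only a per-segment retention rate, a stand-in for the gradient-boosted classifiers used in the analysis, and the segment names and counts are invented.

```python
from collections import defaultdict


def fit_rate_model(records):
    """Toy response model: retention rate per segment (stand-in for a classifier)."""
    seen = defaultdict(int)
    kept = defaultdict(int)
    for r in records:
        seen[r["segment"]] += 1
        kept[r["segment"]] += r["retained"]
    return {s: kept[s] / seen[s] for s in seen}


def two_model_uplift(experiment):
    """Uplift per segment = P(retained | offer) - P(retained | no offer)."""
    treated = [r for r in experiment if r["offered"]]
    control = [r for r in experiment if not r["offered"]]
    p_treated = fit_rate_model(treated)
    p_control = fit_rate_model(control)
    return {s: p_treated[s] - p_control[s] for s in p_treated if s in p_control}


def make(segment, offered, retained_count, total):
    """Build hypothetical experiment rows: the first retained_count rows are retained."""
    return [
        {"segment": segment, "offered": offered, "retained": int(i < retained_count)}
        for i in range(total)
    ]


# hypothetical randomised experiment, 10 customers per arm per segment
experiment = (
    make("usage_decaying", True, 8, 10) + make("usage_decaying", False, 4, 10)
    + make("already_porting", True, 1, 10) + make("already_porting", False, 1, 10)
    + make("stable_heavy_user", True, 9, 10) + make("stable_heavy_user", False, 9, 10)
)
uplift = two_model_uplift(experiment)
```

in this toy data the offer only moves the `usage_decaying` segment (the persuadables). `already_porting` leaves either way (lost causes) and `stable_heavy_user` stays either way (sure things), so both have zero uplift. the difference is only a causal estimate because the offer was randomised.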

the decision

stop targeting "high risk". start targeting "high uplift". the persuadable decile got a tailored offer; the rest got left alone.
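
a small sketch of what the change in targeting rule means in practice: the same customers, ranked two ways. the scores below are made up, and `top_decile` is a hypothetical helper, not part of any library.

```python
def top_decile(customers, key):
    """Return ids of the top 10% of customers ranked by `key`, highest first."""
    ranked = sorted(customers, key=lambda c: c[key], reverse=True)
    n = max(1, len(ranked) // 10)
    return [c["id"] for c in ranked[:n]]


# hypothetical scores: churn risk rises with id, uplift peaks mid-range
uplift_by_id = {10: 0.35, 11: 0.30, 18: 0.0, 19: 0.0}
customers = [
    {"id": i, "churn_risk": i / 20, "uplift": uplift_by_id.get(i, 0.02)}
    for i in range(20)
]

old_targets = top_decile(customers, "churn_risk")  # the "high risk" rule
new_targets = top_decile(customers, "uplift")      # the "high uplift" rule
```

with these invented scores the two rules pick disjoint sets: the risk rule selects customers 19 and 18, whose uplift is zero, while the uplift rule selects 10 and 11.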

the outcome

retention-offer spend down 42%. net saved customers up 11%. the highest-risk decile turned out to be the worst offer-takers — they were leaving regardless, and we had been paying them to do it.

tools used

python · lightgbm · shap · causalml · airflow