Predicting Chargebacks Before They Happen: Our ML Approach

The problem

Chargebacks are the industry's lagging indicator of fraud — by the time a chargeback arrives, typically 30–120 days after the original transaction, the fraudster has long since withdrawn the funds and closed the account. The traditional response — use chargebacks to label fraud and train models — creates a model that is perpetually fighting the last war.

We asked a different question: can we predict which transactions will result in chargebacks before the chargeback is filed? If so, we could take preventative action — additional verification, withdrawal holds, account review — while intervention is still possible.

Training data

The training dataset comprised 6.2M transactions across 24 months, with chargeback outcomes matched back to original transactions via payment network dispute records. Each transaction was labelled with the chargeback outcome (no chargeback, chargeback filed, chargeback won, chargeback lost) and the time-to-chargeback if applicable.
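The labelling step described above can be sketched as a join between transactions and dispute records. This is a minimal illustration, not the production pipeline: the `Transaction`, `Dispute` and `LabelledTransaction` types, their field names, and the assumption of at most one dispute per transaction are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Transaction:
    txn_id: str
    authorised_at: datetime


@dataclass
class Dispute:
    txn_id: str
    filed_at: datetime
    outcome: str  # assumed values: "filed", "won" or "lost"


@dataclass
class LabelledTransaction:
    txn_id: str
    label: str  # "no_chargeback" or "chargeback_<outcome>"
    days_to_chargeback: Optional[float]


def label_transactions(
    transactions: List[Transaction], disputes: List[Dispute]
) -> List[LabelledTransaction]:
    # Index disputes by transaction id (assumes at most one dispute each).
    by_txn = {d.txn_id: d for d in disputes}
    labelled = []
    for t in transactions:
        d = by_txn.get(t.txn_id)
        if d is None:
            labelled.append(LabelledTransaction(t.txn_id, "no_chargeback", None))
            continue
        # Time-to-chargeback is measured from authorisation to dispute filing.
        days = (d.filed_at - t.authorised_at).total_seconds() / 86400
        labelled.append(LabelledTransaction(t.txn_id, f"chargeback_{d.outcome}", days))
    return labelled
```

Keeping the time-to-chargeback on the same record as the outcome label means one labelled row can feed both a classification target and a regression target.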

The core technical challenge was feature engineering: we needed to capture signals that were present at transaction time but predictive of chargebacks that would arrive weeks or months later. This required identifying leading indicators — behavioural patterns that preceded chargebacks in the training data but were not direct indicators of fraud at the time.
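The constraint that features must be available at transaction time can be enforced by computing every feature "as of" the authorisation timestamp. The sketch below shows one hypothetical leading indicator, the number of prior transactions on the same account in the preceding 24 hours. The post does not name the features actually used, so `prior_txn_count` and its window are illustrative assumptions.

```python
from datetime import datetime, timedelta
from typing import Iterable


def prior_txn_count(
    history: Iterable[datetime],
    as_of: datetime,
    window: timedelta = timedelta(hours=24),
) -> int:
    """Count account transactions in the half-open interval [as_of - window, as_of).

    Events at or after `as_of` are excluded, so the feature never uses
    information that was unavailable when the transaction was authorised.
    """
    start = as_of - window
    return sum(1 for ts in history if start <= ts < as_of)
```

The strict upper bound is the important design choice: a feature that includes later events would look predictive in training but could not be computed in production.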

Model architecture

The chargeback prediction model is a two-stage architecture. Stage 1 is a binary classifier (will this transaction result in a dispute?) with a 72-hour prediction horizon. Stage 2 is a regression model that estimates time-to-dispute for transactions that Stage 1 classifies as high-risk. The combined output allows for priority-ranked intervention queues — highest urgency first.
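As a rough sketch of how the two stages might combine into a ranked queue: the models are passed in as plain callables standing in for the trained classifier and regressor, and the 0.5 threshold, the `QueueItem` type, and the ordering rule (soonest estimated dispute first, ties broken by higher probability) are assumptions made for illustration rather than details from the production system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class QueueItem:
    txn_id: str
    dispute_probability: float
    est_hours_to_dispute: float


def build_queue(
    transactions: Dict[str, Features],
    stage1: Callable[[Features], float],  # returns a dispute probability
    stage2: Callable[[Features], float],  # returns estimated hours to dispute
    threshold: float = 0.5,               # assumed high-risk cut-off
) -> List[QueueItem]:
    items = []
    for txn_id, feats in transactions.items():
        p = stage1(feats)
        if p < threshold:
            continue  # Stage 2 only runs on transactions Stage 1 flags
        items.append(QueueItem(txn_id, p, stage2(feats)))
    # Most urgent first: shortest estimated time, then highest probability.
    items.sort(key=lambda i: (i.est_hours_to_dispute, -i.dispute_probability))
    return items
```

Running the regressor only on flagged transactions keeps Stage 2 cheap and means its estimates are made on the population it was designed for.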

Results

Average prediction horizon: 72 hours

Precision at 10% recall threshold: 83%

Change in chargeback rate after 6 months of deployment: -34%
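A precision figure quoted at a fixed recall level can be computed by ranking transactions by score and reading precision at the first point where recall reaches the target. The function below is one common way to do this, offered as a sketch: it is not necessarily how the figure above was produced, and it ignores tied scores for simplicity.

```python
from typing import Sequence


def precision_at_recall(
    scores: Sequence[float], labels: Sequence[int], target_recall: float
) -> float:
    """Precision at the smallest top-k cut-off whose recall >= target_recall.

    `labels` are 1 for transactions that were disputed and 0 otherwise.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("labels contain no positive examples")
    # Highest scores first, as an intervention queue would process them.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / k
    return true_pos / len(ranked)
```

A low recall target such as 10% measures how clean the very top of the ranking is, which matters when an operations team can only review a small share of flagged transactions.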

Production deployment

The chargeback prediction model runs as an asynchronous post-authorisation job — it does not sit in the real-time transaction path and therefore has no latency impact on checkout. High-risk predictions trigger an alert to the fraud operations queue within 2 seconds of authorisation, giving the fraud team the full 72-hour window to investigate before the predicted dispute arrives.
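The separation between the checkout path and the scoring job can be illustrated with an in-process queue. This is only a sketch of the pattern using Python's `asyncio`: a production system would more likely use a message broker, and `authorise`, `score_worker`, the `risk` field and the 0.5 threshold are hypothetical names and values.

```python
import asyncio
from typing import Callable, Dict, List


def authorise(txn: Dict, queue: asyncio.Queue) -> str:
    # Checkout path: enqueue for later scoring and return immediately.
    # No model call happens here, so scoring adds no checkout latency.
    queue.put_nowait(txn)
    return "approved"


async def score_worker(
    queue: asyncio.Queue,
    score: Callable[[Dict], float],
    alerts: List[str],
    threshold: float = 0.5,  # assumed high-risk cut-off
) -> None:
    # Post-authorisation path: score each transaction off the hot path.
    while True:
        txn = await queue.get()
        try:
            if score(txn) >= threshold:
                alerts.append(txn["id"])  # stand-in for the fraud ops queue
        finally:
            queue.task_done()


async def main() -> List[str]:
    queue: asyncio.Queue = asyncio.Queue()
    alerts: List[str] = []
    worker = asyncio.create_task(score_worker(queue, lambda t: t["risk"], alerts))
    for txn in [{"id": "t1", "risk": 0.9}, {"id": "t2", "risk": 0.1}]:
        authorise(txn, queue)
    await queue.join()  # wait until every queued transaction is scored
    worker.cancel()
    return alerts
```

Calling `asyncio.run(main())` scores both sample transactions and returns only the identifier of the high-risk one.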
