Chargeback Fraud Detection
Solution Background and Business Value
Chargeback fraud is a significant issue for e-commerce platforms and online retailers. It occurs when a customer makes a purchase with a credit card and later disputes the transaction with their bank, claiming it was unauthorized or fraudulent. If the bank approves the chargeback, the transaction is reversed and the merchant may bear the financial loss.
Machine learning models can help detect and prevent chargeback fraud before it happens, allowing businesses to:
- Reduce fraudulent transactions by identifying high-risk purchases early.
- Minimize financial losses by preventing chargebacks from occurring.
- Improve fraud detection processes by integrating ML into fraud prevention systems.
Data Requirements and Schema
Kumo AI can analyze data in its raw relational form, meaning we can directly use tables without extensive feature engineering. Graph Neural Networks (GNNs) leverage relationships between entities (e.g., users, orders, chargebacks) to improve fraud detection accuracy.
Core Tables
- Accounts Table
  - Stores user account details.
  - Key attributes:
    - account_id: Unique identifier.
    - Optional: Creation date, location, age, account type.
- Orders Table
  - Stores details of each order.
  - Key attributes:
    - order_id: Unique order identifier.
    - account_id: Links the order to a user.
    - timestamp: Time of purchase.
    - Optional: Order value, payment method, shipping details.
- Chargebacks Table
  - Stores information about chargeback claims.
  - Key attributes:
    - chargeback_id: Unique identifier.
    - order_id: Links the chargeback to an order.
    - timestamp: Time of chargeback request.
    - label: Indicates whether the chargeback was fraudulent (1) or legitimate (0).
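The three core tables and their links can be sketched as plain Python records. This is only an illustration of the schema described above, not Kumo SDK code: the class names, the optional fields `created_at` and `order_value`, and the helper `check_foreign_keys` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Account:
    account_id: str                        # primary key
    created_at: Optional[datetime] = None  # optional attribute (hypothetical name)


@dataclass
class Order:
    order_id: str                          # primary key
    account_id: str                        # foreign key -> Account.account_id
    timestamp: datetime                    # time of purchase
    order_value: Optional[float] = None    # optional attribute (hypothetical name)


@dataclass
class Chargeback:
    chargeback_id: str                     # primary key
    order_id: str                          # foreign key -> Order.order_id
    timestamp: datetime                    # time of chargeback request
    label: Optional[int] = None            # 1 = fraudulent, 0 = legitimate, None = unknown


# Foreign-key edges that connect the tables: (source table, key column, target table).
EDGES = [
    ("orders", "account_id", "accounts"),
    ("chargebacks", "order_id", "orders"),
]


def check_foreign_keys(
    accounts: List[Account], orders: List[Order], chargebacks: List[Chargeback]
) -> List[str]:
    """Return a description of every dangling link; an empty list means all links resolve."""
    account_ids = {a.account_id for a in accounts}
    order_ids = {o.order_id for o in orders}
    problems = []
    for o in orders:
        if o.account_id not in account_ids:
            problems.append(f"order {o.order_id}: unknown account {o.account_id}")
    for c in chargebacks:
        if c.order_id not in order_ids:
            problems.append(f"chargeback {c.chargeback_id}: unknown order {c.order_id}")
    return problems
```

Running a check like this before connecting the data is a cheap way to catch orders or chargebacks whose keys do not resolve, since the relational links are what the model relies on.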
Additional Tables (Optional)
- Items Table: Stores item-level details within an order.
- Order-Items Table: Links orders to specific items purchased.
- Payment Methods Table: Stores payment details (e.g., card type, account linkage).
- Merchants Table: Information on merchants selling products.
- Account Events Table: Tracks user account activity.
Entity Relationship Diagram (ERD)
Predictive Queries
We can detect chargeback fraud at different levels:
1. Predict Fraudulent Chargebacks
This model predicts whether a chargeback is fraudulent, using the label column of the chargebacks table as the target.
- At inference time, we leave LABEL empty for new chargebacks and generate fraud risk scores.
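The split between labeled and unlabeled chargebacks can be illustrated with a short sketch. It assumes chargebacks are held as plain dictionaries and that an empty label is represented as `None`; the function name `split_by_label` is hypothetical and not part of any Kumo API.

```python
from typing import Dict, List, Tuple


def split_by_label(chargebacks: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate reviewed chargebacks (usable for training) from new ones (to be scored).

    Assumption: an empty label is stored as None, or the key is missing entirely.
    """
    labeled = [c for c in chargebacks if c.get("label") is not None]
    to_score = [c for c in chargebacks if c.get("label") is None]
    return labeled, to_score


chargebacks = [
    {"chargeback_id": "c1", "order_id": "o1", "label": 1},     # reviewed: fraudulent
    {"chargeback_id": "c2", "order_id": "o2", "label": 0},     # reviewed: legitimate
    {"chargeback_id": "c3", "order_id": "o3", "label": None},  # new: needs a risk score
]
labeled, to_score = split_by_label(chargebacks)
```

Note that a label of 0 is a valid value, which is why the check is `is not None` rather than a truthiness test.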
2. Predict Fraud Risk at the Order Level
To anticipate fraud at the order level, we move the fraud label to the orders table, so that each order carries its own fraud label.
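One way to derive an order-level label from the chargebacks table is sketched below. It rests on an assumption that is not stated in the text: an order is labeled 1 if at least one linked chargeback is labeled fraudulent, and 0 otherwise. The function name `order_level_labels` is hypothetical.

```python
from typing import Dict, List


def order_level_labels(orders: List[Dict], chargebacks: List[Dict]) -> List[Dict]:
    """Return a copy of each order with a `label` column added.

    Assumption: label = 1 if any chargeback on the order has label 1, else 0.
    """
    fraud_order_ids = {c["order_id"] for c in chargebacks if c.get("label") == 1}
    return [{**o, "label": int(o["order_id"] in fraud_order_ids)} for o in orders]


orders = [
    {"order_id": "o1", "account_id": "a1"},
    {"order_id": "o2", "account_id": "a1"},
    {"order_id": "o3", "account_id": "a2"},
]
chargebacks = [
    {"chargeback_id": "c1", "order_id": "o1", "label": 1},  # fraudulent
    {"chargeback_id": "c2", "order_id": "o2", "label": 0},  # legitimate
]
labeled_orders = order_level_labels(orders, chargebacks)
```

Under this rule, orders with no chargeback and orders with only legitimate chargebacks both receive 0; a different convention may suit cases where those two groups should be treated separately.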
3. Predict Future Chargeback Fraud
For proactive fraud detection, we can predict whether an order or account will experience a fraudulent chargeback in the next X days.
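The forward-looking target can be made concrete with a small sketch that computes it for orders. It assumes a fixed anchor time and treats the window as the interval after the anchor up to and including X days later; the function name `future_fraud_labels` and the dictionary layout are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def future_fraud_labels(
    order_ids: List[str],
    chargebacks: List[Dict],
    anchor_time: datetime,
    horizon_days: int,
) -> Dict[str, int]:
    """Label each order 1 if it receives a fraudulent chargeback within the window.

    Window (assumption): anchor_time < chargeback timestamp <= anchor_time + horizon_days.
    """
    window_end = anchor_time + timedelta(days=horizon_days)
    hit = {
        c["order_id"]
        for c in chargebacks
        if c.get("label") == 1 and anchor_time < c["timestamp"] <= window_end
    }
    return {oid: int(oid in hit) for oid in order_ids}


chargebacks = [
    {"order_id": "o1", "timestamp": datetime(2024, 1, 10), "label": 1},  # in window, fraud
    {"order_id": "o2", "timestamp": datetime(2024, 2, 20), "label": 1},  # after window
    {"order_id": "o3", "timestamp": datetime(2024, 1, 5), "label": 0},   # legitimate
]
labels = future_fraud_labels(["o1", "o2", "o3", "o4"], chargebacks, datetime(2024, 1, 1), 30)
```

The same logic applies at the account level by first mapping each chargeback's order to its account and keying the result on `account_id` instead.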
Deployment Strategy
The best deployment strategy depends on fraud detection system maturity:
1. Batch Predictions for Fraud Analysts
- Fraud teams manually review and label chargebacks.
- ML model predictions prioritize high-risk chargebacks for faster action.
- Predictions are generated daily or hourly in batch mode.
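Prioritizing a batch of predictions for analysts can be as simple as keeping the highest-scoring chargebacks up to the team's review capacity. The sketch below assumes scores arrive as a mapping from chargeback ID to a risk score; `build_review_queue` and the `capacity` parameter are hypothetical names.

```python
import heapq
from typing import Dict, List, Tuple


def build_review_queue(scores: Dict[str, float], capacity: int) -> List[Tuple[str, float]]:
    """Return up to `capacity` (chargeback_id, score) pairs, highest risk first."""
    return heapq.nlargest(capacity, scores.items(), key=lambda item: item[1])


# Hypothetical output of one batch prediction run.
batch_scores = {"c1": 0.20, "c2": 0.95, "c3": 0.60, "c4": 0.05}
queue = build_review_queue(batch_scores, capacity=2)
```

Labels that analysts assign while working through the queue can be written back to the chargebacks table and used in later training runs.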
2. Real-Time Chargeback Fraud Detection
- The system generates real-time risk scores when an order is placed.
- If a transaction is high risk, additional verification or manual review is triggered.
- ML embeddings are used to enhance rule-based fraud detection.
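A minimal sketch of the decision step at order time is shown below. It combines a model risk score with the output of an existing rule engine; it does not use embeddings directly. The thresholds, the action names, and the `rule_flags` input are all illustrative assumptions, not recommended values.

```python
from typing import Collection


def route_order(
    risk_score: float,
    rule_flags: Collection[str],
    review_threshold: float = 0.9,   # hypothetical cutoff for manual review
    verify_threshold: float = 0.6,   # hypothetical cutoff for extra verification
) -> str:
    """Map a risk score and fired rules to one of three actions.

    rule_flags holds the names of rule-based checks that fired for this order
    (assumed to come from an existing rule engine).
    """
    if risk_score >= review_threshold:
        return "manual_review"
    if risk_score >= verify_threshold or len(rule_flags) > 0:
        return "additional_verification"
    return "approve"


decision = route_order(0.72, rule_flags=[])
```

In practice the thresholds would be tuned against the cost of a missed chargeback versus the friction that extra verification adds for legitimate customers.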
Building Models in the Kumo SDK
1. Initialize the Kumo SDK
2. Connect data
3. Select tables
4. Create graph schema
5. Train the model