Relational deep learning (RDL) is the paradigm of learning directly from relational databases by treating them as temporal heterogeneous graphs. Instead of the traditional workflow (months of feature engineering to flatten multi-table databases into single training tables), RDL preserves the full relational structure. Each table becomes a node type. Each row becomes a node. Each foreign key becomes an edge. Graph neural networks extract cross-table patterns automatically.
The traditional ML pipeline on relational data
Today, most enterprise ML follows this workflow:
1. Understand the schema: study 10-50 tables, their relationships, and business meaning (weeks)
2. Feature engineering: write SQL to join tables, compute aggregations (avg order amount, count of orders in 30 days, max product price per customer), and build a flat feature table (months)
3. Train a model: XGBoost or LightGBM on the flat table (days)
4. Maintain features: when the schema changes, new tables are added, or business logic evolves, re-engineer everything (ongoing)
Steps 1, 2, and 4 consume 80% of the time and are where information is lost. Aggregating orders into “average order amount” destroys the distribution. Flattening to a single table loses the multi-hop relationships (customer → order → product → category).
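To make the information loss concrete, here is a toy illustration (invented numbers, not from any dataset): two customers whose order histories collapse to the same "average order amount" feature even though their behavior differs sharply.

```python
import statistics

# Two customers whose orders flatten to the same average.
orders_a = [50, 50, 50, 50]   # steady spender
orders_b = [5, 5, 5, 185]     # mostly small orders plus one outlier

avg_a = sum(orders_a) / len(orders_a)
avg_b = sum(orders_b) / len(orders_b)
assert avg_a == avg_b == 50.0  # the flat feature is identical...

# ...but the distributions differ sharply, e.g. in spread:
spread_a = statistics.pstdev(orders_a)  # 0.0
spread_b = statistics.pstdev(orders_b)  # roughly 78
```

Any single aggregate discards some aspect of the distribution; keeping the raw rows, as RDL does, sidesteps the choice.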
The RDL approach
Relational deep learning replaces steps 1-2 with automatic graph construction:
```python
# Conceptual: relational database → heterogeneous graph

# Tables become node types
node_types = ['customer', 'order', 'product', 'category']

# Foreign keys become edge types
edge_types = [
    ('customer', 'places', 'order'),       # customer_id FK in orders
    ('order', 'contains', 'product'),      # product_id FK in order_items
    ('product', 'belongs_to', 'category')  # category_id FK in products
]

# Row columns become node features
# customer: [age, location, signup_date, ...]
# order:    [amount, timestamp, channel, ...]
# product:  [price, weight, rating, ...]

# Result: a temporal heterogeneous graph
# - Heterogeneous: multiple node/edge types
# - Temporal: edges carry timestamps (order dates)
```

The conversion is mechanical: read the schema, map tables to node types, and map foreign keys to edge types. No domain expertise is required.
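The mechanical conversion can be sketched in a few lines. This is an illustrative toy (the `schema` dict and `schema_to_graph` helper are invented for this example, not a real library API): it derives node types and edge types purely from table definitions and foreign keys.

```python
# Toy schema: table -> {columns, foreign keys (fk column -> target table)}
schema = {
    "customer": {"columns": ["age", "location", "signup_date"], "fks": {}},
    "order":    {"columns": ["amount", "timestamp", "channel"],
                 "fks": {"customer_id": "customer"}},
    "product":  {"columns": ["price", "weight", "rating"],
                 "fks": {"category_id": "category"}},
    "category": {"columns": ["name"], "fks": {}},
}

def schema_to_graph(schema):
    # Each table becomes a node type.
    node_types = list(schema)
    # Each foreign key column becomes a directed edge type
    # (source table, fk column, target table).
    edge_types = [
        (src, fk_col, dst)
        for src, table in schema.items()
        for fk_col, dst in table["fks"].items()
    ]
    return node_types, edge_types

node_types, edge_types = schema_to_graph(schema)
# node_types: ['customer', 'order', 'product', 'category']
# edge_types: [('order', 'customer_id', 'customer'),
#              ('product', 'category_id', 'category')]
```

Nothing in the function is domain-specific; the same loop handles any schema with declared foreign keys.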
What message passing discovers
Consider predicting customer churn in an e-commerce database with customers, orders, products, and support tickets tables:
- Layer 1: each customer node aggregates its orders (recency, frequency, monetary) and support tickets (count, severity, resolution status). This approximates basic RFM features.
- Layer 2: each customer also sees the products from their orders (categories, price ranges, return rates) and the resolution patterns of similar support tickets. This captures cross-table patterns.
- Layer 3: each customer sees the behaviors of other customers who bought the same products. Customers whose product-neighbors have high churn rates are themselves at risk. This is a 3-hop pattern invisible to flat-table ML.
These patterns are discovered automatically. No data scientist manually computed “average return rate of products purchased by customer.” The GNN learned that this signal predicts churn through message passing.
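A minimal sketch of the mechanism, using invented toy data and plain mean aggregation in place of a learned GNN layer: two rounds of neighbor aggregation are enough for a customer representation to absorb a 2-hop product signal.

```python
# Toy data (invented): orders link to products; products carry a return rate.
order_products = {"o1": ["p1"], "o2": ["p1", "p2"]}  # order -> products
return_rate = {"p1": 0.4, "p2": 0.0}                 # product feature
customer_orders = {"alice": ["o1", "o2"]}            # customer -> orders

def mean(xs):
    return sum(xs) / len(xs)

# Layer 1: each order aggregates its products' return rates.
order_h = {o: mean([return_rate[p] for p in ps])
           for o, ps in order_products.items()}

# Layer 2: each customer aggregates its orders' layer-1 values.
customer_h = {c: mean([order_h[o] for o in os])
              for c, os in customer_orders.items()}

# customer_h["alice"] is now (approximately) the average return rate of
# products Alice purchased -- a cross-table feature nobody hand-wrote.
```

A real GNN replaces the mean with learned, weighted aggregation, but the reach is the same: k stacked layers see k-hop neighborhoods.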
RelBench: the benchmark
RelBench is the standard benchmark for relational deep learning, containing 7 real relational databases with 30 prediction tasks across 103 million rows:
- Databases: e-commerce, healthcare, reviews, academic, and more
- Tasks: churn, LTV, fraud, recommendation, demand forecasting
- Temporal splits: train on historical data, validate and test on future data with strict temporal integrity
Results demonstrate the RDL advantage:
- Flat-table LightGBM: 62.44 average AUROC (after expert feature engineering)
- GNN on relational graph: 75.83 average AUROC (no feature engineering)
- KumoRFM zero-shot: 76.71 average AUROC (no training on the target database)
- KumoRFM fine-tuned: 81.14 average AUROC (fine-tuned on target database)
Temporal integrity
A critical advantage of RDL is built-in temporal correctness. Rows in relational databases have timestamps (order_date, created_at, updated_at). In the temporal heterogeneous graph, these timestamps determine which edges are visible at prediction time.
If you are predicting whether customer Alice will churn at time T, message passing only uses orders placed before T, products purchased before T, and support tickets filed before T. No future information leaks in. This temporal integrity is automatic; in traditional feature engineering, preventing leakage requires careful manual timestamp filtering at every aggregation step.
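The timestamp filter can be sketched directly (toy data, with an invented `visible_orders` helper): only edges dated strictly before the prediction time T participate in aggregation.

```python
# Toy order rows with timestamps (invented data).
orders = [
    {"customer": "alice", "amount": 50.0, "ts": 10},
    {"customer": "alice", "amount": 80.0, "ts": 20},
    {"customer": "alice", "amount": 999.0, "ts": 35},  # after T: must stay invisible
]

def visible_orders(orders, customer, T):
    # Edges with ts >= T do not exist from the model's point of view.
    return [o for o in orders if o["customer"] == customer and o["ts"] < T]

T = 30
seen = visible_orders(orders, "alice", T)
total = sum(o["amount"] for o in seen)  # 130.0 -- the 999.0 order is excluded
```

In an RDL system this filter is applied once, at the graph level, rather than re-implemented inside every hand-written aggregation query.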