A relational foundation model (RFM) is a foundation model designed for and pre-trained on relational databases. It understands that enterprise data lives in tables connected by foreign keys, that each row is an entity with typed features, and that events happen in temporal order. This specialization allows RFMs to outperform both general-purpose ML models and generic graph foundation models on enterprise prediction tasks.
From databases to graphs
The first step in an RFM pipeline is converting the relational database into a heterogeneous temporal graph:
- Tables become node types: Each table (customers, orders, products) is a distinct node type with its own feature space.
- Rows become nodes: Each row in a table is a node. A 1M-row customer table produces 1M customer nodes.
- Foreign keys become edges: Each foreign key relationship (order.customer_id references customers.id) produces edges connecting order nodes to customer nodes.
- Timestamps become temporal ordering: Edges and nodes have timestamps that enable temporal sampling and time encoding.
# Conceptual: relational database -> heterogeneous temporal graph
# KumoRFM does this automatically from your database schema
# Tables -> node types
graph.add_node_type('customer', features=['age', 'tenure', 'segment'])
graph.add_node_type('order', features=['amount', 'discount', 'channel'])
graph.add_node_type('product', features=['price', 'category', 'brand'])
# Foreign keys -> edge types
graph.add_edge_type('customer', 'placed', 'order',
timestamp_col='order_date')
graph.add_edge_type('order', 'contains', 'product')
# Result: heterogeneous temporal graph
# Nodes: 1M customers + 5M orders + 100K products
# Edges: 5M placed + 8M contains
# Each edge has a timestamp for temporal sampling

This conversion is automatic. The RFM reads the database schema and foreign key constraints to build the graph.
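The conversion can be sketched with plain Python data structures. This is a minimal, hypothetical illustration (toy tables, not a real API): rows become nodes keyed by (table, primary key), the foreign key `orders.customer_id` becomes timestamped edges, and a cutoff filter shows how temporal sampling keeps future edges out of a neighborhood.

```python
from datetime import date

# Toy relational data: each table is a list of row dicts (hypothetical values).
customers = [{"id": 1, "age": 34}, {"id": 2, "age": 29}]
orders = [
    {"id": 10, "customer_id": 1, "amount": 50.0, "order_date": date(2024, 1, 5)},
    {"id": 11, "customer_id": 2, "amount": 20.0, "order_date": date(2024, 2, 9)},
]

# Rows -> nodes, keyed by (node type, primary key).
nodes = {("customer", r["id"]): r for r in customers}
nodes.update({("order", r["id"]): r for r in orders})

# Foreign key orders.customer_id -> customers.id becomes timestamped edges.
edges = [
    (("customer", r["customer_id"]), "placed", ("order", r["id"]), r["order_date"])
    for r in orders
]

# Temporal sampling: only edges at or before the cutoff are visible,
# so no future information leaks into a node's neighborhood.
def neighbors(node, cutoff):
    return [dst for src, _, dst, ts in edges if src == node and ts <= cutoff]

print(neighbors(("customer", 1), date(2024, 1, 31)))  # [('order', 10)]
```

With a January 31 cutoff, customer 2's February order is invisible, which is exactly the causal-ordering guarantee the graph construction has to provide.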
The architecture
An RFM uses a graph transformer architecture that is specifically designed for heterogeneous relational data:
- Type-specific encoders: Each node type gets its own input projection, handling different feature dimensions and types across tables.
- Relational graph transformer: Attention operates across node types, with relation-type-aware attention weights. The model learns that “customer placed order” and “order contains product” have different semantics.
- Temporal awareness: Time encodings and temporal sampling ensure the model respects causal ordering. No future information leaks into representations.
- Schema-agnostic design: The architecture handles arbitrary numbers of tables, columns, and relationships. A new database with a never-seen schema can be processed without architectural changes.
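Two of these ideas, type-specific encoders and relation-type-aware attention, can be sketched in a few lines of NumPy. Everything here is an assumption for illustration (the projection matrices, the additive per-relation bias, the dimensions); it is not the actual RFM architecture, just the shape of the mechanism: each node type gets its own projection into a shared hidden space, and each edge type contributes its own bias to the attention scores.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # shared hidden dimension (assumed)

# Type-specific encoders: one projection per node type, because tables
# have different feature widths (here: customer has 3 features, order has 2).
encoders = {"customer": rng.normal(size=(3, d)), "order": rng.normal(size=(2, d))}

def encode(node_type, x):
    return x @ encoders[node_type]  # project into the shared space

# Relation-type-aware attention: each edge type gets its own additive bias,
# so "placed" and "contains" edges are scored differently (toy values).
rel_bias = {"placed": 0.5, "contains": -0.5}

def attend(query, keys, rels):
    scores = np.array([s + rel_bias[r]
                       for s, r in zip(keys @ query / np.sqrt(d), rels)])
    w = np.exp(scores - scores.max())
    return w / w.sum()  # softmax attention weights over the neighborhood

h_cust = encode("customer", np.ones(3))
h_nbrs = np.stack([encode("order", np.ones(2)) for _ in range(2)])
w = attend(h_cust, h_nbrs, ["placed", "contains"])
print(w)  # higher weight on the "placed" edge, purely from its relation bias
```

The two neighbor embeddings are identical here, so any difference in attention weight comes entirely from the relation bias; that is the sense in which the model can learn that different edge types "have different semantics."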
Pre-training at scale
RFMs are pre-trained on diverse relational databases using masked token prediction:
- E-commerce databases: customer behavior, product interactions, purchase patterns
- Financial databases: transactions, accounts, merchant networks
- Healthcare databases: patients, visits, diagnoses, treatments
- SaaS databases: users, sessions, features, subscriptions
The diversity of pre-training data is critical. Each database teaches the model different relational patterns, and the model learns to generalize across them. KumoRFM was pre-trained on 103 million rows across 7 diverse databases.
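The masked-prediction objective can be sketched on a single table. This is a simplified, hypothetical version of the setup: randomly hide a fraction of a row's cells, keep the hidden values as reconstruction targets, and (in the real model, though omitted here) let the model predict them from the remaining cells and the row's graph neighborhood.

```python
import random

random.seed(0)

# A batch of rows from one pre-training table (toy values).
rows = [
    {"amount": 50.0, "channel": "web", "discount": 0.1},
    {"amount": 20.0, "channel": "store", "discount": 0.0},
]

MASK = "<mask>"

def mask_row(row, p=0.3):
    """Masked cell prediction: hide a fraction of cells; the model is trained
    to reconstruct the targets from the visible cells (and, in the full
    setup, the row's graph neighborhood, which this sketch omits)."""
    masked, targets = {}, {}
    for col, val in row.items():
        if random.random() < p:
            masked[col] = MASK
            targets[col] = val
        else:
            masked[col] = val
    return masked, targets

masked, targets = mask_row(rows[0])
print(masked, targets)
```

Because every table supplies its own cells as labels, this objective needs no human annotation, which is what makes pre-training across many databases feasible.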
Zero-shot prediction with PQL
The user interface for an RFM is a single line of Predictive Query Language (PQL):
# Churn prediction
PREDICT churn FOR customers WITHIN 30 days
# Lifetime value
PREDICT SUM(orders.amount) FOR customers WITHIN 90 days
# Fraud detection
PREDICT is_fraud FOR transactions
# Product recommendation
PREDICT link(customer, product) FOR customers

One line per prediction task. No feature engineering, no model training, no ML pipeline. The RFM handles everything.
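To make the structure of these queries concrete, here is a toy parser covering just the query shapes shown above. The pattern and the resulting task-spec fields are assumptions for illustration, not the real PQL grammar: it splits a query into a target expression, an entity table, and an optional prediction horizon, and classifies the task as temporal (future-window) or static.

```python
import re

# Minimal PQL-like parser (a sketch; the real PQL grammar is richer).
PATTERN = re.compile(
    r"PREDICT\s+(?P<target>.+?)\s+FOR\s+(?P<entity>\w+)"
    r"(?:\s+WITHIN\s+(?P<horizon>\d+)\s+(?P<unit>\w+))?$"
)

def parse_pql(query):
    m = PATTERN.match(query.strip())
    if not m:
        raise ValueError(f"unrecognized query: {query!r}")
    spec = m.groupdict()
    # A horizon ("WITHIN 30 days") makes this a future-window prediction;
    # without one it is a static per-row prediction (e.g. fraud scoring).
    spec["kind"] = "temporal" if spec["horizon"] else "static"
    return spec

print(parse_pql("PREDICT churn FOR customers WITHIN 30 days"))
print(parse_pql("PREDICT is_fraud FOR transactions"))
```

The point of the exercise: each one-line query fully determines a prediction task (entity, target, horizon), which is why no per-task feature engineering or training pipeline is needed on the user's side.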
Results: RelBench benchmark
RelBench is the standard benchmark for relational deep learning, covering 7 databases and 30 prediction tasks:
- Flat-table LightGBM: 62.44 AUROC (trained per task)
- Task-specific GNN: 75.83 AUROC (trained per task)
- KumoRFM zero-shot: 76.71 AUROC (no target training)
- KumoRFM fine-tuned: 81.14 AUROC (minutes of adaptation)
The zero-shot RFM outperforms task-specific models that were trained with full access to each task's training data. This is strong evidence that relational patterns learned during pre-training transfer across databases.