
Zero-Shot Prediction: Making Predictions Without Task-Specific Training

Zero-shot prediction generates accurate predictions on tasks the model has never been trained on. Foundation models pre-trained on diverse relational data transfer their knowledge to new databases without labels or task-specific training.

TL;DR

  • Zero-shot prediction makes predictions on tasks never seen during training. No task-specific labels, no fine-tuning, no data science pipeline. Point the model at data and get predictions.
  • Foundation models enable this by learning universal relational patterns during pre-training on diverse databases. These patterns transfer: declining engagement predicts churn across any industry.
  • KumoRFM achieves 76.71 AUROC zero-shot on RelBench (7 databases, 30 tasks), outperforming task-specific LightGBM models (62.44) that require manual feature engineering per task.
  • Zero-shot is ideal for: new databases with no labels, rapid prototyping, exploratory analysis, and initial deployment before labels accumulate for fine-tuning.
  • The progression: zero-shot first (immediate predictions), then few-shot (as labels trickle in), then fine-tuned (maximum accuracy with accumulated labels). Each stage improves on the last.

Zero-shot prediction makes accurate predictions on tasks never seen during training. No task-specific labels. No fine-tuning. No feature engineering. You point a foundation model at a new database and it generates predictions immediately. This is possible because graph foundation models learn universal relational patterns during pre-training: what declining customer engagement looks like, how transaction velocity anomalies predict fraud, how multi-hop relational patterns correlate with outcomes.

When the model encounters a new e-commerce database it has never seen, it recognizes the relational structure (customers linked to orders linked to products) and applies its learned patterns. Customers whose order frequency is declining, whose product preferences are shifting, and whose support interactions are increasing are likely churners. The model knows this from pre-training on other databases where similar patterns preceded churn.
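To make "declining order frequency" concrete, here is a minimal plain-Python sketch of what that trajectory looks like in raw order data. The customer names, dates, and the crude first-vs-last-month slope are all illustrative assumptions, not part of any model; the point is that the signal exists in the raw relational data before any feature engineering.

```python
from datetime import date

# Toy order history: customer -> list of order dates (illustrative data only)
orders = {
    "c_healthy": [date(2025, m, 15) for m in (1, 1, 2, 2, 3, 3)],             # steady
    "c_at_risk": [date(2025, 1, 5), date(2025, 1, 20), date(2025, 2, 10)],    # tapering off
}

def monthly_counts(dates, months=(1, 2, 3)):
    """Orders per calendar month -- the raw trajectory a model observes."""
    return [sum(d.month == m for d in dates) for m in months]

def trend(counts):
    """Crude slope: last month's count minus the first month's."""
    return counts[-1] - counts[0]

for cust, dates in orders.items():
    counts = monthly_counts(dates)
    print(cust, counts, "declining" if trend(counts) < 0 else "stable")
```

A foundation model does not compute this particular slope; it learns richer versions of the same trajectory shape during pre-training and recognizes it in unseen databases.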

How zero-shot works for relational data

zero_shot_kumo.py
# KumoRFM zero-shot prediction (conceptual)

# 1. Point at your database
database = connect_to_database('postgresql://...')

# 2. Specify what to predict (PQL - Predictive Query Language)
query = """
PREDICT customer.will_churn
FROM customer, order, product
WHERE prediction_date = '2026-04-01'
"""

# 3. Get predictions immediately (no training)
predictions = kumo_rfm.predict(database, query)

# Under the hood:
# - Reads schema: tables, columns, foreign keys, timestamps
# - Builds heterogeneous temporal graph automatically
# - Applies pre-trained relational graph transformer
# - Generates per-customer churn probabilities

# predictions['customer_id_123'].churn_probability = 0.82
# No labels needed. No model training. No feature engineering.

Zero-shot prediction: connect to database, write PQL query, get predictions. The pre-trained model handles everything.

What makes zero-shot possible

Zero-shot prediction works because relational patterns are universal:

  • Declining engagement: whether it is order frequency, login frequency, or feature usage, a declining trajectory predicts churn across industries.
  • Velocity anomalies: whether it is transaction speed, login attempts, or support ticket frequency, abnormal velocity predicts fraud or issues across domains.
  • Relational proximity: entities connected to high-risk entities are higher risk themselves. This holds for fraud networks, churn clusters, and product return patterns.
  • Temporal patterns: recency, frequency, and monetary patterns predict behavior across any transactional system.
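
The last bullet, recency/frequency/monetary, can be sketched directly from a transaction log. This is a toy plain-Python illustration of what those three quantities are; the customer names, dates, and amounts are invented for the example.

```python
from datetime import date

# Toy transaction log: (customer_id, date, amount) -- illustrative data only
txns = [
    ("alice", date(2026, 3, 28), 40.0),
    ("alice", date(2026, 3, 1), 25.0),
    ("alice", date(2026, 2, 10), 30.0),
    ("bob",   date(2026, 1, 3), 200.0),
]
today = date(2026, 4, 1)

def rfm(customer):
    rows = [(d, amt) for cid, d, amt in txns if cid == customer]
    recency = min((today - d).days for d, _ in rows)   # days since last purchase
    frequency = len(rows)                              # number of purchases
    monetary = sum(amt for _, amt in rows)             # total spend
    return recency, frequency, monetary

print(rfm("alice"))  # recent and frequent: engaged
print(rfm("bob"))    # stale and infrequent: at risk
```

These patterns hold whether the rows are retail orders, bank transactions, or app sessions, which is why a model that learned them on one transactional system can apply them to another.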

The zero-shot to fine-tuned progression

Zero-shot is not the end state; it is the starting point:

  1. Day 1: Zero-shot - Immediate predictions. No labels needed. Good enough for ranking and prioritization. (76.71 AUROC)
  2. Week 2: Few-shot - As investigators confirm cases or analysts label examples, incorporate 10-100 labels. Performance improves incrementally.
  3. Month 1: Fine-tuned - With hundreds of accumulated labels, fine-tune the model for maximum accuracy. (81.14 AUROC)
  4. Ongoing: Continuous - As new labels arrive, periodically re-fine-tune. The model improves continuously with human feedback.
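
A highly simplified analogy for the zero-shot-to-fine-tuned step: a pre-trained model already produces useful scores on day 1, and accumulated labels let you tune the decision rule on top of them. The scores and labels below are synthetic numbers invented for the sketch, and "fine-tuning" is reduced to threshold selection, which is far cruder than real model fine-tuning.

```python
# Synthetic churn scores (stand-ins for a pre-trained model's outputs)
scores = [0.91, 0.85, 0.62, 0.55, 0.40, 0.15]
labels = [1,    1,    1,    0,    0,    0]     # ground truth that accumulates over time

def accuracy(threshold):
    preds = [s >= threshold for s in scores]
    return sum(p == bool(y) for p, y in zip(preds, labels)) / len(labels)

zero_shot = accuracy(0.5)                      # day 1: generic threshold, no labels used
best = max(set(scores), key=accuracy)          # later: decision rule tuned on labels
fine_tuned = accuracy(best)

print(f"zero-shot accuracy:  {zero_shot:.2f}")
print(f"fine-tuned accuracy: {fine_tuned:.2f}")
```

Even in this toy setting the zero-shot rule is already useful for ranking, and the labels only refine it, mirroring the progression above.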

Enterprise example: new market expansion

A fintech company expands to a new country. They have no local transaction history, no local fraud labels, and no local data science team. But they have KumoRFM pre-trained on their other markets.

Zero-shot deployment:

  • Connect KumoRFM to the new market's database
  • The model recognizes the relational schema (accounts, transactions, merchants)
  • It applies universal fraud patterns learned from other markets
  • Day-one fraud scoring for every transaction, without a single local label

As the local team investigates flagged transactions and confirms fraud cases, those labels feed back into fine-tuning, progressively improving accuracy for the local market's specific fraud patterns.

Frequently asked questions

What is zero-shot prediction on graphs?

Zero-shot prediction makes accurate predictions on tasks the model was never explicitly trained on. A graph foundation model pre-trained on diverse relational databases can predict customer churn on a new database without any churn labels or task-specific training. The model transfers its general understanding of relational patterns.

How is zero-shot prediction possible?

Foundation models learn general structural patterns during pre-training: what it means for a customer's activity to decline, what transaction velocity anomalies look like, how relational proximity predicts outcomes. These patterns are universal across domains. When applied to a new database, the model recognizes these patterns without task-specific training.

How does KumoRFM achieve zero-shot predictions?

KumoRFM reads a database schema (tables, columns, foreign keys), automatically constructs a heterogeneous temporal graph, and applies its pre-trained relational graph transformer. The model was trained to predict across many different relational databases and task types. On new data, it applies this general capability. Zero-shot AUROC: 76.71 on RelBench.
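
The schema-to-graph step described above can be sketched in plain Python: one node per table row, one typed edge per foreign-key reference, with row timestamps supplying the temporal dimension. This is an illustration of the idea only, not KumoRFM's actual implementation; the table contents and the `foreign_keys` mapping are invented for the example.

```python
# Toy tables (illustrative rows only)
customers = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
orders = [
    {"id": 10, "customer_id": 1, "ts": "2026-03-01"},
    {"id": 11, "customer_id": 1, "ts": "2026-03-15"},
    {"id": 12, "customer_id": 2, "ts": "2026-02-20"},
]

tables = {"customers": customers, "orders": orders}
# Schema metadata: (child_table, fk_column) -> parent_table
foreign_keys = {("orders", "customer_id"): "customers"}

# One typed node per row; node identity is (table, primary key)
nodes = {(t, row["id"]) for t, rows in tables.items() for row in rows}

# One typed edge per foreign-key reference, child row -> parent row
edges = [
    ((child, row["id"]), (parent, row[fk]))
    for (child, fk), parent in foreign_keys.items()
    for row in tables[child]
]

print(len(nodes), "nodes,", len(edges), "edges")
```

In the real pipeline, the model's relational graph transformer then runs over this heterogeneous temporal graph to produce per-entity predictions.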

When should I use zero-shot vs fine-tuning?

Use zero-shot when you need immediate predictions with no labeled data (new database, new task, no time to label). Use fine-tuning when you have some labels and need maximum accuracy. Zero-shot gives you a strong starting point (76.71 AUROC). Fine-tuning improves it further (81.14 AUROC) but requires labels and training time.

Is zero-shot prediction reliable for production?

Zero-shot predictions are reliable enough for ranking, prioritization, and exploration. KumoRFM's 76.71 AUROC outperforms task-specific LightGBM models (62.44). For high-stakes decisions (approving/denying a transaction), fine-tuning is recommended to maximize accuracy. Use zero-shot for initial deployment, then fine-tune as labels accumulate.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.