RelBench: A Benchmark for Deep Learning on Relational Databases
February 5, 2025

Matthias Fey
Relational databases are the backbone of structured data in countless industries, from finance to healthcare to e-commerce. They store information across multiple interconnected tables linked by primary-foreign key relationships. Yet when it comes to applying machine learning (ML) to this data, we’ve long relied on a cumbersome, manual process: joining tables, engineering features by hand, and training traditional models on the resulting flattened tables. This approach is time-intensive, error-prone, and often leaves valuable relational signals unused.
To address these challenges, our latest research introduces RelBench, a public benchmark designed to evaluate the performance of machine learning methods — particularly Relational Deep Learning (RDL) — on predictive tasks over relational databases. RelBench offers a fresh lens for tackling relational data, combining graph-based learning techniques with powerful tabular models to unlock the full potential of primary-foreign key links.
What is RelBench?
RelBench is the first public benchmark to systematically study predictive tasks over relational databases using Graph Neural Networks (GNNs). It provides datasets and tasks across a variety of domains and scales, from small relational structures to large, complex schemas. Our goal was to create a foundational infrastructure that researchers and practitioners can use to evaluate, compare, and improve ML approaches on relational data.
Relational Deep Learning: A New Paradigm
At the heart of RelBench is a paradigm shift: instead of manually engineering features and relying solely on tabular models, Relational Deep Learning (RDL) integrates GNNs with deep tabular models. Here’s how it works:
- Graph Representation: Relational data is transformed into a graph, with each row in a table represented as a node and primary-foreign key links as edges.
- End-to-End Learning: GNNs process the graph to learn representations that capture relationships between entities, while tabular models extract initial representations directly from raw tables.
- Automation: Unlike traditional methods, RDL eliminates the need for manual joins or feature engineering, enabling the model to fully exploit the relational structure of the data.
This approach is fundamentally different from the dominant paradigm of manually flattening relational data into a single table for ML. By directly leveraging the relational structure, RDL captures predictive signals that would otherwise be lost in the flattening process.
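To make the graph construction concrete, here is a minimal sketch in plain Python. The table names and columns (`users`, `orders`) are invented for illustration, and RelBench itself builds heterogeneous graphs from real databases with far richer schemas, but the core transformation is the same: each row becomes a node keyed by (table, primary key), and each foreign-key reference becomes an edge.

```python
# Minimal sketch: turning two relational tables into a graph.
# Rows become nodes; primary-foreign key references become edges.
# Table names and columns are invented for illustration.

users = [
    {"user_id": 1, "name": "alice"},
    {"user_id": 2, "name": "bob"},
]
orders = [
    {"order_id": 10, "user_id": 1, "amount": 30.0},
    {"order_id": 11, "user_id": 1, "amount": 12.5},
    {"order_id": 12, "user_id": 2, "amount": 99.9},
]

def build_graph(users, orders):
    """Return (nodes, edges): one node per row, one edge per FK link."""
    nodes = {}   # (table, primary_key) -> raw row, kept as node features
    edges = []   # ((src_table, src_pk), (dst_table, dst_pk))
    for row in users:
        nodes[("users", row["user_id"])] = row
    for row in orders:
        nodes[("orders", row["order_id"])] = row
        # The foreign key orders.user_id -> users.user_id becomes an edge.
        edges.append((("orders", row["order_id"]), ("users", row["user_id"])))
    return nodes, edges

nodes, edges = build_graph(users, orders)
print(len(nodes), len(edges))  # → 5 3
```

In RDL, a GNN then passes messages along these edges while a deep tabular encoder embeds each row’s raw columns into the initial node representations; the sketch above only shows the structural transformation, not the learning step.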
Key Findings: RDL vs. Manual Feature Engineering
To thoroughly evaluate RDL, we conducted a detailed user study where an experienced data scientist manually engineered features for each predictive task. The results were striking:
- Better Performance: RDL consistently outperformed traditional tabular models with manually engineered features, leveraging the full richness of relational data.
- Significant Time Savings: RDL reduced the human effort required by more than an order of magnitude. Tasks that once took hours or days of manual feature engineering can now be completed in minutes with minimal intervention.
These findings underscore the transformative potential of deep learning for relational databases. RDL doesn’t just automate the process—it delivers superior results.
Why RelBench Matters
RelBench isn’t just a benchmark; it’s a call to action. By providing a standardized infrastructure for evaluating relational ML approaches, we aim to accelerate innovation in this space. Researchers can now build on a common foundation, testing new methods and comparing them against established baselines. Practitioners can explore cutting-edge techniques like RDL to improve their workflows and unlock new insights from their relational data.
What’s Next?
RelBench opens up exciting new possibilities for machine learning on relational databases. Moving forward, we envision several avenues for exploration:
- Scalability: Adapting RDL to handle even larger and more complex datasets.
- Interdisciplinary Applications: Applying RDL to domains like healthcare, where relational data and complex relationships are critical.
- Tool Development: Building user-friendly tools and frameworks to make RDL accessible to a broader audience.
This is just the beginning. Relational Deep Learning represents a significant leap forward, and RelBench provides the foundation for future research and development in this field.
Final Thoughts
Relational databases have long been an untapped goldmine for machine learning. With RelBench and RDL, we’re moving beyond the limitations of manual feature engineering and unlocking the full potential of relational data. As one of the co-authors of this work, I’m excited to see how researchers and practitioners leverage these tools to push the boundaries of what’s possible with machine learning on structured data.
For more details, read the full paper: "RelBench: Benchmarking Relational Deep Learning with Graph Neural Networks".