Build AI Models for Relational Data
February 6, 2025 · Vanja Josifovski
Introducing AI models for relational data through the updated Kumo platform. Now, you can eliminate tedious feature engineering, accelerate data science by up to 20x, and harness your most important business data — like transactions, customers, and inventory — for powerful AI workloads. At Kumo, our mission is simple: Make the most important data also the most useful.
The Relational-Data Gap in AI
It’s astounding: several years into the AI revolution, the breakthrough innovations in language models have left relational data—the backbone of every company’s operations—on the sidelines. While text-based applications have soared thanks to LLMs, predictive and generative AI built on structured business data have been held back by outdated methods. For too long, data scientists and ML engineers have spent countless hours on feature engineering, struggling to coax meaningful insights out of relational data.
Our team, with deep roots as engineering and science leaders at Pinterest, Airbnb, and Stanford, recognized this disparity and set out to close it, which eventually led to two key breakthroughs:
- Graph Transformers: Rather than training solely on text, Graph Transformers (or Graph Neural Networks) train on graphs, so you can harness the rich interconnections within your data. But there’s a catch: these models require your data to be in a graph format, and most companies don’t have that luxury out of the box. Learn more about graph transformers.
- Relational Deep Learning (RDL): We solved this challenge by developing RDL, a method that converts your relational data into a graph suitable for graph transformers. We made it available through an open-source library, PyG, which today is used by NVIDIA, Spotify, Adyen, Lyft, Airbus, and over 20,000 others. Learn more about RDL.
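To make the idea of turning tables into a graph concrete, here is a minimal, library-free sketch. It is not the RDL or PyG implementation: the toy tables, the `tables_to_graph` helper, and the column names are all hypothetical, and the sketch only assumes that each row becomes a node and each foreign-key reference becomes an edge.

```python
# Hypothetical toy tables; names and columns are made up for illustration.
customers = [
    {"CustomerID": 1, "Country": "UK"},
    {"CustomerID": 2, "Country": "FR"},
]
transactions = [
    {"TransactionID": 10, "CustomerID": 1, "Quantity": 3},
    {"TransactionID": 11, "CustomerID": 1, "Quantity": 1},
    {"TransactionID": 12, "CustomerID": 2, "Quantity": 5},
]


def tables_to_graph(tables, primary_keys, foreign_keys):
    """Turn rows into typed nodes and foreign-key references into edges.

    This is an illustrative assumption about the conversion, not Kumo's
    or PyG's actual code.
    """
    # One node per row, keyed by (table name, primary-key value).
    nodes = {}
    for table_name, rows in tables.items():
        pk = primary_keys[table_name]
        nodes[table_name] = {row[pk]: row for row in rows}

    # One edge per foreign-key reference that points at an existing row.
    edges = []
    for (src_table, fk_column), dst_table in foreign_keys.items():
        src_pk = primary_keys[src_table]
        for row in tables[src_table]:
            target = row[fk_column]
            if target in nodes[dst_table]:
                edges.append(((src_table, row[src_pk]), (dst_table, target)))
    return nodes, edges


nodes, edges = tables_to_graph(
    tables={"customer": customers, "transaction": transactions},
    primary_keys={"customer": "CustomerID", "transaction": "TransactionID"},
    foreign_keys={("transaction", "CustomerID"): "customer"},
)
```

In this toy case, every transaction row ends up linked to the customer it references, which is the kind of structure a graph model can then learn from without hand-built aggregate features.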
Our benchmark showed that this approach cuts the time required to develop a predictive model by 95% compared with an expert data scientist using traditional ML methods: the same work in one-twentieth of the time, a 20x productivity boost. Learn more about RelBench.
This was tremendous progress, but like most open-source solutions, it still came with a learning curve and the need to set up and maintain specialized infrastructure. There was one more thing left to build…
The Kumo platform: Build and run AI models for relational data
Kumo is the platform that empowers data scientists and engineers to build state-of-the-art AI models directly on their relational data — without any feature engineering. With Kumo, you simply:
- Connect Your Data: Link your data sources, select your tables, and define relationships effortlessly.
- Train Models: Pick your predictive task and let the platform handle the heavy lifting: advanced RDL techniques, specialized graph-based storage, and GPUs, all managed under the hood.
- Evaluate & Run: Instantly evaluate model quality, understand key contributors, and deploy predictions or embeddings at scale.
Today, the Kumo platform is already used by teams at companies like Reddit, DoorDash, iFood, Sainsbury’s, Chime, and many others to develop models for business-critical applications such as fraud detection, risk scoring, recommender systems, and entity resolution.
The impact they’re seeing is profound:
- Data scientists get to do what they’ve always dreamed of doing — shipping accurate and impactful models across the business — instead of spending months on extensive data prep and feature engineering. It also brings them to the frontier of AI, an exciting and rejuvenating place to be!
- Engineers can bring AI to the critical use cases and applications that rely on relational data — no longer constrained by language models or limited to conversational use cases.
- AI and engineering leaders get to show a greater and faster return on AI and ML investments, while reducing complexity and operational cost.
Today we’re releasing three major updates to make adoption of Kumo models even easier:
New UX: A Refreshed Developer Experience
Our brand-new SDK, refreshed UI, and comprehensive documentation bring the power of Kumo to every data scientist and developer, regardless of their background in graphs or traditional ML. With an intuitive, streamlined experience, you can now connect your data, train a model, and start generating predictions or embeddings in about the time it takes to brew coffee.
import kumoai as kumo

# Write a Predictive Query over your business data:
graph = kumo.Graph(...)
pquery = kumo.PredictiveQuery(
    graph=graph,
    query=(
        "PREDICT MAX(transaction.Quantity, 0, 30) "
        "FOR EACH customer.CustomerID "
        "ASSUMING SUM(transaction.UnitPrice, 0, 7, days) > 15"
    ),
)

# Create a `Trainer` from a Kumo-suggested model plan, and
# train a model:
trainer = kumo.Trainer(model_plan=pquery.suggest_model_plan())
training_job = trainer.fit(
    graph=graph,
    train_table=pquery.generate_training_table(),
)

# Predict using your trained model. `connector` is assumed to be a
# data connector created earlier (not shown here):
prediction_job = trainer.predict(
    graph=graph,
    prediction_table=pquery.generate_prediction_table(),
    output_types={'predictions', 'embeddings'},
    output_connector=connector,
    output_table_name='kumo_predictions',
    training_job_id=training_job.job_id,
)
This new UX removes the complexity — so you can focus on the insights rather than the infrastructure.
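If you are wondering what the `PREDICT MAX(transaction.Quantity, 0, 30) FOR EACH customer.CustomerID` part of the query above is asking for, the plain-Python sketch below may help. It is only an illustration under stated assumptions: it treats `(0, 30)` as a window of 30 days after an anchor date, picks one boundary convention (start excluded, end included), and ignores the `ASSUMING` clause entirely. The toy data and the `max_quantity_labels` helper are hypothetical, and Kumo's documentation is the authority on the actual semantics.

```python
from datetime import date, timedelta

# Hypothetical toy transactions: (CustomerID, transaction date, Quantity).
transactions = [
    (1, date(2025, 1, 5), 3),
    (1, date(2025, 1, 20), 7),
    (1, date(2025, 3, 1), 9),   # falls outside the 30-day window below
    (2, date(2025, 1, 10), 2),
]


def max_quantity_labels(rows, anchor, start_days=0, end_days=30):
    """Per customer, the largest Quantity seen in the target window.

    Assumed window: anchor + start_days < date <= anchor + end_days.
    The boundary convention and the day unit are assumptions.
    """
    window_start = anchor + timedelta(days=start_days)
    window_end = anchor + timedelta(days=end_days)
    labels = {}
    for customer_id, day, quantity in rows:
        if window_start < day <= window_end:
            previous = labels.get(customer_id, quantity)
            labels[customer_id] = max(previous, quantity)
    return labels


labels = max_quantity_labels(transactions, anchor=date(2025, 1, 1))
```

With an anchor of January 1, 2025, customer 1's label comes from the two January transactions only, since the March purchase lies beyond the window. Customers with no transactions in the window simply get no entry in this sketch.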
Explainability: Trust & Understand Your Models
We understand that deploying AI models isn’t just about speed — it’s also about trust. Our enhanced explainability features let you see exactly what your model is learning. Quickly identify the key contributors in your data, and explain model behavior confidently to your team or management. Now, your predictive models are not a black box but a transparent engine powering your business decisions.
Imagine reviewing a model’s decision process with clear, actionable insights—empowering you to fine-tune and iterate faster than ever.
Free Trial: Experiment with Kumo Today
We’re launching a free trial, available both on our Cloud platform and as a Snowflake Native App (SPCS). Start experimenting with no commitments. Snowflake customers can even work directly within their own secure environment. Join the waitlist.
Get Started
Kumo is available today via multiple deployment options:
- SaaS (Fully Managed)
- Snowflake Native App
- Databricks Lakehouse App
Contact our sales and solutions team to set up a demo or workshop tailored to your use case, or dive straight in with a trial and our example notebooks.
What’s Next
We’re only at the beginning of this journey. As we continue to refine and expand Kumo, expect even more innovations that make it faster and easier to turn your relational data into impactful AI applications. We’ll keep publishing cutting-edge research, hosting workshops, and sharing best practices to help you drive business value.
And if you’re passionate about transforming how companies leverage their data, we’re hiring scientists and engineers — apply today and join us on the forefront of the AI revolution.
At Kumo, we believe your data needs no explanation. Our breakthrough platform puts the power of AI into the hands of every data scientist and developer, ensuring that your most important data becomes your most useful asset. Welcome to a future where AI models truly connect the dots.