Databricks - Kumo

About Kumo AI and Databricks

Kumo is a predictive AI solution that accelerates the creation and improves the performance of predictive models built from datasets within Databricks Unity Catalog. Kumo combines graph neural networks (GNNs) with large language models (LLMs) to learn across all data in the warehouse. Kumo makes highly accurate predictions about segments, lifetime value, behaviors, and more that help drive business objectives.

Leveraging its internal GNN models, Kumo learns across multiple Databricks tables in a customer’s Unity Catalog and builds accurate predictions. The GNN models can be further improved by domain-expert models, such as encoder-based LLMs. With Kumo + Databricks, customers can use a diverse collection of LLMs (e.g., DBRX, Hugging Face, and OpenAI, to name a few) through Databricks Model Serving to bring general world knowledge to their relational data via Kumo’s GNN-with-LLM model.

Kumo’s predictive models can be further refined by domain experts for maximum performance. Within hours, Kumo generates batch predictions or embeddings for downstream use.

Up to 30% more accurate

The out-of-the-box models created in hours by Kumo using graph neural networks and LLMs are up to 30% more accurate than baselines.

Compatibility

Compatibility with a diverse collection of Databricks-supported LLM frameworks and libraries offers flexibility in choosing the right tools for specific tasks.

Data privacy

No data is stored on disk in a Kumo-owned environment. Data leaving Databricks is transformed, encoded, and deleted after use.

Predictive AI workflow with Kumo + Databricks

Kumo operates directly on raw Databricks tables, uses Unity Catalog for data governance and security, generates predictions at scale, and writes them back to your Databricks warehouse.
The data warehouse native deployment for Databricks (in private preview) ensures that no data is stored on disk in a Kumo-owned environment and that any data leaving Databricks is transformed, encoded, and deleted after use.
Build predictive AI reliably and securely, without your data ever being stored outside your Databricks warehouse.

Seamless integration

Build your graph once by connecting your Databricks Unity Catalog tables, then use it to generate numerous predictions for various use cases. Kumo’s automated ML pipelines keep models up to date, so they always learn from the latest data.

Scalable

Kumo is built for scale and operates on terabyte-sized tables in Databricks Unity Catalog. There’s no need to sample or reduce data when training, ensuring comprehensive and accurate predictions.

Accurate

Kumo’s AI learns from the latest data in Databricks Unity Catalog at your chosen interval, ensuring that predictions are always as accurate as possible.

Versatile deployment

Kumo serves real-time model inference or batch predictions, which can be written back to Databricks Unity Catalog or into a key-value store for real-time serving.

Benefits of Kumo + Databricks

Deliver More Predictions

Build your graph once by connecting your Databricks Unity Catalog tables, then use it to generate numerous predictions for various use cases. Kumo’s automated ML pipelines keep models up to date, so they always learn from the latest data.

Improve Model Performance

Leveraging Databricks Model Serving and its support for LLMs, Kumo combines GNNs with LLMs to learn from both structured and unstructured data in the warehouse, producing highly accurate predictions. Kumo’s trained embeddings can also be fed into existing models to improve their accuracy, allowing data scientists to add value at each step.

Secure Data Processing

Ensure rapid model deployment and processing while keeping your data secure within your Databricks Unity Catalog. Advanced encoding measures ensure your data and models are protected throughout the ML lifecycle.

Deep learning and explainable AI

Relational data stored in Databricks Unity Catalog tables holds value that is hard to unlock using traditional methods. To overcome this challenge, Kumo represents your tables as a graph structure and applies graph machine learning directly on the raw data tables.
Kumo’s GNN leverages the natural relational structure of Databricks tables, enhanced with additional context from unstructured data through LLMs supported by Databricks. This integration maximizes signal and significantly improves the accuracy and performance of predictive AI models for advanced machine learning solutions.
Kumo provides detailed model evaluation, model explainability, MLOps, and other features to build trust and monitor models in production. Explanations that trace back to the raw data help you understand how various columns affect every prediction, how inputs contribute to a specific prediction, and which relevant trends are hidden in the raw input data.
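
To make the idea of representing tables as a graph more concrete, here is a minimal Python sketch. It is not Kumo’s implementation or API: the toy `customers` and `orders` tables, the `build_graph` function, and the `neighbor_mean` helper are all hypothetical names chosen for illustration. The sketch assumes each row becomes a node and each foreign-key reference becomes an edge, and it uses a single averaging step as a stand-in for the kind of neighbor aggregation a GNN layer performs.

```python
from collections import defaultdict

# Hypothetical toy tables standing in for Unity Catalog tables.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Lin"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]


def build_graph(tables, primary_keys, foreign_keys):
    """Turn rows into nodes and foreign-key references into undirected edges."""
    nodes = {}
    edges = defaultdict(list)

    # One node per row, identified by (table name, primary key value).
    for table, rows in tables.items():
        pk = primary_keys[table]
        for row in rows:
            nodes[(table, row[pk])] = row

    # One edge per foreign-key reference that points at an existing row.
    for (src_table, fk_col), dst_table in foreign_keys.items():
        pk = primary_keys[src_table]
        for row in tables[src_table]:
            src = (src_table, row[pk])
            dst = (dst_table, row[fk_col])
            if dst in nodes:
                edges[src].append(dst)
                edges[dst].append(src)

    return nodes, dict(edges)


def neighbor_mean(nodes, edges, node, column):
    """Average a numeric column over a node's neighbors (0.0 if none have it)."""
    values = [nodes[n][column] for n in edges.get(node, []) if column in nodes[n]]
    return sum(values) / len(values) if values else 0.0


nodes, edges = build_graph(
    tables={"customers": customers, "orders": orders},
    primary_keys={"customers": "customer_id", "orders": "order_id"},
    foreign_keys={("orders", "customer_id"): "customers"},
)

# Average order amount reachable from customer 1 through the graph.
avg_amount = neighbor_mean(nodes, edges, ("customers", 1), "amount")
```

In this sketch the customer node never stores an order amount itself; the value is derived purely from its linked order rows, which is the sense in which the relational structure carries signal without manual feature engineering.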

Kumo + Databricks Architecture

Build, Deploy, and Secure Predictive AI Models with Databricks
