Product Updates

v2.10 (06/26/2025)

Version 2.10 delivers over 50 updates across data ingestion, prediction accuracy, system stability, and UI polish.

Functionality

Improved Static LP Configuration Support
Advanced users can now define static link prediction modes via ModelPlan, enabling better control over leakage and modeling strategies.

Optimization

S3 to BigQuery Export Optimization
Batch prediction results are now streamed from S3 to BigQuery to improve performance and prevent memory-related job failures.

Enhancement

#ingestion & connectors

Databricks Connector Guardrails
The system now prevents creation of Databricks connectors if backend support isn’t enabled, ensuring early validation and fewer runtime surprises.
Connector Credential Retention on Update
Updating a Databricks connector now preserves existing credentials, fixing a bug that previously caused silent prediction failures.
Table Name Clean-Up and Collision Prevention
Table names with repeated underscores are automatically normalized, and manual entry with multiple underscores is now blocked.
Graph Link Suggestions After Table Edit
Editing a table node within a graph now immediately triggers updated graph link suggestions, reducing friction in graph setup workflows.
Clear Feedback for Table Graph Errors
Improved graph editing behaviors now include cleaner warnings, better name overflow handling, and consistent layout.
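
The table name clean-up described above can be pictured as a single regular-expression pass. This is a minimal sketch and not Kumo's implementation; the helper names `normalize_table_name` and `is_valid_manual_name` are hypothetical and chosen only for illustration.

```python
import re


def normalize_table_name(name: str) -> str:
    """Collapse every run of two or more underscores into one (hypothetical helper)."""
    return re.sub(r"_{2,}", "_", name)


def is_valid_manual_name(name: str) -> bool:
    """Reject manually entered names that contain repeated underscores."""
    return "__" not in name


print(normalize_table_name("user__orders___2025"))  # user_orders_2025
print(is_valid_manual_name("user__orders"))         # False
```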

#training & evaluation

Trainer Stability for Large Embedding Tables
Trainer now exits gracefully when encountering large right-hand-side (RHS) tables that would previously cause CUDA out-of-memory crashes.
PQ Validation Robustness with Multi-word Keywords
Autocomplete logic in the predictive query editor has been upgraded to correctly suggest multi-word keywords like FOR EACH, avoiding syntax corruption.
Support for Multi-Class Targets with Empty Strings
Empty-string rows are now excluded from multi-class classification tasks, resolving CUDA indexing errors and ensuring class label consistency.
Improved Error Messaging for Static Node Prediction
Predictive query errors now provide helpful suggestions (e.g., use LAST() or aggregate values) when multiple matches are detected per entity.
Better Split Statistics in TimeSplit
The split stats logic now aligns with offset-based boundaries and non-strict modes, producing accurate distribution metrics across partitions.
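
To illustrate the empty-string handling for multi-class targets noted above, the sketch below drops empty-string rows before assigning class ids, so every remaining row maps to a valid, contiguous index. It is an assumption-laden illustration in plain Python; `build_class_index` is a hypothetical name, not a Kumo API.

```python
def build_class_index(labels):
    """Drop empty-string labels, then map the remaining classes to contiguous ids."""
    kept = [label for label in labels if label != ""]
    classes = sorted(set(kept))
    class_to_id = {cls: idx for idx, cls in enumerate(classes)}
    return [class_to_id[label] for label in kept], class_to_id


ids, mapping = build_class_index(["gold", "", "silver", "gold", ""])
print(ids)      # [0, 1, 0]
print(mapping)  # {'gold': 0, 'silver': 1}
```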

#predictivequery

Validation Skipped for Legacy Trained Queries
Trained models with deprecated or renamed connectors can now be reviewed in the UI without triggering connector re-validation.

#batch prediction

Prediction Anchor Time Consistency
The Prediction Overview now accurately displays the backend-provided anchor time value, removing timezone-related confusion during batch prediction reviews.

Improved Prediction on Non-Empty Columns
Users now receive actionable guidance when a prediction job fails because the prediction column is already populated.

v2.9 (06/12/2025)

Version 2.9 delivers over 40 fixes and improvements across ingestion, training, evaluation, and user experience.

Functionality

PQuery End Time Support: Predictive queries now support end_time_col. For instance, users can write a query with “WHERE USER_LIST.WITHDRAW_DATE > 2011-10-25”, where WITHDRAW_DATE is the end_time_col.

Optimization

Predictive Query Validation Latency Reduction: The internal PQ validator now runs significantly faster under concurrent load, with latency reduced by up to 25% across 20-100 parallel jobs.
Connector API Call Reductions: Eliminated redundant connector API calls during Prediction creation and Graph View loads to improve page responsiveness.

Enhancement

#ingestion & connectors

Databricks Connector Creation Guardrails: UI now blocks creation of multiple Databricks connectors to avoid backend stability issues.
S3 Path Copy Button: Users can now easily copy S3 source paths directly from connector detail panels.

#training & evaluation

Primary Key Auto-Suggestion: The system suggests likely primary keys based on column heuristics when users omit this field during table creation.
Adaptive Sampling Neighbor Validation: Improved validation logic estimates safe node counts based on NumNeighbors and sampling config to prevent GPU OOM failures during training. Users receive clearer warnings when configurations exceed safe thresholds.
Negative Weight Validation: Training jobs are blocked if weight columns contain negative values, preventing invalid model configurations.
Improved Metrics Table Display: The evaluation metrics table now uses the new UI Catalog Datatable for improved readability, displays baseline model scores for easier comparison, and clearly shows macro-averaged metrics like F1.
Prediction Parallel Worker Limit Removal: UI no longer enforces hardcoded parallel worker limits; backend config fully controls allowable values.
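
The neighbor validation above estimates node counts from NumNeighbors. A common upper bound for fixed fan-out sampling multiplies the fan-outs hop by hop. The sketch below shows that bound only; it assumes a homogeneous graph with one fan-out per hop, and `max_sampled_nodes`, `exceeds_budget`, and `node_budget` are hypothetical names rather than Kumo's actual validation logic.

```python
def max_sampled_nodes(num_seeds: int, num_neighbors: list) -> int:
    """Upper bound on nodes touched when each hop samples a fixed fan-out per node."""
    total = num_seeds
    frontier = num_seeds
    for fanout in num_neighbors:
        frontier *= fanout   # worst case: every frontier node yields `fanout` new nodes
        total += frontier
    return total


def exceeds_budget(num_seeds: int, num_neighbors: list, node_budget: int) -> bool:
    """Return True when the worst-case node count is above a (hypothetical) safe budget."""
    return max_sampled_nodes(num_seeds, num_neighbors) > node_budget


print(max_sampled_nodes(512, [16, 8]))        # 74240
print(exceeds_budget(512, [16, 8], 50_000))   # True
```

The bound is pessimistic because real graphs often have fewer neighbors than the fan-out and sampled nodes overlap, which is why it suits a warning rather than a hard limit.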

#predictivequery

Improved Error Readability: PQ validator now displays human-readable operators (e.g., >) instead of internal enum names (e.g., RelOp.GT).
PQ Warning UX: PQ validation warnings are now surfaced even when queries are otherwise valid, ensuring users receive early guidance without blocking execution.
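
The error-readability change above amounts to rendering an operator's symbol rather than its enum name. A minimal sketch follows; only `RelOp.GT` appears in these notes, so the other enum members and the `format_condition` helper are assumptions made for illustration.

```python
from enum import Enum


class RelOp(Enum):
    # Only GT is named in the release notes; the other members are hypothetical.
    GT = ">"
    GE = ">="
    LT = "<"
    LE = "<="


def format_condition(column: str, op: RelOp, value) -> str:
    """Render a condition with the operator symbol instead of the enum name."""
    return f"{column} {op.value} {value}"


print(str(RelOp.GT))  # RelOp.GT
print(format_condition("USER_LIST.WITHDRAW_DATE", RelOp.GT, "2011-10-25"))
# USER_LIST.WITHDRAW_DATE > 2011-10-25
```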

#UI/UX polish

List Tooltip Behavior: Tooltips only show on truncated list items, eliminating unnecessary hover pop-ups.
Graph View Auto-Layout Fixes: Dragged table positions in Graph View no longer auto-reset after idle refreshes.

v2.8 (05/30/2025)

Version 2.8 delivers more functionality, optimization, and overall enhancements.

Functionality

COUNT(fkey) Allowed in PQuery: COUNT operations on foreign keys are now permitted in grouping clauses.

Optimization

Faster Graph Snapshot: Batched metadata retrieval significantly reduced /snapshots/graph/id latency.
Databricks Row Count Optimization: Now uses warehouse queries instead of downloading files, improving performance.

Enhancement

#ingestion

Local File Upload Enhancements: Improved progress, formatting, preview, and connector auto-refresh after uploads.
Connector Quotas & Limits UI: Displays connector-specific quotas and file upload rules in the UI.
Duplicate Table Flow Fix: Clears validation errors after duplicating a table.
Create Time Column Message: Clarified error messaging for invalid timestamp types in new tables.
S3 Region Validation: Displays an error when S3 buckets are in a different region than the instance.
Bad Timestamp Handling in Spark: Invalid timestamps are now mapped to null during ingestion.
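
The bad-timestamp behavior above (invalid values become null instead of failing the job) can be sketched in plain Python. This is not the Spark code path the note refers to; the `parse_timestamp_or_none` helper and the timestamp format are assumptions chosen for illustration.

```python
from datetime import datetime
from typing import Optional


def parse_timestamp_or_none(raw, fmt: str = "%Y-%m-%d %H:%M:%S") -> Optional[datetime]:
    """Parse a timestamp string; return None (null) when the value is invalid."""
    try:
        return datetime.strptime(raw, fmt)
    except (ValueError, TypeError):
        # ValueError: malformed or out-of-range value; TypeError: non-string input.
        return None


rows = ["2025-05-30 12:00:00", "2025-13-45 99:99:99", "not a date", None]
parsed = [parse_timestamp_or_none(r) for r in rows]
print(parsed)
# [datetime.datetime(2025, 5, 30, 12, 0), None, None, None]
```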

#training

F1 Score Display: Added F1@0.5 to model summary and dynamic row in Threshold Metrics.
Block fkey Prediction in PQuery: Prevents selecting entity primary key as a prediction target.
Detailed Static Node Errors: Validation now explains missing FK path and incorrect targeting more clearly.
Weight Column Validation: Displays errors for invalid weight values.
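
For readers unfamiliar with the F1@0.5 metric mentioned above, the sketch below computes F1 after thresholding scores at 0.5. It uses the standard F1 definition; treating a score equal to the threshold as positive is an assumption, and `f1_at_threshold` is a hypothetical helper, not a Kumo function.

```python
def f1_at_threshold(scores, labels, threshold=0.5):
    """Binary F1 after converting scores to predictions at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


print(f1_at_threshold([0.9, 0.6, 0.4, 0.2], [1, 0, 1, 0]))  # 0.5
```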

#xai

XAI Frontend Updates: Enhanced subgraph labeling, aggregation display, and formatting for explainability views.
Conditional Tab Display: “Subgraph Summary” tab is now hidden when empty and deep links reroute appropriately.

#batchprediction

Empty DataFrame Handling in Parquet: Writes valid snappy.parquet files for empty DataFrames.

v2.7 (05/16/2025)

Version 2.7 delivers major improvements to prediction reliability, graph editing workflows, and UI consistency. It delivers over 70 fixes and enhancements across predictive modeling, training stability, connector flows, and error messaging.

Improvements

Graph creation blocked if no primary key: Graph builder now prevents creation when no primary or foreign key is detected, ensuring better data integrity during setup.
Snowflake access control respected: Users now see only the Snowflake tables they have read access to, reducing ingestion-related permission errors.
Automatic renaming of unnamed columns: Local uploads automatically rename empty or generic column headers (e.g., “Unnamed: 0”) to avoid ingestion failures.
Improved SDK reliability on Snowflake session expiry: The SDK now automatically retries generate_prediction_table() when a Snowflake session expires, preventing job crashes.
Connector errors converted into readable UI failures: Missing or renamed source files now surface clear error messages in the UI instead of returning generic server errors.
Model plan conflict warning for channels and aggregation: Users are now alerted when conflicting legacy and scoped configuration fields are defined in the same model plan.
Improved empty state experience across the UI: Legacy placeholder views were replaced with polished UI Catalog components across connectors, tables, models, and predictions.
Clearer error messaging for failed jobs: Job failure messages now specify the exact failed stage (e.g., Graph Snapshot, Prediction Table), making it easier to identify issues.
Standardized input components across product: Legacy form fields were updated to use consistent UI Catalog components such as TextInput, TextArea, and Label across multiple workflows.
Reduced snapshot polling frequency: Snapshot polling was reduced from every 2 seconds to every 30 seconds to improve system performance and reduce backend load.
Training job settings no longer duplicated: Training configuration details are now shown only once in the job details view, improving visual clarity.
Graph snapshot histogram charts restored: A prior patch was reverted to restore consistent rendering of histogram charts in the Graph Snapshot view.
Tag filtering restored in V1 UI: Tag filters in the prediction jobs list now reliably return matching results.
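
As an illustration of the automatic renaming of unnamed columns listed above, the sketch below replaces empty or “Unnamed: N” headers with positional names. The replacement scheme `column_<position>` and the helper name are assumptions; these notes do not state which names Kumo assigns.

```python
def rename_unnamed_columns(headers):
    """Replace empty or 'Unnamed: N' headers with positional placeholder names."""
    renamed = []
    for position, header in enumerate(headers):
        text = (header or "").strip()
        if text == "" or text.startswith("Unnamed:"):
            renamed.append(f"column_{position}")  # hypothetical naming scheme
        else:
            renamed.append(text)
    return renamed


print(rename_unnamed_columns(["Unnamed: 0", "user_id", "", "amount"]))
# ['column_0', 'user_id', 'column_2', 'amount']
```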

Deprecation

Multi-file upload rolled back: The “Add Table” button in Local Upload was removed to preserve a linear and streamlined onboarding experience; multi-upload is no longer supported in the connector panel.

v2.6 (05/09/2025)

Version 2.6 brings a rich mix of platform stability enhancements, error message clarity improvements, and UI polish across the Kumo experience. This release delivers over 50 improvements across graph building, model training, prediction, and table ingestion.

Improvements

TimeSplit training logic and timestamp consistency: TimeSplit models now correctly apply boundary logic, and selected timestamp units are preserved across the pipeline to ensure reliable splits, fingerprints, and cache behavior.
Improved timestamp handling and recognition: Users can now assign Timestamp types to string fields during table registration, and anchor times are displayed consistently in both UI and logs, eliminating manual fixes and formatting mismatches.
Clearer error messaging across workflows: Users now see precise messages for connector path issues, missing or renamed files, type mismatches in graph validation, and filtered training datasets with zero rows.
UI enhancements to graph creation and display: Placeholder graph names have been removed in favor of manual naming with validation, and graph creation now blocks if no primary/foreign key is detected to ensure data integrity.
Batch prediction improvements: Prediction jobs now display custom tags, clearly identify UI-triggered jobs, exclude training weights to avoid memory overload, and hide ROC curves in degenerate cases for clarity.
Consistent and polished input and list views: Form components and list pages have been updated for consistency in spacing, styling, and layout, and dropdowns were reorganized to surface high-priority actions like “Create New Graph.”
Stable handling of schema and data changes: The UI now gracefully handles schema edits such as column renaming or removal, and ingestion jobs properly report on deleted files or empty row counts with actionable feedback.
Parquet and CSV handling refinements: Header logic is now correctly scoped to file type, .parquet folder names no longer cause errors, and string-based column preview widths remain stable during data load.
S3 and connector performance improvements: S3 file discovery supports max_items to avoid full scans, and updated connector panel messaging provides better guidance during empty or misconfigured states.
Improved subgraph and explainability display: The XAI subgraph view now offers more complete and stable rendering of model behavior, improving interpretability.
Miscellaneous usability and editor updates: YAML editors now support tab indentation, type names are human-readable in error messages, and dropdown behaviors and job tagging are more consistent across the UI.

v2.5 (04/25/2025)

This release enhanced predictive query capabilities by upgrading Predictive Query Language from v1 to v2. It also delivered substantial workflow optimizations by integrating new temporal split functionality and enhancing job management.

Improvements

Integrated UI Catalog components into Kumo: Unified the app’s visual language by integrating Sidebar, ListOverview, status cards, and table search for a more consistent user experience.
Generate Prediction button added to key views: Enabled quick access to prediction workflows from the Prediction List and Training Job Overview pages.
Improved job orchestration for predictions: Enhanced logic for using parent sources, waiting on child jobs, and propagating job failure states correctly.
Reduced excessive graph snapshot refreshes: Prevented redundant refreshes to reduce load and improve responsiveness in graph workflows.
Async validation in graph workflows: Moved table.validate() into an async graph.validate() pipeline to improve performance and prevent blocking.
Timestamp unit handling standardized: Inferred and preserved timestamp units across ingestion, graph refresh, and execution to ensure consistency.
Split stats UI fixed and improved: Resolved broken displays in split table stats, restoring clarity and usability for time-based training workflows.
Enhanced XAI subgraph and score display: Fixed local score display and improved completeness of subgraph views for explainability workflows.
Batch prediction tagging and metadata exposed: Surfaced custom tags and UI-sourced job indicators to improve traceability in experiments.
Prediction query engine enhancements: Added AutoML pruning, adaptive sampling, RFM query flag, and simplified syntax (e.g., “IN” over “IS IN”).
Improved error messaging throughout the product: Delivered clearer messages for missing PKs, dtype mismatches, empty files, and normalized headers.
Robust handling of renamed/missing columns: Enabled stable behavior in workflows when table schemas change mid-pipeline.
Snowflake and S3 integration improvements: Enhanced error logging, credential handling, and S3 directory scanning for better reliability and performance.
Removed deprecated model training APIs: Finalized migration to kumo-ml, added Trainer(checkpoint_path) support, and dropped legacy task dependencies.
Updated Optuna and Databricks compatibility: Upgraded to Optuna 4.3.0 and updated Databricks table versioning to align with current backend infrastructure.

Deprecation

Deprecated Predictive Query Language v1

v2.4 (04/18/2025)

New Features

Updated UI for graph-level and entity-level explanations, including new features like top features and subgraph summaries

Improvements

Improved error messaging across multiple workflows and the PQL editor

v2.3 (03/28/2025)

New Features

Introduced graph transformer architecture in Model Plan
Added support for weighted training tables, allowing users to assign weights to datasets for more flexible and tailored training processes

Improvements

Enhanced UI, including fixes for admin dashboard and connector management

Bug Fixes

Fixed GPU out-of-memory errors by adjusting resource allocation during operations
Resolved issues with unnamed S3 connectors in the UI

v2.2 (03/14/2025)

New Features

Introduced start_time, end_time, and time_split for predictive query, enhancing temporal analysis capabilities.

Improvements

Significant UI updates, including enhanced search and error states
Auto-generated suggestions for table and graph names
Improved handling of extreme timestamps in training modules

Bug Fixes

Resolved connector page issues for SPCS and Databricks

v2.1 (02/28/2025)

New Features

Added support for behavioral recommendation metrics such as coverage, average popularity, diversity, and personalization.
Create graphs on the fly when creating New Models.
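
Two of the behavioral recommendation metrics named above, coverage and average popularity, can be sketched with their common textbook definitions. These notes do not give Kumo's exact formulas, so the definitions, function names, and sample data below are assumptions for illustration only.

```python
def coverage(recommendations, catalog):
    """Fraction of catalog items that appear in at least one user's recommendations."""
    recommended = {item for recs in recommendations.values() for item in recs}
    return len(recommended & set(catalog)) / len(catalog)


def average_popularity(recommendations, interaction_counts):
    """Mean historical interaction count over all recommended items."""
    items = [item for recs in recommendations.values() for item in recs]
    return sum(interaction_counts.get(item, 0) for item in items) / len(items)


recs = {"u1": ["a", "b"], "u2": ["a", "c"]}
catalog = ["a", "b", "c", "d"]
counts = {"a": 10, "b": 2, "c": 4, "d": 1}

print(coverage(recs, catalog))           # 0.75
print(average_popularity(recs, counts))  # 6.5
```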

Improvements

Improved stability of XAI

Bug Fixes

Fixed a crucial bug in which majority_sampling_ratio was not working as expected

v2.0 (02/14/2025)

We’re thrilled to announce a major update to the Kumo platform, featuring a fully redesigned interface. This update brings a more streamlined, modern look and feel to the platform, making it easier for ML engineers and data scientists to train and run models.

What’s New:

Redesigned Navigation & Layout. A cleaner layout and intuitive navigation bar help you find the right features faster—reducing clicks and saving you time when setting up data connections or reviewing model outputs.
Powerful new Python SDK. Designed for using Kumo in your favorite IDE or notebook, the SDK integrates seamlessly with the UI, enabling robust, flexible, and interoperable workflows between code and visual interactions. See the SDK Reference.
Enhanced Workflows. An intuitive experience that helps you select graphs and train models faster, with quicker iterations.

Improvements

Reduced the time between AutoML trials to a minimum, significantly speeding up execution, especially for workflows with many trials.
Improved encoding efficiency: Raw data is now encoded and hashed upon graph snapshotting, leading to improved security and faster execution.
Relative time is now computed for all timestamp columns, independent of whether they were assigned as a designated time column.
Kumo can now gracefully handle timestamps outside of the UNIX/int64 range.
Introduced job queuing for individual workflows such as training table generation, prediction table generation, etc.
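
To make the UNIX/int64 range mentioned above concrete: if timestamps are stored as nanoseconds since the UNIX epoch in a signed 64-bit integer, only dates from roughly 1677 to 2262 are representable. The sketch below checks that bound. The nanosecond resolution is an assumption (these notes do not state the unit), and `fits_int64_nanoseconds` is a hypothetical helper.

```python
from datetime import datetime, timezone

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def fits_int64_nanoseconds(ts: datetime) -> bool:
    """Check whether a timezone-aware timestamp fits in int64 nanoseconds since the epoch."""
    delta = ts - EPOCH
    # Integer arithmetic avoids floating-point rounding at the range boundaries.
    nanos = (delta.days * 86_400 + delta.seconds) * 1_000_000_000 + delta.microseconds * 1_000
    return INT64_MIN <= nanos <= INT64_MAX


print(fits_int64_nanoseconds(datetime(2025, 2, 14, tzinfo=timezone.utc)))  # True
print(fits_int64_nanoseconds(datetime(2300, 1, 1, tzinfo=timezone.utc)))   # False
print(fits_int64_nanoseconds(datetime(1600, 1, 1, tzinfo=timezone.utc)))   # False
```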

Bug Fixes

Fixed bugs in the concurrent table ingestion workflow.

v1.47 (01/31/2025)

New Features

Added support for explainability in batch predictions.

Improvements

Improved stability of Explanations of models
More robust encoder logic for highly skewed numerical distributions
Fixed artifact export to Snowflake and DB.
Improved memory efficiency for global baselines.
Improved health check for concurrent jobs.

v1.46 (01/17/2025)

Features

Forecasting: added year-over-year and handle_new_entities option labels to support learning seasonal and holiday trends.
Improved syntax error messaging in Model Plan for better clarity.
Extended support for long-duration training jobs and batch predictions (2 to 20 days).

Bug Fixes

Resolved timestamp data type casting issues. Users no longer need to specify ts_format or unit for affected datasets.
Corrected estimated prediction times for large output sizes to improve accuracy.

v1.45 (01/06/2025)

Features

Kumo can now be run as a Native app on Snowflake Azure regions.

Bug Fixes

Resolved bar graph display issues within the Subgraph table.

Breaking Change

Reduced download limit of holdout dataset to 1M entities.
