2023 Product Updates

v1.26 (2023-12-18)

Summary

The pQuery syntax has been updated to make it easier to understand and more flexible in how filters can be applied.
Various minor fixes and UI improvements.

Summary

A new model planner, available during pQuery training, allows fine-grained control over encoders, training strategy, and the AutoML search space.
Additional configuration options are available in the model planner (previously called advanced options).

Summary

XAI: various minor fixes and UI improvements.
XAI: metrics now available for multiclass and multilabel classification tasks.
For node prediction tasks, test data splits can now be downloaded from the Review Evaluation Metrics page.
When selecting source tables, a new raw table option is available for connecting tables that don’t conform to either fact or dimension table types.
Kumo views let you run traditional SQL queries that materialize a view in the Kumo data plane.

Summary

Batch predictions now include output statistics computed from a sample of table data.
Various minor fixes and UI improvements.

Summary

XAI: cohort analysis for time columns is now more interpretable.
XAI: cohort analysis now works for tables that are two hops away from the prediction entity table.
A new refit feature enables automatic model refitting on the entire dataset.
Descriptions can now be added and updated for any object in the Kumo platform.
During new pQuery creation, already materialized graphs from prior pQuery creation jobs are now automatically reused.
A new connector is available for connecting to Google Cloud BigQuery.
For multilabel classification pQueries (e.g. using the LIST_DISTINCT() operator on a maximum of 1,000 classes), evaluation metrics now include class-specific metrics.
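As an illustration of what class-specific metrics mean for a multilabel task, the following is a minimal Python sketch. It is not Kumo's implementation: the function name `per_class_metrics` and the input shape (one set of labels per entity) are assumptions made for this example, and the metrics shown (precision, recall, F1) are common choices rather than a confirmed list of what the platform reports.

```python
from collections import defaultdict


def per_class_metrics(actual, predicted):
    """Compute precision, recall, and F1 for each class separately.

    `actual` and `predicted` are equal-length lists of label sets,
    one set per entity (hypothetical input shape for this sketch).
    """
    tp = defaultdict(int)  # class predicted and actually present
    fp = defaultdict(int)  # class predicted but not present
    fn = defaultdict(int)  # class present but not predicted

    for true_labels, pred_labels in zip(actual, predicted):
        for c in true_labels & pred_labels:
            tp[c] += 1
        for c in pred_labels - true_labels:
            fp[c] += 1
        for c in true_labels - pred_labels:
            fn[c] += 1

    metrics = {}
    for c in sorted(set(tp) | set(fp) | set(fn)):
        predicted_count = tp[c] + fp[c]
        actual_count = tp[c] + fn[c]
        precision = tp[c] / predicted_count if predicted_count else 0.0
        recall = tp[c] / actual_count if actual_count else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```

Reporting metrics per class rather than a single aggregate makes it easier to spot classes that the model rarely predicts correctly, which an averaged score can hide.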

Summary

XAI: in Column Analysis, actual versus predicted values are now displayed per column.
A new table column type called Embedding enables the use of embeddings as an input column.
For regression pQueries predicting a numeric output (using COUNT, SUM, etc. operators), evaluation results now include scatter plot charts that display actual versus predicted values.
During pQuery training, charts and tables are now provided to show how the training example target labels used to train the pQuery vary over time and across training/validation/holdout data splits.

Summary

A “Distribution of Predictions” chart now visualizes the predicted values alongside the actual target labels for all entities in a regression task (e.g., predictive queries with the COUNT() or SUM() operator).
A new boolean advanced option handles prediction of unseen target entities at batch prediction time for link prediction tasks.
Custom Kumo Views can now be created using SQL queries on top of tables already connected to the platform.
Up to 10 asynchronous jobs (training or batch prediction) can now be started; they are queued and run sequentially as older jobs complete.
More than one job can now be executed concurrently.
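The queueing behavior described above can be pictured with a small Python sketch. This is a toy model, not the platform's scheduler: the class `JobQueue`, its method names, and the reading that the limit of 10 counts running plus waiting jobs are all assumptions made for illustration.

```python
from collections import deque


class JobQueue:
    """Toy model of a bounded first-in, first-out job queue (hypothetical)."""

    def __init__(self, max_jobs=10, max_concurrent=1):
        # Assumption: max_jobs counts running and waiting jobs together.
        self.max_jobs = max_jobs
        self.max_concurrent = max_concurrent
        self.pending = deque()  # jobs waiting for a free slot, oldest first
        self.running = []       # jobs currently executing

    def submit(self, job_id):
        """Start the job if a slot is free, otherwise queue it."""
        if len(self.running) + len(self.pending) >= self.max_jobs:
            raise RuntimeError("job limit reached")
        if len(self.running) < self.max_concurrent:
            self.running.append(job_id)
            return "running"
        self.pending.append(job_id)
        return "queued"

    def complete(self, job_id):
        """Mark a job as finished and start the oldest waiting job, if any."""
        self.running.remove(job_id)
        if self.pending:
            next_job = self.pending.popleft()
            self.running.append(next_job)
            return next_job
        return None
```

With `max_concurrent=1` the jobs run strictly one after another; raising it models concurrent execution of more than one job while keeping the same overall limit.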

Summary

When ingesting new tables, a plot now shows the distribution of values in timestamp columns for validation.
S3 CSV data sources are now supported as connectors.
Batch predictions for classification tasks can now be calibrated using Platt scaling.
Batch prediction jobs on large datasets can now be parallelized across multiple workers (up to 4).
XAI: explanations of how the underlying data contributes to the final predictions:
- Contribution score of individual tables and the columns within them.
- Cohort analysis for the range of values of each column and for the range of the number of historic facts available in tables.
Miscellaneous minor fixes to UX flows, bugs, and predictive accuracy.
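For readers unfamiliar with Platt scaling, mentioned in the batch prediction item above: it maps raw classifier scores to calibrated probabilities by fitting a sigmoid, p = 1 / (1 + exp(-(a * score + b))), on held-out scores and labels. The Python sketch below is a simplified illustration and not the platform's implementation. It fits `a` and `b` by plain gradient descent on log loss, whereas Platt's original method uses smoothed targets and a more robust optimizer; the function names and default hyperparameters are assumptions.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit sigmoid parameters (a, b) on raw scores and 0/1 labels.

    Simplified sketch: batch gradient descent on mean log loss.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = _sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw score to a calibrated probability."""
    return _sigmoid(a * score + b)
```

A typical use is to fit `a` and `b` once on a validation split and then apply `calibrate` to every score produced at batch prediction time, so that a predicted probability of 0.8 corresponds more closely to an observed positive rate of about 80%.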