Batch Predictions
Overview
Once you are satisfied with the performance of your predictive query on historical data, you can generate batch predictions. To do so, navigate to New > Prediction and select a model to run.
Creating a New Batch Prediction
To run batch predictions:
- Navigate to New > Prediction.
- Select a trained predictive model.
- (Optional) Adjust predictive query filters to apply target entity filtering.
- Configure batch prediction settings (anchor time, output destination, etc.).
- Submit the batch prediction job.
Applying Filters at Prediction Time
After training, you may want predictions for a specific subset of entities. Kumo allows you to:
- Filter target entities by refining the dataset used for batch predictions.
- Set a prediction anchor time to specify a custom evaluation window.
Applying filters at batch prediction time helps:
- Improve efficiency by reducing the amount of data processed.
- Streamline output by limiting predictions to relevant business logic.
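As a rough illustration of why filtering reduces the data processed, the sketch below narrows a small entity table to the subset that should receive predictions. It is not Kumo's API: the Customer class, its fields, and select_targets are hypothetical names used only to show the idea in plain Python.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical entity record; real entities live in your own tables.
    customer_id: int
    region: str
    active: bool


# Hypothetical entity table with four customers.
customers = [
    Customer(1, "EU", True),
    Customer(2, "US", True),
    Customer(3, "EU", False),
    Customer(4, "EU", True),
]


def select_targets(entities, region):
    """Keep only active entities in the given region."""
    return [e for e in entities if e.active and e.region == region]


targets = select_targets(customers, "EU")
target_ids = [e.customer_id for e in targets]
print(target_ids)  # [1, 4]
```

Only two of the four entities remain, so a prediction job restricted to this subset would score half as many rows and return only the rows the business logic cares about.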
Configuring Batch Prediction Settings
Prediction Anchor Time
Set an optional prediction anchor time in ISO 8601 format (e.g., 2024-02-27). If left blank, Kumo defaults to the latest timestamp in the fact table.
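The anchor time rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not Kumo's implementation: resolve_anchor_time and fact_timestamps are hypothetical names, and the fallback simply takes the maximum of a list of timestamps standing in for the fact table.

```python
from datetime import datetime


def resolve_anchor_time(anchor, fact_timestamps):
    """Parse an ISO 8601 anchor time, or fall back to the latest fact timestamp.

    A blank or missing anchor mirrors the documented default; a malformed
    string raises ValueError from datetime.fromisoformat.
    """
    if anchor is None or anchor.strip() == "":
        return max(fact_timestamps)
    return datetime.fromisoformat(anchor.strip())


# Hypothetical timestamps standing in for the fact table's time column.
fact_timestamps = [
    datetime(2024, 2, 25, 8, 30),
    datetime(2024, 2, 26, 17, 0),
]

explicit = resolve_anchor_time("2024-02-27", fact_timestamps)
default = resolve_anchor_time("", fact_timestamps)
print(explicit.isoformat())  # 2024-02-27T00:00:00
print(default.isoformat())   # 2024-02-26T17:00:00
```

Checking the string locally before submitting a job is a cheap way to catch formats such as 27/02/2024, which are not ISO 8601 and fail to parse here.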
Output Destination
Specify where predictions should be stored. Available destinations:
- AWS S3 (CSV, Parquet, or partitioned Parquet format)
- Snowflake (overwrites existing table rows)
- BigQuery (appends predictions to an existing table)
- Local Download (sample output up to 1GB)
Parallel Processing
Specify the number of parallel workers (up to 4) to speed up batch predictions for large datasets.
Output Type
Choose the type of output:
- Predictions: The predicted target values for the selected entities.
- Embeddings: Numerical vectors of entities capturing their behavioral patterns.
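The settings in this section can be collected and sanity-checked before a job is submitted. The sketch below is a hypothetical helper, not part of Kumo: BatchPredictionSettings, its field names, and the lowercase destination and output type labels are assumptions chosen for illustration. Only the limits it checks (up to 4 workers, ISO 8601 anchor time, the four destinations, the two output types) come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MAX_WORKERS = 4  # upper bound stated under Parallel Processing
DESTINATIONS = {"s3", "snowflake", "bigquery", "local"}  # assumed labels
OUTPUT_TYPES = {"predictions", "embeddings"}  # assumed labels


@dataclass
class BatchPredictionSettings:
    """Hypothetical container for the settings described in this section."""

    destination: str
    output_type: str = "predictions"
    num_workers: int = 1
    anchor_time: Optional[str] = None  # None means "use the default"

    def validate(self):
        """Return a list of human-readable problems; empty if all is well."""
        errors = []
        if self.destination not in DESTINATIONS:
            errors.append(f"unknown destination: {self.destination!r}")
        if self.output_type not in OUTPUT_TYPES:
            errors.append(f"unknown output type: {self.output_type!r}")
        if not 1 <= self.num_workers <= MAX_WORKERS:
            errors.append(f"num_workers must be between 1 and {MAX_WORKERS}")
        if self.anchor_time is not None:
            try:
                datetime.fromisoformat(self.anchor_time)
            except ValueError:
                errors.append(f"anchor_time is not ISO 8601: {self.anchor_time!r}")
        return errors


good = BatchPredictionSettings("s3", "embeddings", 4, "2024-02-27")
bad = BatchPredictionSettings("ftp", "predictions", 8, "02/27/2024")
print(good.validate())       # []
print(len(bad.validate()))   # 3
```

Returning a list of problems rather than raising on the first one lets a caller report every misconfigured setting at once.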
Running and Monitoring Batch Predictions
Once configured, click Start Predicting to launch the batch prediction job.
You will be redirected to the batch prediction job details page, where you can monitor progress and download output samples.