Predictive Query Language (PQL) is a declarative query language that lets you define predictive problems on relational data. PQL specifies:
  1. The target — what you want to predict (an aggregation or column value)
  2. The entity — which rows/IDs to predict for
  3. The horizon — the future time window to predict over (for temporal tasks)
For a full introduction to predictive queries, refer to the predictive query tutorial.
KumoRFM is currently in an experimental phase, and some PQL features are not yet fully supported.

Target x Entity x Horizon

The core framework for every KumoRFM prediction is Target x Entity x Horizon:
[Figure: Target x Entity x Horizon framework, showing how Target, Entity, and Horizon combine to define a prediction.]
  • Target: The value to predict — either an aggregation over related rows (e.g., COUNT(orders.*, 0, 30, days)) or a static column value (e.g., users.age).
  • Entity: The specific row(s) to predict for, identified by a table’s primary key (e.g., users.user_id=1).
  • Horizon: For temporal predictions, the future time window (e.g., 0, 30, days means “the next 30 days from now”).
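To make the three parts concrete, the sketch below computes by hand what a target such as COUNT(orders.*, 0, 30, days) or SUM(orders.price, 0, 30, days) would evaluate to for one entity on a toy order history. The data, the helper name window_rows, and the boundary convention (start exclusive, end inclusive) are assumptions made for illustration only; this is not how KumoRFM computes predictions internally.

```python
from datetime import datetime, timedelta

# Toy order history: (user_id, order_time, price). Purely illustrative data.
ORDERS = [
    (1, datetime(2024, 1, 5), 20.0),
    (1, datetime(2024, 1, 20), 35.0),
    (1, datetime(2024, 3, 1), 10.0),
    (2, datetime(2024, 1, 10), 50.0),
]


def window_rows(orders, user_id, anchor, start, end):
    """Return the orders of `user_id` that fall in (anchor+start, anchor+end] days.

    The boundary convention is an assumption of this sketch.
    """
    lo = anchor + timedelta(days=start)
    hi = anchor + timedelta(days=end)
    return [o for o in orders if o[0] == user_id and lo < o[1] <= hi]


anchor = datetime(2024, 1, 1)          # "now" for the purpose of the example
rows = window_rows(ORDERS, user_id=1, anchor=anchor, start=0, end=30)

count_target = len(rows)               # analogue of COUNT(orders.*, 0, 30, days)
sum_target = sum(r[2] for r in rows)   # analogue of SUM(orders.price, 0, 30, days)
print(count_target, sum_target)
```

Here the entity is user 1, the target is the count or sum over the matching orders, and the horizon is the 30-day window after the anchor time.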

PQL Structure

The general PQL structure is:
PREDICT <target_expression> FOR <entity_specification> WHERE <optional_filters>
Component                     Purpose
PREDICT <target_expression>   Declares the value or aggregate the model should predict
FOR <entity_specification>    Specifies the single ID or list of IDs to predict for
WHERE <filters> (optional)    Filters which historical rows are used as context
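As a rough illustration of how the three clauses fit together, the following sketch assembles a PQL string from its parts. The helper build_pql is hypothetical and not part of any Kumo SDK; it only concatenates strings and does not validate the query.

```python
def build_pql(target, entity, where=None):
    """Assemble a PQL string from its clauses.

    Hypothetical helper for illustration; performs no validation.
    """
    query = f"PREDICT {target} FOR {entity}"
    if where:
        # The WHERE clause is optional and is appended only when given.
        query += f" WHERE {where}"
    return query


plain = build_pql("SUM(orders.price, 0, 30, days)", "users.user_id=42")
filtered = build_pql(
    "COUNT(orders.*, 0, 30, days) > 0",
    "users.user_id=42",
    where="users.age > 21",
)
print(plain)
print(filtered)
```

Keeping the clauses as separate arguments makes it easy to reuse one target across several entities or filters.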

Five Steps to Write a PQL Query

  1. Choose your entity — pick a table and its primary key to predict for.
  2. Define the target — a raw column or an aggregation over a future window.
  3. Pin the entity list — pass a single ID or multiple IDs.
  4. (Optional) Refine the context — add filters to restrict which historical rows are used for feature generation.
  5. Run & fetch — call KumoRFM.predict() or KumoRFM.evaluate().

Entity Specification

Unlike the fine-tuning mode, KumoRFM makes predictions for a handful of selected entities at a time. Entities can be specified in three ways:
  • Single ID: users.user_id=1
  • Tuple of IDs: users.user_id IN (1, 2, 3)
  • Programmatic list via the indices parameter:
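# Assumption: the IDs passed via `indices` are the ones predicted for,
# taking precedence over the single ID written in the query string.
result = model.predict(
    "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1",
    indices=[1, 2, 3, 4, 5],
)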
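The first two forms can also be generated from a list of IDs. The sketch below uses a hypothetical helper, entity_spec, that emits the single-ID form for one ID and the IN form for several. It assumes integer IDs; string IDs would likely need quoting, which is not handled here.

```python
def entity_spec(key, ids):
    """Format an entity specification for a primary-key column.

    Hypothetical helper; assumes integer IDs and does no quoting.
    """
    ids = list(ids)
    if not ids:
        raise ValueError("at least one entity ID is required")
    if len(ids) == 1:
        # Single ID form, e.g. users.user_id=1
        return f"{key}={ids[0]}"
    # Multiple IDs form, e.g. users.user_id IN (1, 2, 3)
    return f"{key} IN ({', '.join(str(i) for i in ids)})"


single = entity_spec("users.user_id", [1])
several = entity_spec("users.user_id", [1, 2, 3])
print(single)
print(several)
```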

Example Queries

Temporal regression — predict total spend in the next 30 days:
PREDICT SUM(orders.price, 0, 30, days) FOR users.user_id=42
Binary classification — will a user churn (no orders in 90 days)?
PREDICT COUNT(orders.*, 0, 90, days) = 0 FOR users.user_id=42
Static prediction — predict a user’s age from relational context:
PREDICT users.age FOR users.user_id=42
Multi-horizon forecasting — predict weekly revenue over 8 weeks:
PREDICT SUM(orders.price, 0, 7, days) FORECAST 8 TIMEFRAMES FOR items.item_id=42
See prediction_types for a complete reference of all supported task types.
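One way to read the multi-horizon example is as a sequence of consecutive windows, each as wide as the base window. The sketch below lists those windows as day offsets from the anchor time. That FORECAST tiles the horizon into back-to-back, non-overlapping windows is an assumption of this sketch, and forecast_windows is a hypothetical helper.

```python
def forecast_windows(start, end, timeframes):
    """Return (start, end) day offsets for consecutive forecast windows.

    Assumes windows are back-to-back and all have the base width (end - start).
    """
    width = end - start
    return [(start + i * width, end + i * width) for i in range(timeframes)]


# Analogue of: SUM(orders.price, 0, 7, days) FORECAST 8 TIMEFRAMES
windows = forecast_windows(0, 7, 8)
for lo, hi in windows:
    print(f"days {lo} to {hi}")
```

Under this reading, the eight 7-day windows cover days 0 through 56 after the anchor time.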

Unsupported Features

Due to the experimental nature of KumoRFM, some PQL features are not yet fully supported:
  • LIST_DISTINCT() without a time interval is not supported.
  • Filtering by column value (e.g., WHERE users.age > 21) is only supported for columns within the same table.
  • Predicting a single non-aggregated value (e.g., PREDICT users.age) only works for columns within the entity table.

Further Reading

  • prediction_types — all supported task types with PQL examples
  • filters_and_operators — WHERE, IN, logical operators, anchor time
  • evaluation — automatic evaluation and metrics
  • configuration — run modes, explainability, batch mode, retry