Querying RFM - Kumo.ai

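Predictive Query Language (PQL) is a declarative query language that lets you define predictive problems on relational data. A PQL query specifies what to predict (the target), which rows to predict for (the entity), and, for temporal predictions, the future time window (the horizon).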

For a thorough introduction to predictive queries, please refer to the predictive query tutorial.

KumoRFM is currently in an experimental phase. Some PQL features are not yet fully supported.

Target x Entity x Horizon

The core framework for every KumoRFM prediction is Target x Entity x Horizon:

Target: The value to predict — either an aggregation over related rows (e.g., COUNT(orders.*, 0, 30, days)) or a static column value (e.g., users.age).
Entity: The specific row(s) to predict for, identified by a table’s primary key (e.g., users.user_id=1).
Horizon: For temporal predictions, the future time window (e.g., 0, 30, days means “the next 30 days from now”).
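To make the horizon concrete, the sketch below converts a `start, end, days` triple into calendar dates relative to an anchor time. The helper `horizon_window` is hypothetical and not part of KumoRFM; it only illustrates the arithmetic, and whether the window boundaries are inclusive is not stated here.

```python
from datetime import datetime, timedelta
from typing import Tuple


def horizon_window(anchor: datetime, start: int, end: int) -> Tuple[datetime, datetime]:
    """Return the datetimes that bound a `start, end, days` horizon.

    Hypothetical helper for illustration only; not a KumoRFM function.
    """
    return anchor + timedelta(days=start), anchor + timedelta(days=end)


# "0, 30, days" read as "the next 30 days from now", with "now" fixed for the example.
anchor = datetime(2024, 1, 1)
window_start, window_end = horizon_window(anchor, 0, 30)
print(window_start.date(), window_end.date())  # 2024-01-01 2024-01-31
```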

The general PQL structure is:

PREDICT <target_expression> FOR <entity_specification> WHERE <optional_filters>

Component	Purpose
`PREDICT <target_expression>`	Declares the value or aggregate the model should predict
`FOR <entity_specification>`	Specifies the single ID or list of IDs to predict for
`WHERE <filters>` (optional)	Filters which historical rows are used as context
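The three components can be assembled programmatically. The following is a minimal sketch using a hypothetical `build_pql` helper (not part of KumoRFM) that concatenates the clauses in the order given above and omits `WHERE` when no filter is supplied.

```python
from typing import Optional


def build_pql(target: str, entity: str, where: Optional[str] = None) -> str:
    """Assemble a PQL string from its components.

    Hypothetical helper for illustration; it performs no validation.
    """
    query = f"PREDICT {target} FOR {entity}"
    if where:
        # The WHERE clause is optional and is appended only when provided.
        query += f" WHERE {where}"
    return query


print(build_pql("SUM(orders.price, 0, 30, days)", "users.user_id=42"))
print(build_pql("COUNT(orders.*, 0, 30, days)", "users.user_id=1", "users.age > 21"))
```

Building queries this way keeps the target, entity, and filter in separate variables, which can help when the same target is reused for different entities.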

1. Choose your entity: pick a table and its primary key to predict for.
2. Define the target: a raw column or an aggregation over a future window.
3. Pin the entity list: pass a single ID or multiple IDs.
4. (Optional) Refine the context: add filters to restrict which historical rows are used for feature generation.
5. Run and fetch: call KumoRFM.predict() or KumoRFM.evaluate().

Unlike the fine-tuning mode, KumoRFM makes predictions for a handful of selected entities at a time. Entities can be specified in three ways:

result = model.predict(
    "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1",
    indices=[1, 2, 3, 4, 5],
)
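Because predictions are made for a handful of entities at a time, a longer ID list can be split into small batches and passed through the `indices` argument shown above. The sketch below is one possible way to do that; `batched`, `predict_in_batches`, and the batch size of 5 are assumptions for illustration, and `EchoModel` is a stand-in so the example runs without KumoRFM installed.

```python
from typing import Any, Iterator, List, Sequence


def batched(ids: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `ids` with at most `size` elements each."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def predict_in_batches(model: Any, query: str, ids: Sequence[int], size: int = 5) -> List[Any]:
    """Call model.predict(query, indices=...) once per batch and collect the results.

    Hypothetical wrapper; the batch size is an arbitrary choice, not a documented limit.
    """
    return [model.predict(query, indices=batch) for batch in batched(ids, size)]


class EchoModel:
    """Stand-in for a real model: returns the indices it was asked to predict for."""

    def predict(self, query: str, indices: List[int]) -> List[int]:
        return indices


query = "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1"
results = predict_in_batches(EchoModel(), query, list(range(1, 13)), size=5)
print(results)  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12]]
```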

Temporal regression — predict total spend in the next 30 days:

PREDICT SUM(orders.price, 0, 30, days) FOR users.user_id=42

Binary classification — will a user churn (no orders in 90 days)?

PREDICT COUNT(orders.*, 0, 90, days) = 0 FOR users.user_id=42
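To see what this target means on historical data, the sketch below computes the same label by hand: count a user's orders in the 90 days after an anchor time and check whether the count is zero. The function `churn_label` is hypothetical, and treating the window as open at the anchor and closed at the end is an assumption; the boundary convention is not specified here.

```python
from datetime import datetime, timedelta
from typing import Iterable


def churn_label(order_times: Iterable[datetime], anchor: datetime, days: int = 90) -> bool:
    """Return True if no order falls in the `days` after `anchor`.

    Mirrors COUNT(orders.*, 0, days, days) = 0 for a single user.
    Assumption: the window is (anchor, anchor + days].
    """
    end = anchor + timedelta(days=days)
    count = sum(1 for t in order_times if anchor < t <= end)
    return count == 0


anchor = datetime(2024, 1, 1)
print(churn_label([datetime(2024, 1, 10)], anchor))                        # False
print(churn_label([datetime(2023, 12, 1), datetime(2024, 6, 1)], anchor))  # True
```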

Static prediction — predict a user’s age from relational context:

PREDICT users.age FOR users.user_id=42

Multi-horizon forecasting — predict weekly revenue over 8 weeks:

PREDICT SUM(orders.price, 0, 7, days) FORECAST 8 TIMEFRAMES FOR items.item_id=42
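One way to read this query is as eight back-to-back 7-day windows, each with its own predicted sum. The sketch below enumerates those windows; `forecast_windows` is a hypothetical helper, and the assumption that the timeframes are consecutive and non-overlapping should be checked against the prediction types reference.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def forecast_windows(
    anchor: datetime, start: int, end: int, timeframes: int
) -> List[Tuple[datetime, datetime]]:
    """List the windows for `start, end, days` repeated `timeframes` times.

    Assumption: each window begins where the previous one ends.
    """
    length = end - start
    windows = []
    for k in range(timeframes):
        lo = anchor + timedelta(days=start + k * length)
        windows.append((lo, lo + timedelta(days=length)))
    return windows


windows = forecast_windows(datetime(2024, 1, 1), 0, 7, 8)
print(len(windows))                                   # 8
print(windows[0][0].date(), windows[0][1].date())     # 2024-01-01 2024-01-08
print(windows[-1][0].date(), windows[-1][1].date())   # 2024-02-19 2024-02-26
```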

See the prediction types reference (prediction_types) for a complete list of supported task types.

Due to the experimental nature of KumoRFM, some PQL features are not yet fully supported:

LIST_DISTINCT() without a time interval is not supported.
Filtering by column value (e.g., WHERE users.age > 21) is only supported for columns within the same table.
Predicting a single non-aggregated value (e.g., PREDICT users.age) only works for columns within the entity table.
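The last limitation can be checked before a query is sent. The following is a rough sketch of such a check for simple static queries of the form `PREDICT table.column FOR table.key=...`; the regular expression and the function `static_target_in_entity_table` are hypothetical, cover only this one query shape, and are not a PQL parser.

```python
import re

# Matches only simple static queries such as "PREDICT users.age FOR users.user_id=42".
STATIC_QUERY = re.compile(
    r"^PREDICT\s+(\w+)\.(\w+)\s+FOR\s+(\w+)\.(\w+)\s*=", re.IGNORECASE
)


def static_target_in_entity_table(query: str) -> bool:
    """Return True if the static target column lives in the entity table.

    Raises ValueError for queries this simple pattern does not cover
    (for example, aggregation targets).
    """
    match = STATIC_QUERY.match(query.strip())
    if match is None:
        raise ValueError("not a simple static query of the form PREDICT t.c FOR t.pk=...")
    target_table, _, entity_table, _ = match.groups()
    return target_table == entity_table


print(static_target_in_entity_table("PREDICT users.age FOR users.user_id=42"))     # True
print(static_target_in_entity_table("PREDICT orders.price FOR users.user_id=42"))  # False
```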