# Prediction Types

> Task taxonomy: regression, classification, forecasting, ranking, and static tasks

KumoRFM supports several prediction types, organized into **temporal tasks** (which predict over a future time horizon) and **static tasks** (which infer attributes without a time component).

## Temporal Tasks (Forecast)

All temporal tasks predict **future outcomes** over a **defined time horizon** using **historical data** in relational tables. Every temporal prediction is defined by:

* **Target**: What is being predicted (an aggregation expression)
* **Entity**: Who the prediction is for (a table primary key)
* **Horizon**: When the prediction applies (a future time window)

The general PQL pattern for temporal tasks is:

```sql theme={null}
PREDICT <aggregation>(table.column, <start>, <end>, <unit>) FOR entity_table.pk=value
```

where `<start>` and `<end>` define the future time window relative to "now", and `<unit>` is the time granularity (e.g., `days`, `hours`, `minutes`).
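
The pattern above can also be assembled programmatically. The following is a minimal sketch that only formats strings: `temporal_query` is a hypothetical helper, not part of the KumoRFM SDK, and the check that `end` exceeds `start` is an assumption about what counts as a valid window.

```python
def temporal_query(aggregation, column, start, end, unit, entity, value):
    """Format a temporal PQL string from its target, horizon, and entity parts."""
    # Assumption: a horizon must have positive length.
    if end <= start:
        raise ValueError("end must be greater than start")
    target = f"{aggregation}({column}, {start}, {end}, {unit})"
    return f"PREDICT {target} FOR {entity}={value}"


query = temporal_query("SUM", "orders.price", 0, 30, "days", "items.item_id", 42)
print(query)
```

The resulting string can be passed to `model.predict(...)` in the same way as the literal queries in the sections below.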

### Forecast: Regression

Predict a **continuous numeric value** for an entity over a future time horizon.

**Use case**: Demand forecasting, revenue prediction, quantity estimation.

**Supported aggregations**: `SUM`, `AVG`, `COUNT`, `MAX`, `MIN`

```sql theme={null}
-- Predict total revenue for item_id=42 in the next 30 days
PREDICT SUM(orders.price, 0, 30, days)
FOR items.item_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT SUM(orders.price, 0, 30, days) FOR items.item_id=42"
)
```

**Output**: Numeric value per entity. For quantile output, see [Configuration](/rfm/configuration#inference-configuration).

**Metrics**: mae, mse, rmse, mape, smape, r2.

### Forecast: Binary Classification

Predict whether an entity **will or will not** experience an event within a future time window. This is defined by applying a **boolean condition** to an aggregation expression.

**Use case**: Customer churn prediction, event occurrence prediction.

**Supported aggregations**: `SUM`, `AVG`, `COUNT`, `MAX`, `MIN` (with a boolean condition such as `= 0`, `> 100`)

```sql theme={null}
-- Predict whether user_id=42 will make zero orders in the next 90 days (churn)
PREDICT COUNT(orders.*, 0, 90, days) = 0
FOR users.user_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT COUNT(orders.*, 0, 90, days) = 0 FOR users.user_id=42"
)
```

The boolean condition (`= 0`, `> 100`, etc.) on the aggregation makes this a binary classification task.
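
To make the role of the condition concrete, the sketch below computes the same label retrospectively from known order timestamps, as one might when checking a prediction against what later happened. `churn_label` is a hypothetical helper, and the window boundaries (exclusive at "now", inclusive at the end) are an assumption rather than documented KumoRFM behavior.

```python
from datetime import datetime, timedelta


def churn_label(order_times, now, horizon_days=90):
    """Evaluate COUNT(orders.*, 0, horizon_days, days) = 0 on known timestamps."""
    window_end = now + timedelta(days=horizon_days)
    # Assumption: the window is (now, now + horizon]; real boundaries may differ.
    count = sum(1 for t in order_times if now < t <= window_end)
    # The comparison turns a numeric count into a True/False target.
    return count == 0


now = datetime(2024, 1, 1)
active = [datetime(2023, 12, 15), datetime(2024, 2, 10)]  # one order inside the window
lapsed = [datetime(2023, 12, 15), datetime(2024, 5, 1)]   # no order inside the window
print(churn_label(active, now), churn_label(lapsed, now))
```

Without the `= 0` comparison, the same aggregation would be a count and therefore a regression target.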

**Output**: Boolean (True/False) and probability per entity.

**Metrics**: acc, auroc, auprc, ap, precision, recall, f1.

### Forecast: Multi-Class Classification

Predict which **class or state** an entity will take within a future time window. Use `FIRST()` to predict the first value that occurs in the window, or `LAST()` to predict the final value.

**Use case**: Tier migration, lifecycle stage prediction, feature engagement.

**Supported aggregations**: `FIRST`, `LAST`

```sql theme={null}
-- Predict the first subscription tier user_id=42 will have within the next 30 days
PREDICT FIRST(subscriptions.tier, 0, 30, days)
FOR users.user_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT FIRST(subscriptions.tier, 0, 30, days) FOR users.user_id=42"
)
```

**Output**: Class label and class probabilities per entity.

**Metrics**: acc, precision, recall, f1, mrr.

### Recommendations

Predict a **ranked list of items** an entity is most likely to interact with over a future time window. Use `LIST_DISTINCT()` with `RANK TOP N` to get the top N recommended items.

**Use case**: Product recommendations, content ranking, next best action.

**Supported aggregations**: `LIST_DISTINCT` with `RANK TOP N`

```sql theme={null}
-- Predict the top 10 items user_id=42 is most likely to order in the next 30 days
PREDICT LIST_DISTINCT(orders.item_id, 0, 30, days) RANK TOP 10
FOR users.user_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT LIST_DISTINCT(orders.item_id, 0, 30, days) RANK TOP 10 "
    "FOR users.user_id=42"
)
```

**Output**: Ranked list of item IDs per entity.

**Metrics**: `map@k`, `ndcg@k`, `mrr@k`, `precision@k`, `recall@k`, `f1@k`, `hit_ratio@k`.
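
As an illustration of what the `@k` suffix means, the sketch below computes three of these metrics for a single entity using their textbook definitions. How KumoRFM computes them internally is not specified here, so treat `ranking_metrics_at_k` and its edge-case handling as assumptions.

```python
def ranking_metrics_at_k(ranked, relevant, k):
    """Return (precision@k, recall@k, hit_ratio@k) for one entity."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    # Assumption: recall is reported as 0.0 when there are no relevant items.
    recall = hits / len(relevant) if relevant else 0.0
    hit_ratio = 1.0 if hits > 0 else 0.0
    return precision, recall, hit_ratio


ranked = [7, 3, 9, 1, 5]   # predicted item IDs, best first
relevant = {3, 5, 8}       # items the entity actually interacted with
precision, recall, hit_ratio = ranking_metrics_at_k(ranked, relevant, k=5)
print(precision, recall, hit_ratio)
```

Here two of the five recommended items are relevant, so precision@5 is 2/5 and recall@5 is 2/3.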

### Multi-Horizon Regression (Forecasting)

Predict a numeric value for an entity across **multiple future time steps**. This produces a time series of predictions.

**Use case**: Multi-step demand forecasting, time series prediction.

**Supported aggregations**: `SUM`, `AVG`, `COUNT`, `MAX`, `MIN`

```sql theme={null}
-- Predict weekly revenue for item_id=42 over the next 60 weeks
PREDICT SUM(orders.price, 0, 7, days) FORECAST 60 TIMEFRAMES
FOR items.item_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT SUM(orders.price, 0, 7, days) FORECAST 60 TIMEFRAMES "
    "FOR items.item_id=42"
)
```

The `FORECAST N TIMEFRAMES` clause tells KumoRFM to produce `N` predictions, each separated by the time window specified in the aggregation (7 days in this example). With a 7-day window, `FORECAST 60 TIMEFRAMES` therefore extends 60 × 7 = 420 days into the future.
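
The horizon arithmetic can be checked with a few lines of Python. This sketch assumes the forecast steps are consecutive, non-overlapping windows of equal length; `forecast_windows` is a hypothetical helper used only for illustration.

```python
def forecast_windows(start, end, num_timeframes):
    """List the (start, end) offsets covered by each forecast step."""
    step = end - start  # length of one window, in the query's time unit
    return [(start + i * step, end + i * step) for i in range(num_timeframes)]


windows = forecast_windows(0, 7, 60)
print(len(windows))   # number of forecast steps
print(windows[0])     # offsets of the first window
print(windows[-1])    # offsets of the last window
```

Under that assumption, the last window ends at day 420, matching the total stated above.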

**Output**: Time-indexed numeric values (one per horizon). For quantile output, see [Configuration](/rfm/configuration#inference-configuration).

**Metrics**: mae, mse, rmse, mape, smape, r2.

## Static Tasks

Static tasks infer **latent or unknown entity attributes** without modeling temporal evolution. There is no time horizon: the prediction concerns the current state of the entity, based on its attributes and relational context.

Every static prediction is defined by: **Target** × **Entity** (no horizon).

The general PQL pattern for static tasks is:

```sql theme={null}
PREDICT table.column FOR entity_table.pk=value
```

### Static Regression

Infer a **continuous numeric attribute** of an entity.

**Use case**: Age estimation, price imputation, value scoring.

**Supported target type**: Numeric columns

```sql theme={null}
-- Predict the age of user_id=42
PREDICT users.age
FOR users.user_id=42
```

```python theme={null}
result = model.predict("PREDICT users.age FOR users.user_id=42")
```

**Output**: Numeric value per entity. For quantile output, see [Configuration](/rfm/configuration#inference-configuration).

**Metrics**: mae, mse, rmse, mape, smape, r2.

### Static Binary Classification

Infer whether an entity belongs to **one of two classes** based on its attributes.

**Use case**: Fraud detection, quality classification.

**Supported target type**: Boolean columns

```sql theme={null}
-- Predict whether transaction_id=42 is fraudulent
PREDICT transactions.is_fraudulent
FOR transactions.transaction_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT transactions.is_fraudulent FOR transactions.transaction_id=42"
)
```

**Output**: Boolean (True/False) and probability per entity.

**Metrics**: acc, auroc, auprc, ap, precision, recall, f1.

### Static Multi-Class Classification

Infer which **single class** an entity belongs to from a set of possible classes.

**Use case**: Customer segmentation, category prediction.

**Supported target type**: Categorical columns

```sql theme={null}
-- Predict the customer segment for customer_id=42
PREDICT customers.segment
FOR customers.customer_id=42
```

```python theme={null}
result = model.predict(
    "PREDICT customers.segment FOR customers.customer_id=42"
)
```

**Output**: Class label and class probabilities per entity.

**Metrics**: acc, precision, recall, f1, mrr.

## Summary

| Task Type                           | PQL Pattern                                                    | Output                | Category |
| ----------------------------------- | -------------------------------------------------------------- | --------------------- | -------- |
| Temporal Regression                 | `PREDICT SUM(t.col, 0, N, days) FOR ...`                       | Numeric               | Temporal |
| Temporal Binary Classification      | `PREDICT COUNT(t.*, 0, N, days) = 0 FOR ...`                   | Boolean + Probability | Temporal |
| Temporal Multi-Class Classification | `PREDICT FIRST(t.col, 0, N, days) FOR ...`                     | Class + Probabilities | Temporal |
| Recommendations                     | `PREDICT LIST_DISTINCT(t.col, 0, N, days) RANK TOP K FOR ...`  | Ranked item list      | Temporal |
| Multi-Horizon Forecasting           | `PREDICT SUM(t.col, 0, N, days) FORECAST K TIMEFRAMES FOR ...` | Time-indexed numerics | Temporal |
| Static Regression                   | `PREDICT t.numeric_col FOR ...`                                | Numeric               | Static   |
| Static Binary Classification        | `PREDICT t.bool_col FOR ...`                                   | Boolean + Probability | Static   |
| Static Multi-Class Classification   | `PREDICT t.categorical_col FOR ...`                            | Class + Probabilities | Static   |
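
The table can be read as a set of syntactic rules. The sketch below encodes them as a rough heuristic that guesses the task type from a query string. `guess_task_type` is hypothetical and is not how KumoRFM parses PQL. Static queries are reported only as "static", because the distinction between regression, binary, and multi-class depends on the column type, which the query text does not reveal.

```python
import re


def guess_task_type(query):
    """Guess the task type of a PQL string from its surface syntax."""
    q = query.upper()
    if "RANK TOP" in q:
        return "recommendations"
    if re.search(r"FORECAST\s+\d+\s+TIMEFRAMES", q):
        return "multi-horizon forecasting"
    if re.search(r"\b(FIRST|LAST)\s*\(", q):
        return "temporal multi-class classification"
    match = re.search(
        r"\b(SUM|AVG|COUNT|MAX|MIN)\s*\([^)]*\)\s*(.*?)\s+FOR\b", q, re.S
    )
    if match:
        # A comparison right after the aggregation makes the target boolean.
        if re.match(r"(=|!=|>=|<=|>|<)", match.group(2)):
            return "temporal binary classification"
        return "temporal regression"
    # No aggregation: a plain column target, so the task is static.
    return "static"


print(guess_task_type("PREDICT COUNT(orders.*, 0, 90, days) = 0 FOR users.user_id=42"))
print(guess_task_type("PREDICT users.age FOR users.user_id=42"))
```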
