# Configuration

> Configure run modes, temporal behavior, inference behavior, retries, batch mode, and size limits for KumoRFM

This page covers the runtime configuration options for `KumoRFM`: run modes, temporal behavior, inference behavior, output and collection controls, batch prediction, retry handling, and size limits.

## Run Modes

The `run_mode` parameter controls the trade-off between prediction quality and speed by adjusting how much context data is sampled.

| Run Mode | Context Size | Neighbor Sampling     | Use Case                                       |
| -------- | ------------ | --------------------- | ---------------------------------------------- |
| `DEBUG`  | 100          | \[16, 16, 4, 4, 1, 1] | Quick iteration, testing queries               |
| `FAST`   | 1,000        | \[32, 32, 8, 8, 4, 4] | **Default.** Good balance of speed and quality |
| `NORMAL` | 5,000        | \[64, 64, 8, 8, 4, 4] | Higher quality predictions                     |
| `BEST`   | 10,000       | \[64, 64, 8, 8, 4, 4] | Maximum quality                                |

<Note>
  For **forecasting tasks**, when an entity has more historical rows than the Context Size cap, KumoRFM keeps the **most recent rows** up to that cap (not the oldest). This ensures forecasts reflect the latest data patterns.
</Note>
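The truncation rule in the note can be pictured with a few lines of plain Python. This is only a conceptual sketch: `most_recent_rows`, the `ts` and `sales` field names, and the sample data are made up for illustration, and the page does not describe how KumoRFM implements the rule internally.

```python
from datetime import date


def most_recent_rows(rows, cap):
    """Keep at most `cap` rows, preferring the newest by timestamp."""
    ordered = sorted(rows, key=lambda row: row["ts"])
    # Guard against cap == 0, because ordered[-0:] would return every row.
    return ordered[-cap:] if cap > 0 else []


# Hypothetical history for one entity, deliberately out of order.
history = [
    {"ts": date(2024, 1, 1), "sales": 10},
    {"ts": date(2024, 3, 1), "sales": 30},
    {"ts": date(2024, 2, 1), "sales": 20},
    {"ts": date(2024, 4, 1), "sales": 40},
]

kept = most_recent_rows(history, cap=2)
print([row["sales"] for row in kept])  # the March and April rows survive
```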

```python
# Use the fastest mode for quick testing
result = model.predict(query, run_mode="DEBUG")

# Use the highest quality mode for production
result = model.predict(query, run_mode="BEST")
```

You can also set the neighbor sampling directly with `num_neighbors`:

```python
result = model.predict(
    query,
    num_neighbors=[64, 64, 8, 8, 4, 4],
)
```

## Temporal and Context Timing

Use these parameters in `KumoRFM.predict()` and `KumoRFM.evaluate()` when you need to control the prediction timestamp or the historical examples used as model context.

| Option                | Default | Description                                                                                                                                                                     |
| --------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `anchor_time`         | `None`  | The anchor timestamp for the prediction. If set to `None`, KumoRFM uses the maximum timestamp in the data. If set to `"entity"`, KumoRFM uses each entity's own timestamp.      |
| `context_anchor_time` | `None`  | The maximum anchor timestamp for context examples. If set to `None`, `anchor_time` determines the anchor time for context examples.                                             |
| `use_prediction_time` | `False` | Whether to use the anchor timestamp as an additional feature during prediction. When `True`, the anchor time is included as a feature for all task types including forecasting. |
| `lag_timesteps`       | `0`     | Number of past timesteps to include as lagged target features for temporal predictive queries.                                                                                  |

For example, set `anchor_time` when you want to predict as of a specific point in time:

```python
import pandas as pd

result = model.predict(
    query,
    indices=[0, 1, 2],
    anchor_time=pd.Timestamp("2024-06-01"),
)
```

Use `context_anchor_time` when the prediction date and the latest available context data should differ:

```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    anchor_time=pd.Timestamp("2024-06-01"),
    context_anchor_time=pd.Timestamp("2024-03-01"),
)
```

Set `lag_timesteps` when recent historical target values should be available to the model as additional context. For example, `lag_timesteps=3` adds the previous three target windows as lagged features:

```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    lag_timesteps=3,
)
```
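To make the idea of lagged target features concrete, here is a small standalone sketch. It assumes a simple list of per-window target values. The helper `add_lagged_targets` and the `weekly_orders` data are hypothetical and only show what "the previous three target windows" means, not how KumoRFM builds these features.

```python
def add_lagged_targets(targets, lag_timesteps):
    """Pair each target with the `lag_timesteps` values that precede it.

    Windows without enough history get None for the missing lags.
    """
    rows = []
    for i, target in enumerate(targets):
        lags = [
            targets[i - k] if i - k >= 0 else None
            for k in range(1, lag_timesteps + 1)
        ]
        rows.append({"target": target, "lags": lags})
    return rows


# Hypothetical target values for five consecutive windows, oldest first.
weekly_orders = [5, 7, 6, 9, 8]
rows = add_lagged_targets(weekly_orders, lag_timesteps=3)
print(rows[-1])  # newest window, with its three preceding targets
```

With `lag_timesteps=0` (the default listed in the table above), the `lags` list is empty and no lagged targets are added.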

## Inference Configuration

The `inference_config` parameter controls inference-time model behavior, including ensembling. You can pass either a dictionary or a configuration object from `kumoapi.rfm`.

When you pass a dictionary, KumoRFM casts it based on the task type:

* Classification tasks use `ClassificationInferenceConfig`.
* Regression and forecasting tasks use `RegressionInferenceConfig`.

If you omit `inference_config`, KumoRFM selects defaults automatically based on the task type.

Common options:

| Option             | Description                                                                        |
| ------------------ | ---------------------------------------------------------------------------------- |
| `num_estimators`   | Number of estimators to ensemble. Defaults to `1` and must be between `1` and `4`. |
| `column_shuffle`   | Whether to shuffle column order across estimators.                                 |
| `category_shuffle` | Whether to shuffle categories within categorical columns across estimators.        |
| `hop_shuffle`      | Whether to shuffle subgraph depth across estimators.                               |

Classification option:

| Option          | Description                                       |
| --------------- | ------------------------------------------------- |
| `class_shuffle` | Whether to shuffle class order across estimators. |

Regression and forecasting options:

| Option              | Description                                                                                                                                                |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `target_transforms` | Target preprocessing transforms to vary across estimators. Supported values are `"clip"`, `"power"`, `"quantile"`, and `None`. Defaults to `["quantile"]`. |
| `output_type`       | How to summarize the output distribution. Supported values are `"median"`, `"mean"`, and `"quantiles"`. Defaults to `"median"`.                            |

When `output_type="quantiles"`, the prediction output contains 27 quantile columns instead of a single `TARGET_PRED` column:

```
q_0.005 q_0.01  q_0.02  q_0.025 q_0.05  q_0.1   q_0.15
q_0.2   q_0.25  q_0.3   q_0.35  q_0.4   q_0.45  q_0.5
q_0.55  q_0.6   q_0.65  q_0.7   q_0.75  q_0.8   q_0.85
q_0.9   q_0.95  q_0.975 q_0.98  q_0.99  q_0.995
```
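One common use of quantile output is reading off a central prediction interval. The sketch below rebuilds the 27 column names listed above and looks up the matching lower and upper columns for a given coverage. The helper `central_interval` and the row values are illustrative assumptions; with a real prediction you would read the same columns from the returned DataFrame.

```python
QUANTILE_LEVELS = [
    0.005, 0.01, 0.02, 0.025, 0.05, 0.1, 0.15,
    0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5,
    0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85,
    0.9, 0.95, 0.975, 0.98, 0.99, 0.995,
]
QUANTILE_COLUMNS = [f"q_{level}" for level in QUANTILE_LEVELS]


def central_interval(row, coverage):
    """Return (lower, upper) for a symmetric interval, e.g. coverage=0.9."""
    tail = round((1 - coverage) / 2, 3)
    lower_col = f"q_{tail}"
    upper_col = f"q_{round(1 - tail, 3)}"
    if lower_col not in row or upper_col not in row:
        raise KeyError(f"no quantile columns for coverage={coverage}")
    return row[lower_col], row[upper_col]


# Made-up values standing in for one row of a quantile prediction.
row = {col: 100 * level for col, level in zip(QUANTILE_COLUMNS, QUANTILE_LEVELS)}

print(len(QUANTILE_COLUMNS))
print(central_interval(row, coverage=0.9))  # values from q_0.05 and q_0.95
```

Only coverages whose tails match a listed quantile (for example 0.5, 0.8, 0.9, 0.95, 0.98, 0.99) resolve to a column pair; anything else raises `KeyError` in this sketch.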

Classification example:

```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    inference_config=dict(
        num_estimators=4,
        column_shuffle=True,
        hop_shuffle=True,
        class_shuffle=True,
    ),
)
```

Regression example:

```python
result = model.predict(
    regression_query,
    indices=[0, 1, 2],
    inference_config=dict(
        num_estimators=4,
        column_shuffle=True,
        target_transforms=["quantile", "clip", "power", None],
        output_type="median",
    ),
)
```
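If you build `inference_config` dictionaries programmatically, a small client-side check against the constraints documented above can catch mistakes before a request is sent. The helper below is a hypothetical sketch for regression and forecasting configs, not part of KumoRFM. It encodes only the limits stated on this page and treats missing keys as the documented defaults.

```python
SUPPORTED_TRANSFORMS = {"clip", "power", "quantile", None}
SUPPORTED_OUTPUT_TYPES = {"median", "mean", "quantiles"}


def check_inference_config(config):
    """Return a list of problems found in a regression `inference_config` dict."""
    problems = []

    num_estimators = config.get("num_estimators", 1)
    if not isinstance(num_estimators, int) or not 1 <= num_estimators <= 4:
        problems.append("num_estimators must be an integer between 1 and 4")

    for transform in config.get("target_transforms", ["quantile"]):
        if transform not in SUPPORTED_TRANSFORMS:
            problems.append(f"unsupported target transform: {transform!r}")

    output_type = config.get("output_type", "median")
    if output_type not in SUPPORTED_OUTPUT_TYPES:
        problems.append(f"unsupported output_type: {output_type!r}")

    return problems


good = dict(
    num_estimators=4,
    column_shuffle=True,
    target_transforms=["quantile", "clip", "power", None],
    output_type="median",
)
bad = dict(num_estimators=8, target_transforms=["log"], output_type="mode")

print(check_inference_config(good))
print(check_inference_config(bad))
```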

## Output and Collection Controls

These options control what `predict()` returns and how KumoRFM collects valid context labels.

| Option              | Default | Description                                                                                                                                                                                                                             |
| ------------------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `return_embeddings` | `False` | Whether to include embeddings for each prediction example in the output DataFrame.                                                                                                                                                      |
| `explain`           | `False` | Whether to return an `Explanation` object instead of a plain prediction DataFrame. Explainability currently supports single-entity predictions with `run_mode="FAST"`. See [Prediction Explainability](/rfm/prediction-explainability). |
| `max_pq_iterations` | `10`    | Maximum number of iterations used to collect valid labels. Increase this when a predictive query has strict entity filters and KumoRFM needs to sample more entities to find enough valid labels.                                       |
| `random_seed`       | `42`    | Manual seed for pseudo-random sampling.                                                                                                                                                                                                 |
| `verbose`           | `True`  | Whether to print progress output during prediction or evaluation.                                                                                                                                                                       |

```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    return_embeddings=True,
    max_pq_iterations=20,
    random_seed=42,
)
```

## Batch Mode

For predictions over many entities, use `KumoRFM.batch_mode()` to automatically split the workload into batches:

```python
with model.batch_mode(batch_size="max", num_retries=1):
    result = model.predict(
        "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1",
        indices=list(range(1, 1001)),
    )
```

Parameters:

* `batch_size`: The number of entities per batch. Set to `"max"` (default) to use the maximum applicable batch size for the task type.
* `num_retries`: Number of times to retry a batch that fails because of server issues.

The maximum prediction and test sizes per task type are:

| Task Type                                 | Max Prediction Size | Max Test Size |
| ----------------------------------------- | ------------------- | ------------- |
| Classification / Regression / Forecasting | 1,000               | 2,000         |
| Temporal Link Prediction                  | 200                 | 400           |
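`batch_mode()` handles the splitting for you, but it can help to see how the limits above translate into batch counts. The following sketch is purely illustrative: the task-type keys and the `split_into_batches` helper are hypothetical, and capping a custom `batch_size` at the per-task maximum is an assumption rather than documented behavior.

```python
# Max prediction sizes from the table above; the key names are made up.
MAX_PREDICTION_SIZE = {
    "classification": 1000,
    "regression": 1000,
    "forecasting": 1000,
    "temporal_link_prediction": 200,
}


def split_into_batches(indices, task_type, batch_size="max"):
    """Split `indices` into consecutive batches no larger than the task limit."""
    limit = MAX_PREDICTION_SIZE[task_type]
    size = limit if batch_size == "max" else min(batch_size, limit)
    if size < 1:
        raise ValueError("batch_size must be at least 1")
    return [indices[i:i + size] for i in range(0, len(indices), size)]


batches = split_into_batches(list(range(2500)), "classification")
print([len(batch) for batch in batches])

link_batches = split_into_batches(list(range(450)), "temporal_link_prediction")
print([len(batch) for batch in link_batches])
```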

## Retry

Use `KumoRFM.retry()` to automatically retry queries that fail because of transient server issues:

```python
with model.retry(num_retries=2):
    result = model.predict(query, indices=[1, 2, 3])
```

This is useful for long-running batch predictions where occasional failures are expected.
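Conceptually, a retry wrapper of this kind repeats a call a bounded number of times before giving up. The generic sketch below shows the pattern in plain Python. It assumes that `num_retries=2` means up to three attempts in total and uses `ConnectionError` as a stand-in for a transient failure. Which errors KumoRFM actually retries, and whether it waits between attempts, is not specified on this page.

```python
def call_with_retries(fn, num_retries, retry_on=(ConnectionError,)):
    """Call `fn`, retrying up to `num_retries` extra times on `retry_on` errors."""
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except retry_on:
            # Re-raise once the retry budget is used up.
            if attempt == num_retries:
                raise


calls = {"count": 0}


def flaky_predict():
    """Hypothetical call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient server issue")
    return "ok"


print(call_with_retries(flaky_predict, num_retries=2))
```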

## Size Limits

KumoRFM enforces a **30 MB context size limit** per prediction. If a prediction exceeds this limit, you will see an error message that suggests:

* Reducing the number of tables in the graph
* Reducing the number of columns (e.g., large text columns)
* Adjusting the neighborhood configuration
* Using a lower run mode
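The last suggestion, using a lower run mode, can be automated with a simple fallback loop. The sketch below is an assumption-heavy illustration: `ContextTooLarge` is a hypothetical stand-in for whatever error your client surfaces for the size limit, and `fake_predict` simulates a prediction call so the example runs on its own.

```python
RUN_MODES = ["BEST", "NORMAL", "FAST", "DEBUG"]  # largest to smallest context


class ContextTooLarge(Exception):
    """Hypothetical stand-in for the context size limit error."""


def predict_with_fallback(predict, start_mode="BEST"):
    """Try `predict(run_mode=...)`, stepping down one mode per size error."""
    modes = RUN_MODES[RUN_MODES.index(start_mode):]
    for i, mode in enumerate(modes):
        try:
            return mode, predict(run_mode=mode)
        except ContextTooLarge:
            # No smaller mode left to try, so surface the error.
            if i == len(modes) - 1:
                raise


def fake_predict(run_mode):
    """Simulated call that only fits within the limit in FAST or DEBUG."""
    if run_mode in ("BEST", "NORMAL"):
        raise ContextTooLarge(run_mode)
    return f"predicted with {run_mode}"


print(predict_with_fallback(fake_predict))
```

With a real model, `predict` could be a small wrapper such as `lambda run_mode: model.predict(query, run_mode=run_mode)`, with the `except` clause adapted to the actual error type.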

The `optimize` parameter in `KumoRFM` can help with database backends by creating indices for faster sampling:

```python
# Assumes `rfm` is the imported KumoRFM module and `graph` is an existing graph.
model = rfm.KumoRFM(graph, optimize=True)
```