This page covers the runtime configuration options for KumoRFM, including run modes, temporal behavior, inference behavior, batch prediction, and retry handling.
Run Modes
The run_mode parameter controls the trade-off between prediction quality and speed by adjusting how much context data is sampled.
| Run Mode | Context Size | Neighbor Sampling | Use Case |
|---|---|---|---|
| DEBUG | 100 | [16, 16, 4, 4, 1, 1] | Quick iteration, testing queries |
| FAST | 1,000 | [32, 32, 8, 8, 4, 4] | Default. Good balance of speed and quality |
| NORMAL | 5,000 | [64, 64, 8, 8, 4, 4] | Higher quality predictions |
| BEST | 10,000 | [64, 64, 8, 8, 4, 4] | Maximum quality |
For forecasting tasks, when an entity has more historical rows than the Context Size cap, KumoRFM keeps the most recent rows up to that cap and drops the oldest ones. This ensures forecasts reflect the latest data patterns.
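The truncation rule above can be illustrated with a minimal sketch. It is not KumoRFM's internal sampler: the helper name most_recent_rows and the (timestamp, value) row layout are assumptions made purely for illustration.

```python
from datetime import date


def most_recent_rows(rows, context_size):
    """Keep the `context_size` newest rows, returned in chronological order.

    Hypothetical helper: `rows` is a list of (timestamp, value) tuples.
    """
    if context_size <= 0:
        return []
    ordered = sorted(rows, key=lambda row: row[0])
    # Slicing from the end keeps the newest rows and drops the oldest.
    return ordered[-context_size:]


# Seven daily observations, capped at the three most recent.
history = [(date(2024, 1, day), day * 10) for day in range(1, 8)]
kept = most_recent_rows(history, context_size=3)
print(kept)
```

When the cap is larger than the available history, the sketch returns every row unchanged.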
# Use the fastest mode for quick testing
result = model.predict(query, run_mode="DEBUG")
# Use the highest quality mode for production
result = model.predict(query, run_mode="BEST")
You can also fine-tune the neighbor sampling directly:
result = model.predict(
    query,
    num_neighbors=[64, 64, 8, 8, 4, 4],
)
Temporal and Context Timing
Use these parameters in KumoRFM.predict() and KumoRFM.evaluate() when you need to control the prediction timestamp or the historical examples used as model context.
| Option | Default | Description |
|---|---|---|
| anchor_time | None | The anchor timestamp for the prediction. If set to None, KumoRFM uses the maximum timestamp in the data. If set to "entity", KumoRFM uses each entity’s own timestamp. |
| context_anchor_time | None | The maximum anchor timestamp for context examples. If set to None, anchor_time determines the anchor time for context examples. |
| use_prediction_time | False | Whether to use the anchor timestamp as an additional feature during prediction. When True, the anchor time is included as a feature for all task types, including forecasting. |
| lag_timesteps | 0 | Number of past timesteps to include as lagged target features for temporal predictive queries. |
For example, set anchor_time when you want to predict as of a specific point in time:
import pandas as pd

result = model.predict(
    query,
    indices=[0, 1, 2],
    anchor_time=pd.Timestamp("2024-06-01"),
)
Use context_anchor_time when the prediction date and the latest available context data should differ:
result = model.predict(
    query,
    indices=[0, 1, 2],
    anchor_time=pd.Timestamp("2024-06-01"),
    context_anchor_time=pd.Timestamp("2024-03-01"),
)
Set lag_timesteps when recent historical target values should be available to the model as additional context. For example, lag_timesteps=3 adds the previous three target windows as lagged features:
result = model.predict(
    query,
    indices=[0, 1, 2],
    lag_timesteps=3,
)
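To make the idea of lagged target features concrete, the sketch below builds them for a plain list of per-window target values. It is an illustration only: lagged_targets is a hypothetical helper, the padding with None is an assumption, and how KumoRFM derives these features internally is not shown here.

```python
def lagged_targets(targets, lag_timesteps):
    """For each window t, collect the targets of windows t-1 .. t-lag_timesteps.

    Windows without enough history are padded with None.
    """
    rows = []
    for t in range(len(targets)):
        lags = []
        for k in range(1, lag_timesteps + 1):
            lags.append(targets[t - k] if t - k >= 0 else None)
        rows.append(lags)
    return rows


# Hypothetical target values for five consecutive windows.
weekly_order_counts = [4, 7, 5, 9, 6]
features = lagged_targets(weekly_order_counts, lag_timesteps=3)
print(features[-1])  # the three windows preceding the last one, newest first
```

With lag_timesteps=0, every row in the sketch is empty, which mirrors the default of adding no lagged features.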
Inference Configuration
The inference_config parameter controls inference-time model behavior, including ensembling. You can pass either a dictionary or a configuration object from kumoapi.rfm.
When you pass a dictionary, KumoRFM casts it based on the task type:
- Classification tasks use ClassificationInferenceConfig.
- Regression and forecasting tasks use RegressionInferenceConfig.
If you omit inference_config, KumoRFM selects defaults automatically based on the task type.
Common options:
| Option | Description |
|---|---|
| num_estimators | Number of estimators to ensemble. Defaults to 1 and must be between 1 and 4. |
| column_shuffle | Whether to shuffle column order across estimators. |
| category_shuffle | Whether to shuffle categories within categorical columns across estimators. |
| hop_shuffle | Whether to shuffle subgraph depth across estimators. |
Classification option:
| Option | Description |
|---|---|
| class_shuffle | Whether to shuffle class order across estimators. |
Regression and forecasting options:
| Option | Description |
|---|---|
| target_transforms | Target preprocessing transforms to vary across estimators. Supported values are "clip", "power", "quantile", and None. Defaults to ["quantile"]. |
| output_type | How to summarize the output distribution. Supported values are "median", "mean", and "quantiles". Defaults to "median". |
When output_type="quantiles", the prediction output contains 27 quantile columns instead of a single TARGET_PRED column:
q_0.005 q_0.01 q_0.02 q_0.025 q_0.05 q_0.1 q_0.15
q_0.2 q_0.25 q_0.3 q_0.35 q_0.4 q_0.45 q_0.5
q_0.55 q_0.6 q_0.65 q_0.7 q_0.75 q_0.8 q_0.85
q_0.9 q_0.95 q_0.975 q_0.98 q_0.99 q_0.995
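When working with the quantile output, it can help to generate the column names programmatically and to pick pairs that bound a central prediction interval. The sketch below is a convenience written for this page, not part of the KumoRFM API: QUANTILE_LEVELS transcribes the 27 levels listed above, and interval_columns is a hypothetical helper.

```python
# Quantile levels transcribed from the column list above.
QUANTILE_LEVELS = [
    0.005, 0.01, 0.02, 0.025, 0.05, 0.1, 0.15,
    0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5,
    0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85,
    0.9, 0.95, 0.975, 0.98, 0.99, 0.995,
]
QUANTILE_COLUMNS = [f"q_{level}" for level in QUANTILE_LEVELS]


def interval_columns(coverage):
    """Return the (lower, upper) column names of a central interval.

    Raises ValueError when the listed quantiles cannot express `coverage`.
    """
    # Round to three decimals to absorb floating-point noise.
    lower = round((1 - coverage) / 2, 3)
    upper = round(1 - lower, 3)
    if lower not in QUANTILE_LEVELS or upper not in QUANTILE_LEVELS:
        raise ValueError(f"no quantile pair for coverage={coverage}")
    return f"q_{lower}", f"q_{upper}"


print(interval_columns(0.9))
```

Selecting the two returned columns from the prediction DataFrame would give the bounds of the interval, assuming the columns are named exactly as listed.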
Classification example:
result = model.predict(
    query,
    indices=[0, 1, 2],
    inference_config=dict(
        num_estimators=4,
        column_shuffle=True,
        hop_shuffle=True,
        class_shuffle=True,
    ),
)
Regression example:
result = model.predict(
    regression_query,
    indices=[0, 1, 2],
    inference_config=dict(
        num_estimators=4,
        column_shuffle=True,
        target_transforms=["quantile", "clip", "power", None],
        output_type="median",
    ),
)
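Because inference_config can be passed as a plain dictionary, a typo in an option value may only surface once the request is made. The sketch below is a hypothetical client-side pre-check that encodes only the constraints stated in the tables above. check_regression_config is not a KumoRFM function and does not replace whatever validation KumoRFM itself applies.

```python
# Supported values transcribed from the regression and forecasting options table.
SUPPORTED_TARGET_TRANSFORMS = {"clip", "power", "quantile", None}
SUPPORTED_OUTPUT_TYPES = {"median", "mean", "quantiles"}


def check_regression_config(config):
    """Return a list of problems found in a regression inference_config dict."""
    problems = []
    num_estimators = config.get("num_estimators", 1)
    if not isinstance(num_estimators, int) or not 1 <= num_estimators <= 4:
        problems.append("num_estimators must be an integer between 1 and 4")
    for transform in config.get("target_transforms", ["quantile"]):
        if transform not in SUPPORTED_TARGET_TRANSFORMS:
            problems.append(f"unsupported target transform: {transform!r}")
    output_type = config.get("output_type", "median")
    if output_type not in SUPPORTED_OUTPUT_TYPES:
        problems.append(f"unsupported output_type: {output_type!r}")
    return problems


config = dict(num_estimators=8, output_type="mode")
print(check_regression_config(config))
```

An empty list means the dictionary is consistent with the documented ranges; it says nothing about options this page does not describe.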
Output and Collection Controls
These options control what predict() returns and how KumoRFM collects valid context labels.
| Option | Default | Description |
|---|---|---|
| return_embeddings | False | Whether to include embeddings for each prediction example in the output DataFrame. |
| explain | False | Whether to return an Explanation object instead of a plain prediction DataFrame. Explainability currently supports single-entity predictions with run_mode="FAST". See Prediction Explainability. |
| max_pq_iterations | 10 | Maximum number of iterations used to collect valid labels. Increase this when a predictive query has strict entity filters and KumoRFM needs to sample more entities to find enough valid labels. |
| random_seed | 42 | Manual seed for pseudo-random sampling. |
| verbose | True | Whether to print progress output during prediction or evaluation. |
result = model.predict(
    query,
    indices=[0, 1, 2],
    return_embeddings=True,
    max_pq_iterations=20,
    random_seed=42,
)
Batch Mode
For predictions over many entities, use KumoRFM.batch_mode() to automatically split the workload into batches:
with model.batch_mode(batch_size='max', num_retries=1):
    result = model.predict(
        "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1",
        indices=list(range(1, 1001)),
    )
Parameters:
- batch_size: The number of entities per batch. Set to "max" (default) to use the maximum applicable batch size for the task type.
- num_retries: Number of times to retry a batch that fails because of server issues.
The maximum prediction sizes per task type are:
| Task Type | Max Prediction Size | Max Test Size |
|---|---|---|
| Classification / Regression / Forecasting | 1,000 | 2,000 |
| Temporal Link Prediction | 200 | 400 |
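The effect of these caps on batching can be sketched with a small chunking helper. This illustrates the arithmetic only, not KumoRFM's batching code: the task-type keys, the helper name split_into_batches, and the clamping of an explicit batch_size to the cap are all assumptions.

```python
# Maximum prediction sizes per task type, transcribed from the table above.
# The dictionary keys are hypothetical labels chosen for this sketch.
MAX_PREDICTION_SIZE = {
    "classification": 1000,
    "regression": 1000,
    "forecasting": 1000,
    "temporal_link_prediction": 200,
}


def split_into_batches(indices, task_type, batch_size="max"):
    """Split `indices` into consecutive batches no larger than the task's cap."""
    cap = MAX_PREDICTION_SIZE[task_type]
    # Assumption: an explicit batch size above the cap is clamped to the cap.
    size = cap if batch_size == "max" else min(int(batch_size), cap)
    if size < 1:
        raise ValueError("batch_size must be at least 1")
    return [indices[i:i + size] for i in range(0, len(indices), size)]


indices = list(range(1, 1001))
print(len(split_into_batches(indices, "classification")))
print(len(split_into_batches(indices, "temporal_link_prediction")))
```

Under these assumptions, 1,000 entities fit in a single classification batch but need five batches for temporal link prediction.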
Retry
Use KumoRFM.retry() to automatically retry queries that fail because of transient server issues:
with model.retry(num_retries=2):
    result = model.predict(query, indices=[1, 2, 3])
This is useful for long-running batch predictions where occasional failures are expected.
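For readers who want to see what retrying on transient failures amounts to, here is a generic sketch of the pattern. It does not reflect how KumoRFM.retry() is implemented: TransientServerError, call_with_retries, and the fixed delay are hypothetical stand-ins.

```python
import time


class TransientServerError(Exception):
    """Hypothetical stand-in for a temporary server-side failure."""


def call_with_retries(fn, num_retries=2, delay_seconds=0.0):
    """Call `fn`, retrying up to `num_retries` times on TransientServerError."""
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except TransientServerError:
            if attempt == num_retries:
                raise  # retries exhausted; surface the last error
            time.sleep(delay_seconds)


attempts = []


def flaky_predict():
    """Fail twice, then succeed, to simulate a short outage."""
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientServerError("temporary outage")
    return "ok"


print(call_with_retries(flaky_predict, num_retries=2))
```

With num_retries=2 the call is attempted at most three times, so the simulated outage above is absorbed; with num_retries=1 the same outage would raise.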
Size Limits
KumoRFM enforces a 30 MB context size limit per prediction. If a prediction exceeds this limit, the error message suggests:
- Reducing the number of tables in the graph
- Reducing the number of columns (e.g., large text columns)
- Adjusting the neighborhood configuration
- Using a lower run mode
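When deciding which columns to drop, a rough per-column size estimate can point to the heaviest ones. The sketch below is a generic heuristic over in-memory rows, not a KumoRFM utility: column_sizes is a hypothetical helper, and raw byte counts are only a proxy for how much a column contributes to the sampled context.

```python
def column_sizes(rows):
    """Return (column, total UTF-8 bytes) pairs, largest first.

    `rows` is a list of dicts mapping column name to value.
    """
    totals = {}
    for row in rows:
        for column, value in row.items():
            size = len(str(value).encode("utf-8"))
            totals[column] = totals.get(column, 0) + size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample rows with one large free-text column.
rows = [
    {"user_id": 1, "bio": "x" * 500, "country": "DE"},
    {"user_id": 2, "bio": "y" * 800, "country": "US"},
]
print(column_sizes(rows))
```

In this toy data the free-text bio column dominates, which makes it the first candidate to exclude.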
The optimize parameter in KumoRFM can help with database backends by creating indices for faster sampling:
model = rfm.KumoRFM(graph, optimize=True)