
This page covers the runtime configuration options for KumoRFM, including run modes, temporal behavior, inference behavior, batch prediction, and retry handling.

## Run Modes

The run_mode parameter controls the trade-off between prediction quality and speed by adjusting how much context data is sampled.

| Run Mode | Context Size | Neighbor Sampling | Use Case |
|---|---|---|---|
| DEBUG | 100 | [16, 16, 4, 4, 1, 1] | Quick iteration, testing queries |
| FAST | 1,000 | [32, 32, 8, 8, 4, 4] | Default. Good balance of speed and quality |
| NORMAL | 5,000 | [64, 64, 8, 8, 4, 4] | Higher quality predictions |
| BEST | 10,000 | [64, 64, 8, 8, 4, 4] | Maximum quality |

```python
# Use the fastest mode for quick testing
result = model.predict(query, run_mode="DEBUG")

# Use the highest quality mode for production
result = model.predict(query, run_mode="BEST")
```
You can also fine-tune the neighbor sampling directly:
```python
result = model.predict(
    query,
    num_neighbors=[64, 64, 8, 8, 4, 4],
    num_hops=3,
)
```

## Temporal and Context Timing

Use these parameters in KumoRFM.predict() and KumoRFM.evaluate() when you need to control the prediction timestamp or the historical examples used as model context.

| Option | Default | Description |
|---|---|---|
| anchor_time | None | The anchor timestamp for the prediction. If set to None, KumoRFM uses the maximum timestamp in the data. If set to "entity", KumoRFM uses each entity's own timestamp. |
| context_anchor_time | None | The maximum anchor timestamp for context examples. If set to None, anchor_time determines the anchor time for context examples. |
| use_prediction_time | False | Whether to use the anchor timestamp as an additional feature during prediction. KumoRFM enforces this automatically for time series forecasting tasks. |
| lag_timesteps | 0 | Number of past timesteps to include as lagged target features for temporal predictive queries. |

For example, set anchor_time when you want to predict as of a specific point in time:
```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    anchor_time=pd.Timestamp("2024-06-01"),
)
```
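When anchor_time is left as None, predictions are anchored at the latest timestamp in the data. A conceptual sketch of that default, assuming a hypothetical pandas table with a registered time column:

```python
import pandas as pd

# Hypothetical orders table; "time" is its registered time column.
orders = pd.DataFrame({
    "user_id": [0, 1, 2],
    "time": pd.to_datetime(["2024-01-15", "2024-05-20", "2024-06-01"]),
})

# With anchor_time=None, the effective anchor is the maximum timestamp:
default_anchor = orders["time"].max()
```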
Use context_anchor_time when the latest data available as model context should come from an earlier date than the prediction itself:
```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    anchor_time=pd.Timestamp("2024-06-01"),
    context_anchor_time=pd.Timestamp("2024-03-01"),
)
```
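Conceptually, context_anchor_time restricts context examples to data observed at or before that timestamp. A rough pandas sketch of the cutoff, on hypothetical data (this is not KumoRFM's internal sampling logic):

```python
import pandas as pd

orders = pd.DataFrame({
    "user_id": [0, 1, 2, 0],
    "time": pd.to_datetime(["2024-01-10", "2024-02-28", "2024-04-15", "2024-05-30"]),
})

context_anchor = pd.Timestamp("2024-03-01")
# Only events at or before the context anchor are eligible as context:
context_events = orders[orders["time"] <= context_anchor]
```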
Set lag_timesteps when recent historical target values should be available to the model as additional context. For example, lag_timesteps=3 adds the previous three target windows as lagged features:
```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    lag_timesteps=3,
)
```
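In spirit, lagged target features behave like shifted copies of the target series. A minimal pandas sketch on hypothetical weekly targets (not KumoRFM's internal feature construction):

```python
import pandas as pd

# Hypothetical weekly target values for a single entity.
weekly = pd.DataFrame({"target": [5.0, 3.0, 8.0, 6.0]})

# lag_timesteps=3 conceptually exposes the previous three target windows:
for lag in (1, 2, 3):
    weekly[f"target_lag_{lag}"] = weekly["target"].shift(lag)
```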

## Inference Configuration

The inference_config parameter controls inference-time model behavior, including ensembling. You can pass either a dictionary or a configuration object from kumoapi.rfm. When you pass a dictionary, KumoRFM converts it into the appropriate configuration object for the task type:
  • Classification tasks use ClassificationInferenceConfig.
  • Regression and forecasting tasks use RegressionInferenceConfig.
If you omit inference_config, KumoRFM selects defaults automatically based on the task type. Common options:

| Option | Description |
|---|---|
| num_estimators | Number of estimators to ensemble. Defaults to 1 and must be between 1 and 4. |
| column_shuffle | Whether to shuffle column order across estimators. |
| category_shuffle | Whether to shuffle categories within categorical columns across estimators. |
| hop_shuffle | Whether to shuffle subgraph depth across estimators. |

Classification option:

| Option | Description |
|---|---|
| class_shuffle | Whether to shuffle class order across estimators. |

Regression and forecasting options:

| Option | Description |
|---|---|
| target_transforms | Target preprocessing transforms to vary across estimators. Supported values are "clip", "power", "quantile", and None. Defaults to ["quantile"]. |
| output_type | How to summarize the output distribution. Supported values are "median" and "mean". Defaults to "median". |

Classification example:
```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    inference_config=dict(
        num_estimators=4,
        column_shuffle=True,
        hop_shuffle=True,
        class_shuffle=True,
    ),
)
```
Regression example:
```python
result = model.predict(
    regression_query,
    indices=[0, 1, 2],
    inference_config=dict(
        num_estimators=4,
        column_shuffle=True,
        target_transforms=["quantile", "clip", "power", None],
        output_type="median",
    ),
)
```
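With num_estimators greater than 1, output_type determines how the per-estimator predictions are combined into a single value. A minimal sketch of that aggregation, with illustrative numbers (not KumoRFM internals):

```python
import numpy as np

# Hypothetical predictions from four estimators for one example:
estimates = np.array([10.0, 12.0, 11.0, 40.0])

median_out = float(np.median(estimates))  # less sensitive to the outlier estimator
mean_out = float(np.mean(estimates))
```

The median is robust to a single estimator producing an outlying prediction, which can make it a safer summary for skewed targets.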

## Output and Collection Controls

These options control what predict() returns and how KumoRFM collects valid context labels.

| Option | Default | Description |
|---|---|---|
| return_embeddings | False | Whether to include embeddings for each prediction example in the output DataFrame. |
| explain | False | Whether to return an Explanation object instead of a plain prediction DataFrame. Explainability currently supports single-entity predictions with run_mode="FAST". See Prediction Explainability. |
| max_pq_iterations | 10 | Maximum number of iterations used to collect valid labels. Increase this when a predictive query has strict entity filters and KumoRFM needs to sample more entities to find enough valid labels. |
| random_seed | fixed seed | Manual seed for pseudo-random sampling. |
| verbose | True | Whether to print progress output during prediction or evaluation. |

```python
result = model.predict(
    query,
    indices=[0, 1, 2],
    return_embeddings=True,
    max_pq_iterations=20,
    random_seed=42,
)
```

## Batch Mode

For predictions over many entities, use KumoRFM.batch_mode() to automatically split the workload into batches:
```python
with model.batch_mode(batch_size='max', num_retries=1):
    result = model.predict(
        "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1",
        indices=list(range(1, 1001)),
    )
```
Parameters:
  • batch_size: The number of entities per batch. Set to "max" (default) to use the maximum applicable batch size for the task type.
  • num_retries: Number of times to retry batches that fail due to transient server issues.
The maximum prediction sizes per task type are:

| Task Type | Max Prediction Size | Max Test Size |
|---|---|---|
| Classification / Regression / Forecasting | 1,000 | 2,000 |
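Conceptually, batch mode slices the index list into chunks no larger than the applicable maximum prediction size and issues one prediction call per chunk. A simplified sketch of that splitting (not KumoRFM's implementation):

```python
def chunk_indices(indices, batch_size):
    """Split entity indices into consecutive batches of at most batch_size."""
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]

# 2,500 entities with a 1,000-entity cap yield batches of 1,000, 1,000, and 500:
batches = chunk_indices(list(range(2500)), 1000)
```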

## Retry

Use KumoRFM.retry() to automatically retry queries that fail due to transient server issues:
```python
with model.retry(num_retries=2):
    result = model.predict(query, indices=[1, 2, 3])
```
This is useful for long-running batch predictions where occasional failures are expected.
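At its core, a retry wrapper like this re-invokes the wrapped call when it raises a transient error. A simplified standalone sketch (not KumoRFM's implementation; the exception type and backoff are assumptions for illustration):

```python
import time

def with_retries(fn, num_retries=2, delay=0.0):
    """Call fn, retrying up to num_retries times on RuntimeError."""
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except RuntimeError:
            if attempt == num_retries:
                raise
            time.sleep(delay)

calls = []

def flaky():
    calls.append(1)
    if len(calls) < 2:          # fail once, then succeed
        raise RuntimeError("transient server issue")
    return "ok"

result = with_retries(flaky, num_retries=2)
```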

## Size Limits

KumoRFM enforces a 30 MB context size limit per prediction. If exceeded, you will see an error message suggesting:
  • Reducing the number of tables in the graph
  • Reducing the number of columns (e.g., large text columns)
  • Adjusting the neighborhood configuration
  • Using a lower run mode
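To get a rough sense of whether your raw tables are anywhere near the limit, you can measure their in-memory footprint with pandas. The actual context size also depends on sampling, so treat this only as a ballpark (hypothetical data):

```python
import pandas as pd

# Hypothetical table with a large text column.
df = pd.DataFrame({"review_text": ["x" * 100] * 1000})

approx_bytes = int(df.memory_usage(deep=True).sum())
within_limit = approx_bytes < 30 * 1024 * 1024  # 30 MB context limit
```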
When you use a database backend, the optimize parameter of KumoRFM can speed up sampling by creating indices:
```python
model = rfm.KumoRFM(graph, optimize=True)
```