There are three ways to do inference with Kumo RFM.
| Method | Scale | Best Use Case |
|---|---|---|
| Single entity | 1 | Testing / debugging / agents |
| Multi-entity | ≤ 1,000 per request | Small batches |
| Batch prediction | Unlimited | Full scoring pipelines |
Let’s take a look at how this works using an e-commerce dataset. First, import the data and set up the graph.
import kumoai.experimental.rfm as rfm
import pandas as pd

root = "s3://kumo-sdk-public/rfm-datasets/online-shopping"

df_users = pd.read_parquet(f"{root}/users.parquet")
df_items = pd.read_parquet(f"{root}/items.parquet")
df_orders = pd.read_parquet(f"{root}/orders.parquet")

graph = rfm.LocalGraph.from_data({
    'users': df_users,
    'items': df_items,
    'orders': df_orders,
})

model = rfm.KumoRFM(graph)
We will predict how many orders a user will place in the next 30 days. The Predictive Query expression is:
PREDICT COUNT(orders.*, 0, 30, days)
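The query string is plain text, so it can also be assembled programmatically. A minimal sketch with a hypothetical helper (plain Python string formatting, not part of the Kumo SDK):

```python
# Hypothetical helper that builds the Predictive Query string from its
# parts; the SDK itself just takes the finished string.
def count_orders_query(start: int, end: int, unit: str = "days") -> str:
    return f"PREDICT COUNT(orders.*, {start}, {end}, {unit})"

query = count_orders_query(0, 30)
# "PREDICT COUNT(orders.*, 0, 30, days)"
```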

Single-Entity Inference

Run inference for a single entity:
query = "PREDICT COUNT(orders.*, 0, 30, days) FOR users.user_id = 0"
result = model.query(query)

Multi-Entity Inference (≤ 1,000 Entities)

Score multiple specific entities in a single call:
query = (
    "PREDICT COUNT(orders.*, 0, 30, days) "
    "FOR users.user_id IN (0, 1, 2, 3)"
)

result = model.query(query)
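When the entity IDs live in a Python list, the IN (...) filter can be generated rather than typed by hand. A sketch with a hypothetical helper (not an SDK function); note the 1,000-entity cap per request:

```python
# Hypothetical helper: build a multi-entity query from a list of IDs.
# Multi-entity requests are capped at 1,000 entities.
def in_clause_query(user_ids: list[int]) -> str:
    if len(user_ids) > 1000:
        raise ValueError("multi-entity requests are capped at 1,000 entities")
    ids = ", ".join(str(i) for i in user_ids)
    return f"PREDICT COUNT(orders.*, 0, 30, days) FOR users.user_id IN ({ids})"

query = in_clause_query([0, 1, 2, 3])
```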

Batch Prediction (Full Dataset or Large Batches)

Batch prediction scores every entity in the dataset. The SDK automatically handles batching and retries.
# collect all user IDs
indices = df_users["user_id"].tolist()

with model.batch_mode(batch_size="max", num_retries=1):
    result = model.predict(
        "PREDICT COUNT(orders.*, 0, 30, days) FOR EACH users.user_id",
        indices=indices
    )
Parameters
  • batch_size="max": automatically uses the largest valid batch size
  • num_retries: retry count for transient failures
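Conceptually, batch mode automates a chunk-and-retry loop like the one below. This is an illustrative sketch in plain Python, not the SDK's actual implementation; the `predict` callable stands in for a per-batch scoring call:

```python
from typing import Callable, Iterator

def chunks(ids: list[int], size: int = 1000) -> Iterator[list[int]]:
    # Split the full ID list into fixed-size batches.
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def score_all(ids: list[int], predict: Callable, num_retries: int = 1) -> list:
    # Score each batch, retrying transient failures up to num_retries times.
    results = []
    for batch in chunks(ids):
        for attempt in range(num_retries + 1):
            try:
                results.append(predict(batch))
                break
            except Exception:
                if attempt == num_retries:
                    raise
    return results
```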