Overview
This document provides a step-by-step guide to improving the performance of KumoRFM, a graph-based relational foundation model for predictive analytics. The tutorial demonstrates how to evaluate and optimize model settings on the H&M dataset, available through the RelBench repository.
1. Introduction
The H&M database contains extensive customer and product data from the company’s e-commerce operations. It includes detailed purchase histories and metadata, ranging from demographic information to product attributes. In this example, we predict the total price per article_id over a 7-day window (start_date, start_date + 7 days], but only for items that had non-zero sales in the previous 7 days.
- Prediction Window: (2020-09-07, 2020-09-14]
- Transactions on 2020-09-07 are excluded.
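The half-open window can be made concrete with a small sketch. The helper name in_prediction_window is hypothetical, and timestamps are assumed to be plain dates rather than datetimes.

```python
from datetime import date, timedelta

WINDOW_DAYS = 7


def in_prediction_window(ts: date, start_date: date, days: int = WINDOW_DAYS) -> bool:
    """Return True if ts falls in (start_date, start_date + days]."""
    return start_date < ts <= start_date + timedelta(days=days)


start = date(2020, 9, 7)
print(in_prediction_window(date(2020, 9, 7), start))   # False: the start date is excluded
print(in_prediction_window(date(2020, 9, 8), start))   # True
print(in_prediction_window(date(2020, 9, 14), start))  # True: the end date is included
print(in_prediction_window(date(2020, 9, 15), start))  # False
```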
2. Environment Setup
Install required packages:
3. Data Loading
Download the dataset and prepare data frames for each table:
4. Ground Truth Calculation
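As a dependency-free sketch of the target used in this tutorial (total price per article_id in (start_date, start_date + 7 days], restricted to articles with non-zero sales in the previous 7 days), the function below works on plain (article_id, day, price) records. The function name, the record layout, the choice of (start_date - 7 days, start_date] as the prior window, and the 0.0 target for eligible articles without later sales are all assumptions. The same logic maps directly onto a data-frame group-by.

```python
from collections import defaultdict
from datetime import date, timedelta


def ground_truth(transactions, start_date, days=7):
    """Total price per article_id in (start_date, start_date + days],
    limited to articles with non-zero sales in (start_date - days, start_date]."""
    window = timedelta(days=days)
    past = defaultdict(float)
    future = defaultdict(float)
    for article_id, day, price in transactions:
        if start_date - window < day <= start_date:
            past[article_id] += price
        elif start_date < day <= start_date + window:
            future[article_id] += price
    # Assumption: eligible articles without sales in the window get a target of 0.0.
    return {a: future.get(a, 0.0) for a, total in past.items() if total > 0}


rows = [
    ("A", date(2020, 9, 5), 0.25),
    ("A", date(2020, 9, 8), 0.5),
    ("A", date(2020, 9, 14), 0.25),
    ("B", date(2020, 9, 7), 0.5),    # prior sale on the start date, none afterwards
    ("C", date(2020, 9, 10), 0.75),  # no prior sales, so excluded
]
print(ground_truth(rows, date(2020, 9, 7)))  # {'A': 0.75, 'B': 0.0}
```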
Define a reference function to compute the target variable: total sales for each item within the next 7 days, restricted to items with prior sales activity.
5. Model Initialization
Initialize the local relational graph and create a KumoRFM model instance.
6. Evaluation Helper Function
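The tables in Section 7 report MAE. A minimal sketch of that metric follows, assuming predictions and ground truth are both dictionaries keyed by article_id. The function name and the dictionary layout are illustrative and are not the tutorial's own helper.

```python
def mean_absolute_error(y_true: dict, y_pred: dict) -> float:
    """MAE over the entities present in the ground truth."""
    if not y_true:
        raise ValueError("y_true is empty")
    missing = set(y_true) - set(y_pred)
    if missing:
        raise KeyError(f"missing predictions for: {sorted(missing)}")
    return sum(abs(y_true[k] - y_pred[k]) for k in y_true) / len(y_true)


truth = {"A": 0.75, "B": 0.0}
preds = {"A": 0.5, "B": 0.25}
print(mean_absolute_error(truth, preds))  # 0.25
```

Failing loudly on missing predictions keeps runs comparable: every configuration is scored on the same set of entities.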
Define an evaluation function for consistent benchmarking.
7. Performance Optimization
7.1 Align Predictive Query with Criteria
Ensure that the predictive query aligns with the business logic of the task: only predict for articles that had prior sales activity. When the filter WHERE SUM(transactions.price, -7, 0, days) > 0 is not included, all article_ids can be sampled during in-context sampling. When the condition is applied, only eligible articles (those meeting the criteria) are sampled. Here query1 omits the filter and query2 includes it.
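The two query variants can be sketched as plain strings. Only the WHERE clause is taken from the text above. The PREDICT target expression, the FOR ... IN (...) entity list, the placeholder article ids, and the helper name build_query are assumptions about how the queries might be written.

```python
TARGET = "SUM(transactions.price, 0, 7, days)"       # assumed target expression
FILTER = "SUM(transactions.price, -7, 0, days) > 0"  # filter quoted in the text


def build_query(article_ids, with_filter: bool) -> str:
    """Assemble a predictive query string (hypothetical helper)."""
    ids = ", ".join(str(i) for i in article_ids)
    query = f"PREDICT {TARGET} FOR articles.article_id IN ({ids})"
    if with_filter:
        query += f" WHERE {FILTER}"
    return query


placeholder_ids = [1, 2, 3]  # illustrative ids, not real H&M articles
query1 = build_query(placeholder_ids, with_filter=False)
query2 = build_query(placeholder_ids, with_filter=True)
print(query1)
print(query2)
```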
Evaluate both queries:
| Run Mode | Num Neighbors | Use Prediction Time | Query | MAE | Notes |
|---|---|---|---|---|---|
| fast | [] | False | query1 | 0.3913 | Baseline |
| fast | [] | False | query2 | 0.3745 | Better performance |
7.2 Tune Neighborhood Sampling
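Adjusting the number of neighbors (num_neighbors) affects the model’s receptive field. Each list entry sets how many neighbors are sampled at one hop, so [64] samples up to 64 neighbors over a single hop and [64, 64] adds a second hop. In the results below, a larger single-hop fan-out consistently lowers MAE, while adding a second hop at 32 or 64 neighbors raises it, which suggests that the extra hop introduces noise.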
| Run Mode | Num Neighbors | Use Prediction Time | Query | MAE | Notes |
|---|---|---|---|---|---|
| fast | [] | False | query2 | 0.3745 | Baseline |
| fast | [8] | False | query2 | 0.3297 | Improved |
| fast | [8, 8] | False | query2 | 0.3276 | Slightly better |
| fast | [32] | False | query2 | 0.2506 | Significant gain |
| fast | [32, 32] | False | query2 | 0.3352 | Performance drop |
| fast | [64] | False | query2 | 0.2493 | Best performance |
| fast | [64, 64] | False | query2 | 0.3350 | Performance drop |
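The sweep can also be summarized programmatically. The sketch below only re-reads the MAE values from the table (it does not call the model) and picks the lowest one. The variable names are illustrative.

```python
# MAE per num_neighbors setting, copied from the table (fast mode, query2).
results = {
    (): 0.3745,
    (8,): 0.3297,
    (8, 8): 0.3276,
    (32,): 0.2506,
    (32, 32): 0.3352,
    (64,): 0.2493,
    (64, 64): 0.3350,
}

best = min(results, key=results.get)
print(list(best), results[best])  # [64] 0.2493

# Compare one hop against two hops at the same fan-out.
second_hop_helps = {}
for fanout in (8, 32, 64):
    one_hop, two_hop = results[(fanout,)], results[(fanout, fanout)]
    second_hop_helps[fanout] = two_hop < one_hop
    print(fanout, "second hop helps" if second_hop_helps[fanout] else "second hop hurts")
```

In this table the second hop helps only at a fan-out of 8, and only marginally.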
7.3 Adjust Run Mode
run_mode controls the number of in-context examples used during prediction:
| Run Mode | In-Context Examples | Description |
|---|---|---|
| fast | 1,000 | Quick but less accurate |
| normal | 5,000 | Balanced |
| best | 10,000 | Highest accuracy |
Evaluate each run mode:
| Run Mode | Num Neighbors | Use Prediction Time | Query | MAE | Notes |
|---|---|---|---|---|---|
| fast | [64] | False | query2 | 0.2493 | Baseline |
| normal | [64] | False | query2 | 0.2156 | Improved |
| best | [64] | False | query2 | 0.2106 | Best performance |
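The trade-off between context size and accuracy shows diminishing returns, which a few lines of arithmetic make visible. The sketch uses only the numbers from the two tables above. The variable names are illustrative.

```python
# In-context example budgets and MAE values copied from the tables above.
context_size = {"fast": 1_000, "normal": 5_000, "best": 10_000}
mae = {"fast": 0.2493, "normal": 0.2156, "best": 0.2106}

modes = ["fast", "normal", "best"]
gains = {}
for prev, curr in zip(modes, modes[1:]):
    gain = mae[prev] - mae[curr]                        # absolute MAE reduction
    extra = context_size[curr] - context_size[prev]     # additional in-context examples
    gains[(prev, curr)] = gain
    # MAE reduction per 1,000 additional in-context examples.
    print(f"{prev} -> {curr}: MAE -{gain:.4f} for {extra} more examples "
          f"({1000 * gain / extra:.4f} per 1k)")
```

Moving from fast to normal accounts for most of the improvement. The step from normal to best is much smaller, so best is mainly worthwhile when accuracy matters more than latency.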
7.4 Enable Prediction Time Feature (Optional)
Including prediction_time as a feature (the Use Prediction Time setting in the tables above) can capture temporal seasonality. In this example, enabling it did not improve results.
7.5 Add or Remove Features from the Graph
By default, all the data in the graph is used for prediction. However, overly fine-grained features may introduce noise during in-context learning, and removing them can improve performance. Conversely, you may improve performance by adding signal, either through new tables or through new columns in existing tables.
8. Results Summary
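By iteratively refining predictive query alignment, neighborhood sampling, and run mode, MAE improved from 0.3913 to 0.2106, a reduction of about 46%. Best configuration: run mode best, num_neighbors [64], use prediction time False, and query2.
9. Key Takeaways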
- Align predictive queries with business logic and in-context sampling.
- Optimize neighborhood sampling (num_neighbors).
- Use a higher run_mode (normal or best) for accuracy-sensitive applications.