What Is Anchor Time and Why Is It Important?

Anchor time represents a specific point in time from which you want to predict the future. Conceptually, it marks the boundary between what is known (past and present) and what is to be predicted (future).

Functions of Anchor Time

Anchor time serves three key purposes:

1. Using Information from the Past (Avoiding Data Leakage)

Kumo only uses information up to and including the anchor time. This mirrors how real-world machine learning works: you can only make predictions based on information that is already known. For example, if the anchor time is set to 2025-10-01 00:00:00, Kumo includes all events that occurred on or before this moment as part of the training or inference data:

✅ Events before the anchor time (e.g. 2025-09-30 23:59:59) are included.
✅ Events at 2025-10-01 00:00:00 are also included (the anchor point itself is known).
❌ Events after 2025-10-01 00:00:00 (e.g. 2025-10-01 00:00:01) are not used as input, because they belong to the future being predicted.

This design prevents data leakage, ensuring that no information from the future “bleeds” into model training or evaluation.

For tables with a time column: Kumo uses the time column to include only rows whose time is less than or equal to anchor_time.
For tables without a time column: Kumo assumes the table represents static data (e.g., customer demographics) and includes all records.
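
The inclusion rule above can be illustrated with a small sketch. This is not Kumo's implementation or API; the function visible_rows and the row layout are hypothetical, chosen only to show the "on or before the anchor time" filter and the pass-through for static tables.

```python
from datetime import datetime

def visible_rows(rows, anchor_time, time_key=None):
    """Return the rows usable as model input at anchor_time (illustrative only)."""
    if time_key is None:
        # No time column: treat the table as static data and keep every row.
        return list(rows)
    # Time column present: keep rows at or before the anchor time.
    return [row for row in rows if row[time_key] <= anchor_time]

anchor = datetime(2025, 10, 1, 0, 0, 0)
events = [
    {"id": 1, "time": datetime(2025, 9, 30, 23, 59, 59)},  # before the anchor
    {"id": 2, "time": datetime(2025, 10, 1, 0, 0, 0)},     # exactly at the anchor
    {"id": 3, "time": datetime(2025, 10, 1, 0, 0, 1)},     # after the anchor
]
demographics = [{"customer_id": "001"}, {"customer_id": "002"}]

kept_ids = [row["id"] for row in visible_rows(events, anchor, time_key="time")]
static_count = len(visible_rows(demographics, anchor))
```

Here kept_ids contains the first two events only, while the static table is returned unchanged.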

2. Predicting the Future

Once the anchor time is set, Kumo uses it to predict what happens after that point. For instance, the following pQuery predicts the number of transactions in the next 30 days:

PREDICT COUNT(transactions.*, 0, 30, days)
FOR EACH customers.customer_id

If the anchor time is 2025-10-01 00:00:00, Kumo predicts behavior during the next 30 days, so the prediction window is (2025-10-01 00:00:00, 2025-10-31 00:00:00]. As a rule of thumb, the window is always left-open and right-closed:

Events exactly at the anchor time (e.g., 2025-10-01 00:00:00) are not part of the prediction target, because they belong to the known data.
Events exactly at the upper bound (2025-10-31 00:00:00) are included in the prediction.
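
A minimal sketch of the left-open / right-closed rule, assuming a plain list of transaction timestamps for one customer. The helper in_prediction_window is hypothetical and only mimics how a COUNT(transactions.*, 0, 30, days) label would be bounded. It is not how Kumo computes labels internally.

```python
from datetime import datetime, timedelta

def in_prediction_window(event_time, anchor_time, start_days=0, end_days=30):
    """True if event_time falls in (anchor + start, anchor + end]."""
    lower = anchor_time + timedelta(days=start_days)
    upper = anchor_time + timedelta(days=end_days)
    # Left-open: an event exactly at the lower bound is excluded.
    # Right-closed: an event exactly at the upper bound is included.
    return lower < event_time <= upper

anchor = datetime(2025, 10, 1, 0, 0, 0)
transactions = [
    datetime(2025, 10, 1, 0, 0, 0),   # at the anchor: known data, not a target
    datetime(2025, 10, 1, 0, 0, 1),   # just inside the window
    datetime(2025, 10, 31, 0, 0, 0),  # exactly at the upper bound: included
    datetime(2025, 10, 31, 0, 0, 1),  # just past the window
]
label = sum(in_prediction_window(t, anchor) for t in transactions)
```

With these four timestamps, label counts the two transactions that fall inside the window.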

3. Encoding Time Information

The anchor time also plays a critical role in encoding temporal information, in both absolute and relative terms.

Relative time is calculated as the time difference between the event time and the anchor time (e.g., the number of days before the anchor time).
Absolute time encodes the anchor time using calendar components such as year, month, day, and day of the week.

This encoding strategy helps the model better understand when events happen and how they relate to each other. It improves model robustness, especially when training across multiple timeframes.
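
As a rough illustration of the two encodings, the sketch below derives one relative feature and a few absolute calendar features. The function encode_time and its feature names are assumptions made for this example; the actual features Kumo builds are not specified here.

```python
from datetime import datetime

def encode_time(event_time, anchor_time):
    """Build illustrative relative and absolute time features."""
    delta = anchor_time - event_time
    return {
        # Relative: how long before the anchor time the event happened.
        "days_before_anchor": delta.total_seconds() / 86400.0,
        # Absolute: calendar components of the anchor time itself.
        "anchor_year": anchor_time.year,
        "anchor_month": anchor_time.month,
        "anchor_day": anchor_time.day,
        "anchor_weekday": anchor_time.weekday(),  # Monday = 0 ... Sunday = 6
    }

features = encode_time(datetime(2025, 9, 24), datetime(2025, 10, 1))
```

An event on 2025-09-24 with an anchor time of 2025-10-01 yields days_before_anchor of 7.0, independent of which calendar date the anchor falls on.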

System Default Behavior of Anchor Time

1. Time Column Treatment

All time columns are automatically converted to timestamps.
For example, the date value 2025-10-01 becomes the timestamp 2025-10-01 00:00:00.
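
The same conversion can be reproduced in plain Python. The helper to_timestamp is hypothetical and only shows the effect of padding a date with midnight.

```python
from datetime import date, datetime, time

def to_timestamp(value):
    """Convert a date to a midnight timestamp; leave timestamps unchanged."""
    # datetime is a subclass of date, so check for it first.
    if isinstance(value, datetime):
        return value
    return datetime.combine(value, time.min)  # time.min is 00:00:00

converted = to_timestamp(date(2025, 10, 1))
```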

2. Anchor time in batch prediction

If no anchor time is specified, Kumo defaults to the latest timestamp in the fact table. For instance, if your prediction is based on a transactions table, batch prediction defaults to the maximum timestamp (max(timestamp)) found in that table.
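
To preview which anchor time this default would select for your own data, a one-line check like the following is enough. The function name and the "timestamp" key are assumptions for this sketch.

```python
from datetime import datetime

def default_anchor_time(fact_rows, time_key="timestamp"):
    """Return the latest timestamp in the fact table (illustrative only)."""
    return max(row[time_key] for row in fact_rows)

transactions = [
    {"customer_id": "001", "timestamp": datetime(2025, 9, 28, 14, 5, 0)},
    {"customer_id": "002", "timestamp": datetime(2025, 10, 1, 9, 30, 0)},
    {"customer_id": "001", "timestamp": datetime(2025, 9, 30, 18, 0, 0)},
]
anchor = default_anchor_time(transactions)
```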

3. Anchor times in training example generation

To generate training examples, Kumo travels back in time and “replays” user behavior at different past time points, sampling data appropriately. It automatically generates multiple anchor times based on the data and the query, and then generates labels for each anchor time. The interval between anchor times (that is, how Kumo slices the data) is determined by the query itself. For example, if the query uses a 30-day interval, Kumo creates 30-day slices and corresponding anchor times spaced 30 days apart. If the query instead specifies a 7-day interval, Kumo uses 7-day slices and anchor times separated by 7 days.
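
The spacing of anchor times can be sketched as follows. This is a simplified illustration under stated assumptions, not Kumo's sampling logic: it assumes anchors are laid out by stepping back from the latest timestamp in whole intervals, and that the most recent anchor sits one full interval before the latest timestamp so its label window is complete. The function candidate_anchor_times is hypothetical.

```python
from datetime import datetime, timedelta

def candidate_anchor_times(earliest, latest, interval_days):
    """List anchor times spaced interval_days apart, oldest first."""
    step = timedelta(days=interval_days)
    anchors = []
    # Assumption: leave one full interval after the newest anchor for labels.
    current = latest - step
    while current >= earliest:
        anchors.append(current)
        current -= step
    return sorted(anchors)

anchors = candidate_anchor_times(
    earliest=datetime(2025, 1, 1),
    latest=datetime(2025, 4, 1),
    interval_days=30,
)
gaps = [later - earlier for earlier, later in zip(anchors, anchors[1:])]
```

Every entry in gaps equals 30 days, matching the interval used in the query.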

Special Case: Date as Time Column

Converting dates to midnight timestamps (see Time Column Treatment above) can cause subtle issues in daily-aggregated data. Example:

Time Column	customer_id	total_transactions
2025-09-30	001	1
2025-10-01	001	2
2025-10-02	001	7
…	…	…

Suppose you want to predict customer transactions from 2025-10-01 to 2025-10-30. If you set the anchor time to 2025-10-01, the record for 2025-10-01 (converted to 2025-10-01 00:00:00) is treated as known information, not as part of the prediction window. The effective prediction window is therefore (2025-10-01 00:00:00, 2025-10-31 00:00:00], which covers the records for October 2–31. It is still a 30-day window, but shifted forward by one day.

Solution: Adjusting the Anchor Time

To align your prediction window with your intended period, offset the anchor time by a small delta (e.g. 1 second). For example, set it to 2025-09-30 23:59:59 (1 second before 2025-10-01). Then:

Past data includes everything up to 2025-09-30 23:59:59.
The prediction window becomes (2025-09-30 23:59:59, 2025-10-30 23:59:59], which corresponds to predicting events from October 1–30 in your original aggregated data.
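
The effect of the one-second offset can be checked with a short sketch. The helper shifted_window is hypothetical; it builds the adjusted window and applies the left-open / right-closed rule to daily rows stored as midnight timestamps.

```python
from datetime import datetime, timedelta

def shifted_window(first_day, horizon_days=30, delta=timedelta(seconds=1)):
    """Return (anchor, upper bound) with the anchor moved back by delta."""
    anchor = first_day - delta
    return anchor, anchor + timedelta(days=horizon_days)

anchor, upper = shifted_window(datetime(2025, 10, 1))

# Daily-aggregated rows from 2025-09-30 through 2025-11-01, one per day.
daily_rows = [datetime(2025, 9, 30) + timedelta(days=i) for i in range(33)]

# Rows the prediction targets: strictly after the anchor, up to the bound.
targets = [day for day in daily_rows if anchor < day <= upper]
```

With the shifted anchor, targets starts at the October 1 row and ends at the October 30 row, which is the intended 30-day period.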
