
Time Encoding: Representing Timestamps and Temporal Order in Graph Features

A Unix timestamp is a terrible feature. Time encoding transforms raw timestamps into rich vector representations that capture periodicity, recency, and relative ordering, giving GNNs the ability to learn temporal patterns.

PyTorch Geometric

TL;DR

  • Time encoding converts raw timestamps into d-dimensional vectors that capture temporal structure: periodicity (weekday, month, season), recency (time since event), and relative order (before/after relationships).
  • Three main approaches: sinusoidal encoding (fixed, captures periodicity), learnable encoding (data-driven), and relative time encoding (models time differences between events).
  • Relative time encoding is often more useful than absolute because relational patterns are typically relative: 'repurchased within 7 days' matters more than 'purchased on March 1.'
  • Time encoding applies to both edges (when did this transaction happen?) and nodes (when was this account created?). Edge time encodings are added to messages during aggregation.
  • KumoRFM uses learned time encodings that capture both absolute calendar features and relative inter-event timing, enabling zero-shot temporal predictions on new databases.

Time encoding converts raw timestamps into dense vector representations that graph neural networks can learn from. Instead of feeding a Unix timestamp (a single large number) into the model, time encoding produces a d-dimensional vector that captures the temporal structure: what day of the week, how recent, how far apart from other events, and what periodic patterns are present.

Why raw timestamps fail

A Unix timestamp like 1709251200 tells a neural network almost nothing useful. It is:

  • Monotonically increasing: Values only ever grow, so two moments that should look alike (this Friday and last Friday) receive arbitrarily different values.
  • Arbitrarily scaled: The difference between two timestamps in seconds has no natural relationship to the patterns you want to capture.
  • Not periodic: Weekly, monthly, and seasonal patterns are invisible in a linear number, so the model cannot learn that Friday and Monday are similar (both weekdays) while Friday and Saturday behave differently.

Sinusoidal time encoding

Borrowed from transformer positional encoding, sinusoidal time encoding uses sine and cosine functions at different frequencies to represent time:

sinusoidal_time.py
import torch
import math

def sinusoidal_time_encoding(timestamps, dim=64):
    """Fixed sinusoidal time encoding (dim must be even)."""
    # timestamps: (N,) tensor. In practice, normalize first
    # (e.g. seconds since the earliest event) to keep magnitudes small.
    t = timestamps.unsqueeze(-1).float()  # (N, 1)

    # Geometric progression of frequencies: each sin/cos pair
    # covers one timescale, from fast to slow
    freqs = torch.exp(
        torch.arange(0, dim, 2).float() * -(math.log(10000.0) / dim)
    )  # (dim/2,)

    # Interleave sine and cosine components
    enc = torch.zeros(len(timestamps), dim)
    enc[:, 0::2] = torch.sin(t * freqs)  # even dimensions
    enc[:, 1::2] = torch.cos(t * freqs)  # odd dimensions
    return enc

# Low frequencies capture long periods (seasons, years)
# High frequencies capture short periods (hours, days)

Sinusoidal encoding is parameter-free and captures multi-scale periodicity. Low-frequency components encode seasons; high-frequency components encode hours.
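A quick sanity check makes the multi-scale behavior concrete. The sketch below (restating the function above so it runs standalone) shifts time by one second: the highest-frequency dimension changes sharply, while the lowest-frequency dimension barely moves.

```python
import torch
import math

def sinusoidal_time_encoding(timestamps, dim=64):
    t = timestamps.unsqueeze(-1).float()
    freqs = torch.exp(
        torch.arange(0, dim, 2).float() * -(math.log(10000.0) / dim)
    )
    enc = torch.zeros(len(timestamps), dim)
    enc[:, 0::2] = torch.sin(t * freqs)
    enc[:, 1::2] = torch.cos(t * freqs)
    return enc

# Two events one second apart (normalized time)
t = torch.tensor([0.0, 1.0])
enc = sinusoidal_time_encoding(t, dim=64)

fast = abs(enc[0, 0] - enc[1, 0]).item()    # highest-frequency sine: large change
slow = abs(enc[0, 62] - enc[1, 62]).item()  # lowest-frequency sine: tiny change
```

This is why a single encoding vector can represent both "which hour" and "which season" at once.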

Learnable time encoding

Instead of fixed sine/cosine, learn the frequency and phase parameters from data. This adapts to the specific temporal patterns in your dataset:

learnable_time.py
import torch
import torch.nn as nn

class LearnableTimeEncoding(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.freq = nn.Parameter(torch.randn(dim // 2))
        self.phase = nn.Parameter(torch.randn(dim // 2))
        self.linear = nn.Linear(dim, dim)

    def forward(self, t):
        # t: (N,) timestamps
        t = t.unsqueeze(-1)
        enc = torch.cat([
            torch.sin(t * self.freq + self.phase),
            torch.cos(t * self.freq + self.phase),
        ], dim=-1)
        return self.linear(enc)

Learnable encodings discover task-relevant temporal patterns. The model might learn that 'end of billing cycle' matters more than 'day of week.'
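To see that the frequencies really are trainable, here is a minimal usage sketch (restating the module above so it runs standalone): after a backward pass, the frequency parameters have gradients, so an optimizer can adapt them to the dataset's temporal patterns.

```python
import torch
import torch.nn as nn

class LearnableTimeEncoding(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.freq = nn.Parameter(torch.randn(dim // 2))
        self.phase = nn.Parameter(torch.randn(dim // 2))
        self.linear = nn.Linear(dim, dim)

    def forward(self, t):
        t = t.unsqueeze(-1)  # (N, 1)
        enc = torch.cat([
            torch.sin(t * self.freq + self.phase),
            torch.cos(t * self.freq + self.phase),
        ], dim=-1)  # (N, dim)
        return self.linear(enc)

encoder = LearnableTimeEncoding(dim=32)
t = torch.rand(8)          # 8 normalized event timestamps
out = encoder(t)           # (8, 32)

# Gradients flow into the frequency/phase parameters
out.sum().backward()
```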

Relative time encoding

For many relational tasks, the time between events matters more than the absolute time of each event. Relative time encoding represents the difference between timestamps:

  • Time since last purchase: 3 days (likely to buy again) vs 90 days (at risk of churn)
  • Time between account creation and first transaction: seconds (suspicious) vs days (normal)
  • Time between consecutive transactions: consistent intervals (salary) vs irregular (fraud risk)
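One minimal way to implement this, assuming events are already sorted per entity, is to encode the log-compressed gap between consecutive timestamps (the class name `RelativeTimeEncoding` and the log1p compression are illustrative choices, not a fixed API):

```python
import torch
import torch.nn as nn

class RelativeTimeEncoding(nn.Module):
    """Encode time *differences* (e.g. seconds since the previous event)."""
    def __init__(self, dim):
        super().__init__()
        self.freq = nn.Parameter(torch.randn(dim // 2))
        self.linear = nn.Linear(dim, dim)

    def forward(self, delta_t):
        # delta_t: (N,) non-negative gaps. Log-compress so the model
        # distinguishes "1 day vs 7 days" as easily as "1 hour vs 7 hours".
        x = torch.log1p(delta_t.float()).unsqueeze(-1)
        enc = torch.cat([torch.sin(x * self.freq),
                         torch.cos(x * self.freq)], dim=-1)
        return self.linear(enc)

# One user's event timestamps in seconds; diff gives inter-event gaps
timestamps = torch.tensor([0.0, 3_600.0, 90_000.0, 7_776_000.0])
gaps = torch.diff(timestamps, prepend=timestamps[:1])  # first gap = 0
enc = RelativeTimeEncoding(dim=16)(gaps)  # (4, 16)
```

The log compression is the important design choice: raw second-scale gaps span many orders of magnitude, and without it the encoding is dominated by the largest gaps.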

Calendar features

In addition to continuous encodings, discrete calendar features capture human-created temporal patterns:

  • Day of week: Shopping behavior differs on weekdays vs weekends
  • Hour of day: Fraud patterns cluster at specific hours
  • Month: Seasonal purchase patterns
  • Is holiday: Spending spikes on holidays
  • Day of month: Payroll patterns (purchases spike on 1st and 15th)

These are typically encoded as learnable embeddings and concatenated with the continuous time encoding to form the final time representation.
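A sketch of that pattern, assuming UTC timestamps (the helper names and the per-feature embedding size are illustrative): extract discrete calendar indices, then look each up in its own learnable embedding table and concatenate.

```python
import datetime
import torch
import torch.nn as nn

def calendar_features(ts):
    """Map Unix timestamps to discrete indices:
    (day-of-week, hour, month, day-of-month), all zero-based."""
    feats = []
    for t in ts.tolist():
        d = datetime.datetime.fromtimestamp(t, tz=datetime.timezone.utc)
        feats.append([d.weekday(), d.hour, d.month - 1, d.day - 1])
    return torch.tensor(feats)  # (N, 4)

class CalendarEmbedding(nn.Module):
    """One learnable embedding table per calendar feature."""
    def __init__(self, dim=8):
        super().__init__()
        self.tables = nn.ModuleList([
            nn.Embedding(n, dim) for n in (7, 24, 12, 31)
        ])

    def forward(self, idx):  # idx: (N, 4)
        return torch.cat(
            [emb(idx[:, i]) for i, emb in enumerate(self.tables)], dim=-1
        )  # (N, 4 * dim)

ts = torch.tensor([1709251200])   # 2024-03-01 00:00 UTC, a Friday
idx = calendar_features(ts)       # [[4, 0, 2, 0]]
emb = CalendarEmbedding(dim=8)(idx)  # (1, 32), ready to concatenate
```

In production you would also account for the user's local time zone; behavior like "late-night purchase" only makes sense in local time.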

Combining time encoding with graph features

Time encodings are incorporated at two levels:

  1. Edge-level: The time encoding of each edge (transaction timestamp, interaction time) is added to or concatenated with the edge's message during message passing.
  2. Node-level: The time encoding of node events (account creation, last login) is added to the node's initial feature vector before the first GNN layer.

Edge-level time encoding is more important because it captures the timing of relationships, which is the primary temporal signal in relational data.
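The edge-level case can be sketched in plain PyTorch (no torch_geometric dependency; the layer name `TimeAwareConv` and the mean aggregation are illustrative choices): each edge's message is the source node's features concatenated with that edge's time encoding, then messages are averaged at the destination node.

```python
import math
import torch
import torch.nn as nn

def sinusoidal_time_encoding(timestamps, dim):
    t = timestamps.unsqueeze(-1).float()
    freqs = torch.exp(
        torch.arange(0, dim, 2).float() * -(math.log(10000.0) / dim)
    )
    enc = torch.zeros(len(timestamps), dim)
    enc[:, 0::2] = torch.sin(t * freqs)
    enc[:, 1::2] = torch.cos(t * freqs)
    return enc

class TimeAwareConv(nn.Module):
    """One message-passing step with per-edge time encodings."""
    def __init__(self, node_dim, time_dim, out_dim):
        super().__init__()
        self.msg = nn.Linear(node_dim + time_dim, out_dim)
        self.time_dim = time_dim

    def forward(self, x, edge_index, edge_time):
        src, dst = edge_index                                # (E,), (E,)
        t_enc = sinusoidal_time_encoding(edge_time, self.time_dim)
        m = self.msg(torch.cat([x[src], t_enc], dim=-1))     # (E, out_dim)
        # Mean-aggregate messages at each destination node
        out = torch.zeros(x.size(0), m.size(1))
        out.index_add_(0, dst, m)
        deg = torch.zeros(x.size(0)).index_add_(0, dst, torch.ones(len(dst)))
        return out / deg.clamp(min=1).unsqueeze(-1)

x = torch.randn(4, 16)                         # 4 nodes
edge_index = torch.tensor([[0, 1, 2], [3, 3, 0]])
edge_time = torch.tensor([10.0, 20.0, 30.0])   # one timestamp per edge
out = TimeAwareConv(16, 8, 32)(x, edge_index, edge_time)  # (4, 32)
```

Because the time encoding enters the message itself, the model can weight an edge differently depending on when the interaction happened, even between the same pair of nodes.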

Frequently asked questions

What is time encoding in graph neural networks?

Time encoding converts raw timestamps into dense vector representations that GNNs can learn from. Instead of passing a Unix timestamp as a single number, time encoding produces a d-dimensional vector that captures temporal patterns like periodicity (day of week, month), recency (how recent an event is), and relative ordering (which events came first).

Why not just use raw timestamps as features?

Raw timestamps are poor features: they are monotonically increasing, have arbitrary scale, and do not capture periodicity. A timestamp of 1709251200 (March 1, 2024) tells a neural network nothing about 'Friday' or 'end of month.' Time encodings decompose timestamps into meaningful temporal signals the model can use.

What is the difference between absolute and relative time encoding?

Absolute time encoding represents each timestamp independently (e.g., 'March 1, 2024, 3pm'). Relative time encoding represents the time difference between events (e.g., '5 days after last purchase'). Relative encodings are often more useful because temporal patterns in relational data are typically relative: 'purchased again within 7 days' matters more than the absolute date.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.