Time encoding converts raw timestamps into dense vector representations that graph neural networks can learn from. Instead of feeding a Unix timestamp (a single large number) into the model, time encoding produces a d-dimensional vector that captures the temporal structure: what day of the week, how recent, how far apart from other events, and what periodic patterns are present.
Why raw timestamps fail
A Unix timestamp like 1709251200 tells a neural network almost nothing useful. It is:
- Monotonically increasing: A raw timestamp only encodes ordering, so the model cannot learn that Friday and Monday are similar (both weekdays) while Friday and Saturday are different.
- Arbitrarily scaled: The difference between two timestamps in seconds has no natural relationship to the patterns you want to capture.
- Not periodic: Weekly, monthly, and seasonal patterns are invisible in a linear number.
Sinusoidal time encoding
Borrowed from transformer positional encoding, sinusoidal time encoding uses sine and cosine functions at different frequencies to represent time:
```python
import math

import torch

def sinusoidal_time_encoding(timestamps, dim=64):
    """Fixed sinusoidal time encoding (dim must be even)."""
    # timestamps: (N,) tensor of Unix timestamps
    t = timestamps.unsqueeze(-1).float()  # (N, 1)
    # Geometrically spaced frequencies capture different periodicities
    freqs = torch.exp(
        torch.arange(0, dim, 2).float() * -(math.log(10000.0) / dim)
    )  # (dim/2,)
    # Sinusoidal encoding
    enc = torch.zeros(len(timestamps), dim)
    enc[:, 0::2] = torch.sin(t * freqs)  # even dimensions
    enc[:, 1::2] = torch.cos(t * freqs)  # odd dimensions
    return enc

# Low frequencies capture long periods (seasons, years);
# high frequencies capture short periods (hours, days).
```

Sinusoidal encoding is parameter-free and captures multi-scale periodicity. Low-frequency components encode seasons; high-frequency components encode hours.
Learnable time encoding
Instead of fixed sine/cosine, learn the frequency and phase parameters from data. This adapts to the specific temporal patterns in your dataset:
```python
import torch
import torch.nn as nn

class LearnableTimeEncoding(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Learned frequency and phase for each sin/cos pair
        self.freq = nn.Parameter(torch.randn(dim // 2))
        self.phase = nn.Parameter(torch.randn(dim // 2))
        self.linear = nn.Linear(dim, dim)

    def forward(self, t):
        # t: (N,) timestamps
        t = t.unsqueeze(-1).float()
        enc = torch.cat([
            torch.sin(t * self.freq + self.phase),
            torch.cos(t * self.freq + self.phase),
        ], dim=-1)
        return self.linear(enc)
```

Learnable encodings discover task-relevant temporal patterns. The model might learn that 'end of billing cycle' matters more than 'day of week.'
Relative time encoding
For many relational tasks, the time between events matters more than the absolute time of each event. Relative time encoding represents the difference between timestamps:
- Time since last purchase: 3 days (likely to buy again) vs 90 days (at risk of churn)
- Time between account creation and first transaction: seconds (suspicious) vs days (normal)
- Time between consecutive transactions: consistent intervals (salary) vs irregular (fraud risk)
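One simple way to realize this idea is to encode the time delta instead of the absolute timestamp. The sketch below is an illustration, not a canonical recipe: the function name `relative_time_encoding` and the choice of log-compressing the delta (to tame the seconds-to-months dynamic range) before reusing the sinusoidal scheme are our assumptions.

```python
import math

import torch

def relative_time_encoding(event_times, ref_time, dim=64):
    """Encode the elapsed time (ref_time - event_times) in seconds.

    Hypothetical sketch: log1p compresses the huge dynamic range of
    deltas (seconds up to months) before the sinusoidal projection.
    """
    delta = (ref_time - event_times).float().clamp(min=0)  # (N,)
    t = torch.log1p(delta).unsqueeze(-1)  # (N, 1), log-compressed
    freqs = torch.exp(
        torch.arange(0, dim, 2).float() * -(math.log(10000.0) / dim)
    )  # (dim/2,)
    enc = torch.zeros(len(event_times), dim)
    enc[:, 0::2] = torch.sin(t * freqs)
    enc[:, 1::2] = torch.cos(t * freqs)
    return enc
```

Because the encoding depends only on the delta, two purchases three days apart produce the same vector whether they happened last week or last year, which is exactly the invariance relative encoding is meant to provide.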
Calendar features
In addition to continuous encodings, discrete calendar features capture human-created temporal patterns:
- Day of week: Shopping behavior differs on weekdays vs weekends
- Hour of day: Fraud patterns cluster at specific hours
- Month: Seasonal purchase patterns
- Is holiday: Spending spikes on holidays
- Day of month: Payroll patterns (purchases spike on 1st and 15th)
These are typically encoded as learnable embeddings and concatenated with the continuous time encoding to form the final time representation.
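A minimal sketch of that pipeline, assuming UTC timestamps and one embedding table per calendar field (the names `calendar_features` and `CalendarEmbedding` are ours):

```python
import datetime

import torch
import torch.nn as nn

def calendar_features(timestamps):
    """Extract discrete calendar indices from Unix timestamps (UTC)."""
    feats = []
    for ts in timestamps.tolist():
        dt = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
        # day-of-week (Mon=0), hour, month (0-based), day-of-month (0-based)
        feats.append([dt.weekday(), dt.hour, dt.month - 1, dt.day - 1])
    return torch.tensor(feats)  # (N, 4) integer indices

class CalendarEmbedding(nn.Module):
    """One learnable embedding table per calendar field, concatenated."""
    def __init__(self, dim=8):
        super().__init__()
        self.dow = nn.Embedding(7, dim)
        self.hour = nn.Embedding(24, dim)
        self.month = nn.Embedding(12, dim)
        self.dom = nn.Embedding(31, dim)

    def forward(self, idx):  # idx: (N, 4) from calendar_features
        return torch.cat([
            self.dow(idx[:, 0]), self.hour(idx[:, 1]),
            self.month(idx[:, 2]), self.dom(idx[:, 3]),
        ], dim=-1)  # (N, 4 * dim)
```

The resulting (N, 4·dim) calendar vector would then be concatenated with the continuous time encoding to form the final time representation. In a production system you would also want to decide which timezone the calendar fields should reflect; UTC is just the simplest default.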
Combining time encoding with graph features
Time encodings are incorporated at two levels:
- Edge-level: The time encoding of each edge (transaction timestamp, interaction time) is added to or concatenated with the edge's message during message passing.
- Node-level: The time encoding of node events (account creation, last login) is added to the node's initial feature vector before the first GNN layer.
Edge-level time encoding is typically the more important of the two, because it captures the timing of relationships, which is the primary temporal signal in relational data.
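The edge-level pattern can be sketched as a message function that concatenates the edge's time encoding with the source node's state before the message MLP. This is an illustrative sketch, not a reference implementation; the class name `TimeAwareMessage` and the concatenate-then-MLP design are our assumptions (adding the time encoding instead of concatenating is an equally common choice).

```python
import torch
import torch.nn as nn

class TimeAwareMessage(nn.Module):
    """Hypothetical message function for a temporal GNN layer:
    the edge time encoding is concatenated with the source node
    state, then projected by a small MLP."""
    def __init__(self, node_dim, time_dim, out_dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(node_dim + time_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, src_h, edge_time_enc):
        # src_h: (E, node_dim) source-node states, one per edge
        # edge_time_enc: (E, time_dim) encoding of each edge's timestamp
        return self.mlp(torch.cat([src_h, edge_time_enc], dim=-1))
```

Node-level time encodings slot in even more simply: they are concatenated with (or added to) the node's initial feature vector before the first GNN layer, so no change to the message function is needed.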