
The Cold-Start Problem: Making Predictions for Nodes with No History

A new user signs up. No purchase history. No browsing data. No reviews. Traditional ML cannot score them. GNNs can, because they compute embeddings from features and graph connectivity, not from historical lookup tables.


TL;DR

  • Cold start: new entities (users, products, accounts) with no interaction history cannot be scored by models that rely on historical features or learned lookup vectors.
  • GNNs solve cold start structurally: a new node's embedding is computed from its attribute features and its initial graph connections through message passing. No historical data needed.
  • Shallow embeddings (Node2Vec, matrix factorization) fail on cold start because they are transductive: each node needs a pre-trained vector. New nodes have no vector.
  • In enterprise applications, cold start is continuous: new customers, new products, and new accounts arrive every day. GNNs handle this without retraining; shallow methods require periodic re-embedding.
  • The quality of cold-start predictions depends on the information available: more attribute features and more initial connections produce better embeddings. Even minimal information (category, referral source) provides useful signal.

The cold-start problem is the inability to make predictions for entities with no historical data. A new user on an e-commerce platform has no purchase history, no browsing sessions, no reviews. Collaborative filtering cannot recommend products (no learned user vector). A tabular churn model cannot score them (no historical features to compute). A fraud model cannot assess their risk (no transaction patterns).

GNNs address cold start fundamentally differently from traditional approaches: they compute embeddings from what a node is (features) and who it connects to (graph structure), not from what it has done (history).

Why traditional methods fail

Collaborative filtering

Matrix factorization learns a latent vector for each user and each item by factorizing the interaction matrix. User-item affinity is the dot product of their vectors. A new user has no vector because they have no interactions to learn from. The model literally has no representation for them.
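The failure mode is easy to see in code. Here is a minimal sketch of matrix-factorization scoring; the user names, item names, and two-dimensional vectors are purely illustrative:

```python
# Learned vectors exist only for users and items seen during training.
user_vecs = {"alice": [0.9, 0.1], "bob": [0.2, 0.8]}       # learned offline
item_vecs = {"headphones": [0.8, 0.3], "novel": [0.1, 0.9]}

def score(user, item):
    u = user_vecs.get(user)  # a lookup, not a computation
    if u is None:
        raise KeyError(f"no learned vector for user {user!r}")
    v = item_vecs[item]
    return sum(a * b for a, b in zip(u, v))

print(score("alice", "headphones"))  # ~0.75: both vectors were learned offline
# score("carol", "headphones")       # raises KeyError: cold-start user, no vector
```

The model has nothing to multiply for a brand-new user; the only recourse is a heuristic fallback outside the model.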

Shallow graph embeddings

Node2Vec and DeepWalk learn a fixed vector per node from random walks. New nodes were not in the training graph, so they have no embedding. Retraining on the updated graph is required, but by the time retraining finishes, more new nodes have arrived.
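The same lookup failure, viewed as an embedding table: random-walk training produces one fixed row per node that existed in the training graph, so a node added afterwards simply has no row. A toy sketch (the table size, dimension, and values are illustrative):

```python
import numpy as np

# Node2Vec/DeepWalk output: one fixed row per training-graph node.
num_train_nodes = 1000
emb = np.random.default_rng(2).normal(size=(num_train_nodes, 16))

def get_embedding(node_id):
    return emb[node_id]  # pure table lookup by node index

vec = get_embedding(42)  # fine: node 42 was in the training graph
# get_embedding(1000)    # raises IndexError: a node added after training has no row
```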

Feature-based tabular models

A churn model with features like “number of orders in last 30 days” and “average session duration” produces all-zero features for a new user. The model defaults to the population base rate, providing no personalization.
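A minimal logistic-regression churn scorer makes the base-rate fallback concrete. The feature names, weights, and bias below are hypothetical:

```python
import math

# Hypothetical trained weights for a logistic churn model.
weights = {"orders_30d": -0.4, "avg_session_min": -0.1}
bias = -1.2  # roughly encodes the population base rate

def churn_prob(features):
    z = bias + sum(weights[k] * features.get(k, 0.0) for k in weights)
    return 1 / (1 + math.exp(-z))

print(churn_prob({"orders_30d": 5, "avg_session_min": 12}))  # personalized score
print(churn_prob({}))  # new user: all-zero features -> sigmoid(bias), the base rate
```

With all features at zero, every new user gets the same score regardless of who they are.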

How GNNs solve cold start

GNNs are inductive: they compute embeddings from features and structure, not from lookup tables. A new node can be embedded immediately if it has:

1. Attribute features

Even new entities have attributes. A new user has age, location, device type, and registration channel. A new product has title, category, price, and brand. These attributes are encoded as the node's initial feature vector, providing a starting representation.

2. Initial connections

A new user who signed up via a referral link is connected to the referrer. A new user who browsed the “electronics” category is connected to that category node. A new product in the “shoes” category is connected to the category node and the brand node.

These initial connections, even if sparse, provide graph context. Through message passing, the new node receives information from its connected nodes, which have rich historical embeddings.
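This message-passing step can be sketched with a single GraphSAGE-style layer: mean-aggregate the states of the new node's initial neighbors, then combine them with its own attribute features. All weights, neighbor states, and feature values below are random stand-ins for a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for trained layer weights.
W_self = rng.normal(size=(4, 8))   # maps raw attribute features -> hidden
W_neigh = rng.normal(size=(8, 8))  # maps aggregated neighbor state -> hidden

# Existing nodes already carry rich hidden states; the new node has only raw features.
neighbor_states = {"referrer": rng.normal(size=8), "electronics": rng.normal(size=8)}
new_node_features = np.array([0.0, 1.0, 0.3, 1.0])  # e.g. device, region, age bucket, channel

def embed_new_node(x, neighbor_ids):
    # Mean-aggregate messages from the new node's initial connections...
    agg = np.mean([neighbor_states[n] for n in neighbor_ids], axis=0)
    # ...and combine them with the node's own attribute features.
    return np.tanh(x @ W_self + agg @ W_neigh)

h_new = embed_new_node(new_node_features, ["referrer", "electronics"])
# h_new is a usable embedding computed on the fly: no retraining, no lookup table.
```

The key property is that the embedding is a function of inputs available at signup time, so it exists the moment the node does.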

3. Type information

In heterogeneous graphs, node type provides immediate context. A node typed “premium account” starts from a different initial representation than one typed “free tier.” Type-specific encodings provide a useful prior before any interaction data exists.
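One way to realize type-specific priors is per-type input projections, a pattern used in heterogeneous GNNs. The type names and weights below are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-type input encoders, as in a heterogeneous GNN.
type_encoders = {
    "premium_account": rng.normal(size=(3, 8)),
    "free_tier": rng.normal(size=(3, 8)),
}

def initial_state(node_type, features):
    # The same raw features yield a type-specific initial representation.
    return features @ type_encoders[node_type]

x = np.array([1.0, 0.0, 0.5])
h_premium = initial_state("premium_account", x)
h_free = initial_state("free_tier", x)
# Different priors for the two account types, before any interaction data exists.
```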

Cold start quality depends on available signal

Not all cold starts are equal. The prediction quality depends on what information the new node brings:

  • Rich features + connections: new user with demographic data, referred by an active user, browsed several categories. High-quality embedding.
  • Features only: new user with demographic data but no connections yet. Moderate quality; the GNN uses features but cannot leverage graph context.
  • Connections only: new product with no description but assigned to a category. Moderate quality; the GNN inherits category context.
  • Minimal information: anonymous user, no features, no connections. Lowest quality; the model falls back to population-level priors.

Enterprise impact

In enterprise applications, cold start is not an edge case. It is continuous:

  • E-commerce: thousands of new products listed daily, millions of new user sessions weekly
  • Financial services: new accounts opened hourly, new merchants onboarding daily
  • Healthcare: new patients admitted, new drugs prescribed, new diagnoses recorded

Systems that handle cold start poorly lose value on every new entity until enough history accumulates. GNN-based systems provide useful predictions from day one.

Frequently asked questions

What is the cold-start problem?

The cold-start problem occurs when a system needs to make predictions for entities with no historical data. A new user with no purchase history cannot receive collaborative filtering recommendations. A new product with no reviews cannot be scored for relevance. A new account with no transaction history cannot be assessed for fraud risk. Traditional ML models require historical features that cold-start entities lack.

How do GNNs solve the cold-start problem?

GNNs compute embeddings from features and graph connectivity, not from historical lookup tables. A new user with demographic features (age, location) and initial connections (signed up via referral, browsed 3 categories) can be embedded immediately. The GNN propagates information from the user's initial connections to generate a meaningful representation. No historical data required.

Why can't collaborative filtering handle cold start?

Collaborative filtering learns latent vectors for each user and item from interaction history. New users and items have no interaction history, so they have no latent vector. You cannot compute their dot product with candidate items. Heuristics (recommend popular items, use demographic averages) are the only option. GNNs avoid this because embeddings are computed from features, not looked up from history.

What information do cold-start nodes have?

Even with no interaction history, cold-start nodes typically have: (1) attribute features (demographics, product specifications, account settings), (2) initial connections (referral source, registration channel, category preferences), (3) type information (user type, product category, account tier). GNNs use all of these to generate embeddings through message passing from connected nodes.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.