
Feature Propagation: Spreading Known Features to Nodes with Missing Data

Feature propagation fills in missing node features by diffusing known values across graph edges. It is a graph-aware preprocessing step that handles incomplete data using the connectivity structure of the graph itself.


TL;DR

  • Feature propagation spreads known node features across edges to impute missing values. If 80% of nodes have features, the other 20% get features based on their neighbors.
  • It works by iterative diffusion: missing-feature nodes average their neighbors' features each step. Known-feature nodes stay anchored. Convergence is typically fast (10-50 iterations).
  • This is a preprocessing step, not a replacement for GNNs. Run feature propagation first to complete the feature matrix, then train your GNN on the complete data.
  • Enterprise data is rarely complete: new customers have no history, some records lack fields, and different data sources have different coverage. Feature propagation handles this gracefully.
  • It beats mean imputation because it uses graph structure. A new customer connected to premium buyers gets premium-like features, not the global average.

Feature propagation spreads known features to feature-poor nodes using graph structure. In real-world graphs, many nodes have incomplete or missing features. A new customer has no purchase history. A recently added product has no reviews. A newly discovered entity in a knowledge graph has no attributes. Feature propagation fills in these gaps by diffusing information from feature-rich neighbors to feature-poor nodes.

This is the graph-structured equivalent of data imputation, but instead of using column-level statistics (mean, median), it uses the local graph neighborhood. A new customer connected to high-spending customers gets high-spending-like features, not the global average.
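The difference is easy to see on a toy graph. The numbers below are illustrative: node 3's feature is missing, its two neighbors are high spenders, and a single propagation step already places it near them rather than at the global mean.

```python
import torch

# Toy illustration: node 3's feature is missing. Its neighbors (0 and 1)
# are high spenders; node 2 is a low spender elsewhere in the graph.
x = torch.tensor([[100.0], [120.0], [5.0], [0.0]])
known = torch.tensor([True, True, True, False])

neighbor_avg = x[[0, 1]].mean()   # one propagation step for node 3
global_mean = x[known].mean()     # what column-wise imputation would use

print(neighbor_avg.item())  # 110.0 - matches node 3's neighborhood
print(global_mean.item())   # 75.0  - ignores graph structure
```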

How it works

Feature propagation is iterative:

  1. Initialize: nodes with known features keep their values. Nodes with missing features start at zero (or random).
  2. Propagate: each missing-feature node takes the weighted average of its neighbors' current features.
  3. Anchor: known-feature nodes reset to their original values (they are not overwritten).
  4. Repeat: iterate steps 2-3 until convergence (typically 10-50 iterations).
feature_propagation.py
import torch
from torch_geometric.utils import add_self_loops, degree

def feature_propagation(x, edge_index, known_mask, num_iterations=40):
    """
    Propagate known features to nodes with missing features.

    x: [num_nodes, num_features] - feature matrix (zeros for missing)
    edge_index: [2, num_edges] - graph connectivity
    known_mask: [num_nodes] bool - True for nodes with known features
    """
    original_features = x.clone()

    # Self-loops let each node keep a share of its current value each step
    edge_index, _ = add_self_loops(edge_index)
    row, col = edge_index
    deg = degree(col, x.size(0))
    deg_inv = 1.0 / deg.clamp(min=1)

    for _ in range(num_iterations):
        # Aggregate neighbor features (mean over incoming edges)
        out = torch.zeros_like(x)
        out.index_add_(0, col, x[row])
        out = out * deg_inv.unsqueeze(-1)

        # Anchor: keep known features, update only missing ones
        x = torch.where(known_mask.unsqueeze(-1), original_features, out)

    return x

# 1000 nodes, 80% have features, 20% missing (zeroed out)
known_mask = torch.rand(1000) > 0.2
x = torch.randn(1000, 16)
x[~known_mask] = 0.0
edge_index = torch.randint(0, 1000, (2, 5000))
x_imputed = feature_propagation(x, edge_index, known_mask)

Simple feature propagation: average neighbors, anchor known values, repeat. Missing nodes converge to locally consistent values.
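If you would rather stop when the values settle than run a fixed number of iterations, one option is tolerance-based stopping. The sketch below uses plain PyTorch (no torch_geometric helpers) so it runs standalone; the `tol` and `max_iters` values are illustrative choices, not recommendations from the library.

```python
import torch

def propagate_until_converged(x, edge_index, known_mask, tol=1e-4, max_iters=100):
    """Run mean-aggregation diffusion until updates fall below `tol`.

    A sketch of tolerance-based stopping as an alternative to a fixed
    iteration count. Returns the imputed features and iterations used.
    """
    original = x.clone()
    row, col = edge_index
    # In-degree of each node, computed with plain index_add_
    deg = torch.zeros(x.size(0)).index_add_(0, col, torch.ones(col.size(0)))
    deg_inv = 1.0 / deg.clamp(min=1)

    for i in range(max_iters):
        out = torch.zeros_like(x).index_add_(0, col, x[row])
        out = out * deg_inv.unsqueeze(-1)
        x_new = torch.where(known_mask.unsqueeze(-1), original, out)
        if (x_new - x).abs().max() < tol:
            return x_new, i + 1
        x = x_new
    return x, max_iters
```

On small graphs the loop often exits long before `max_iters`, since each step shrinks the remaining change geometrically.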

Enterprise example: cold-start customers

An e-commerce platform has 1 million customers. 800,000 have complete profiles (age, spending patterns, product preferences). 200,000 are new with minimal data. The customer-product interaction graph connects customers to their purchased items.

Feature propagation for new customers:

  • New customer Alice bought 2 products
  • Those products were also bought by 50 other customers with full profiles
  • Feature propagation averages those 50 customers' profiles
  • Alice gets imputed features reflecting her purchase-based cohort

This is more informative than zero-filling or mean imputation because it places Alice in the right neighborhood of customer space based on her behavior, not demographics.
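The Alice scenario can be sketched on a tiny bipartite graph. This is a hypothetical setup with made-up spend values: nodes 0-2 are customers with known features, node 3 is Alice, and nodes 4-5 are products with no features of their own. Products act as two-hop bridges, so Alice's imputed value converges to the mean of her co-purchasers, not the global mean.

```python
import torch

# Hypothetical cold-start setup: customers 0-2 have known spend features,
# Alice is node 3 (missing), products are nodes 4-5 (also missing).
x = torch.tensor([[100.0], [120.0], [5.0], [0.0], [0.0], [0.0]])
known = torch.tensor([True, True, True, False, False, False])
# Alice (3) and customers 0, 1 bought product 4; customer 2 bought product 5.
src = torch.tensor([0, 4, 1, 4, 3, 4, 2, 5])
dst = torch.tensor([4, 0, 4, 1, 4, 3, 5, 2])

deg_inv = 1.0 / torch.zeros(6).index_add_(0, dst, torch.ones(8)).clamp(min=1)
original = x.clone()
for _ in range(50):
    # Mean-aggregate over incoming edges, then re-anchor known nodes
    out = torch.zeros_like(x).index_add_(0, dst, x[src]) * deg_inv.unsqueeze(-1)
    x = torch.where(known.unsqueeze(-1), original, out)

print(x[3].item())  # ~110: mean of Alice's co-purchasers, not the global 75
```

Alice ends up near 110 (the average of customers 0 and 1, who share her product), even though the global customer mean is 75.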

When to use feature propagation

  • Cold start: new entities with no history (new users, new products)
  • Incomplete records: some data sources have gaps (missing demographics, missing attributes)
  • Data integration: merging datasets with different column coverage
  • Dynamic graphs: newly added nodes need features before the GNN runs

Frequently asked questions

What is feature propagation?

Feature propagation spreads known node features across graph edges to fill in missing values. If 80% of nodes have features and 20% do not, feature propagation iteratively diffuses known features through the graph structure, imputing values for nodes with missing data based on their neighborhood.

How does feature propagation work?

It is iterative diffusion: at each step, nodes with missing features take the weighted average of their neighbors' features. Nodes with known features keep their original values (they are anchored). After several iterations, missing features converge to values consistent with the local graph structure.

When should I use feature propagation?

Use it when your graph has incomplete node features: some nodes have full feature vectors, others have partial or no features. Common in enterprise data where some records are incomplete, some entities are newly added (cold start), or different data sources have different coverage.

What is the difference between feature propagation and label propagation?

Feature propagation imputes missing continuous features (like age, income, or embedding vectors). Label propagation imputes missing discrete labels (like fraud/not-fraud, category). Both use the same diffusion mechanism but operate on different data types and serve different purposes.
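To make the contrast concrete, here is a minimal label-propagation sketch: the same diffusion loop, but run on one-hot class vectors, with a discrete label read off at the end via argmax. The graph and labels are made up for illustration.

```python
import torch

# Sketch: label propagation reuses the diffusion mechanism, but on
# one-hot class vectors. Node 2 is unlabeled; its neighbors 0 and 1
# both carry class 1, so it inherits class 1.
num_classes = 2
labels = torch.tensor([1, 1, -1])  # -1 marks an unknown label
known = labels >= 0
y = torch.zeros(3, num_classes)
y[known] = torch.eye(num_classes)[labels[known]]

src = torch.tensor([0, 2, 1, 2])
dst = torch.tensor([2, 0, 2, 1])
deg_inv = 1.0 / torch.zeros(3).index_add_(0, dst, torch.ones(4)).clamp(min=1)
anchor = y.clone()
for _ in range(10):
    out = torch.zeros_like(y).index_add_(0, dst, y[src]) * deg_inv.unsqueeze(-1)
    y = torch.where(known.unsqueeze(-1), anchor, out)

pred = y.argmax(dim=-1)
print(pred[2].item())  # 1: discrete label read off the diffused distribution
```

The only real differences from feature propagation are the one-hot encoding going in and the argmax coming out.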

Does feature propagation replace GNNs?

No. Feature propagation is a preprocessing step that handles missing data. It runs before the GNN to ensure all nodes have features. The GNN then learns from the complete feature matrix. Feature propagation is to graphs what mean imputation is to tabular data, but it uses graph structure instead of column statistics.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.