Every concept in graph machine learning, explained with enterprise data examples. From message passing to graph transformers, from over-smoothing to relational deep learning.
120+ concepts · Agent-optimized · Enterprise examples on every page
The fundamental operations that make GNNs work. How information flows between nodes, how neighbors are aggregated, and how graph structure drives learning.
The foundation of all GNNs. Nodes send and receive messages along edges.
How a node combines information from its neighbors into a single vector.
Spectral and spatial filtering that generalizes CNNs to irregular graphs.
Learning which neighbors matter more through trainable attention weights.
Reducing node sets to graph-level representations for classification tasks.
Aggregating all node features into a single graph-level embedding.
Progressively coarsening graphs layer by layer, preserving structure.
Incorporating edge attributes into message passing for richer representations.
Residual links that prevent over-smoothing in deep GNN architectures.
Spreading node features through the graph before or during training.
Semi-supervised technique that spreads known labels through graph structure.
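The ideas above can be sketched in a few lines. Below is a minimal, illustrative round of message passing with mean aggregation on a toy graph; the graph, feature values, and function name are invented for illustration, not from any real GNN library.

```python
# Minimal sketch of one message-passing round with mean aggregation.
# Graph and feature values are toy examples.

def mean_aggregate(features, adjacency):
    """For each node, average its neighbors' feature vectors."""
    out = {}
    for node, neighbors in adjacency.items():
        if not neighbors:
            out[node] = features[node]  # isolated node keeps its own features
            continue
        dim = len(features[node])
        agg = [0.0] * dim
        for n in neighbors:
            for i in range(dim):
                agg[i] += features[n][i]
        out[node] = [v / len(neighbors) for v in agg]
    return out

adjacency = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
features = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [4.0, 2.0]}
result = mean_aggregate(features, adjacency)
print(result["a"])  # node "a" receives the mean of b and c: [2.0, 2.0]
```

A real GNN layer would follow this aggregation with a learned transformation and nonlinearity; the aggregation step itself is the part that is unique to graphs.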
Methods for representing nodes and graphs as dense vectors. From classical random walk approaches to modern positional encodings.
Mapping each node to a low-dimensional vector that captures its role in the graph.
Biased random walks that balance between BFS and DFS for flexible embeddings.
Stochastic traversals that capture local and global graph structure.
When lookup tables beat neural networks, and when they don't.
Giving GNNs a sense of where nodes sit in the overall graph structure.
Capturing local structural roles like hubs, bridges, and periphery nodes.
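Random-walk embeddings start with the walks themselves. A hedged sketch of uniform walk generation, the raw material DeepWalk-style methods feed into a skip-gram model (the graph and function names here are illustrative):

```python
import random

# Illustrative: generate fixed-length uniform random walks over a toy graph.
# node2vec would bias the neighbor choice; here every neighbor is equally likely.

def random_walk(adjacency, start, length, rng):
    walk = [start]
    for _ in range(length - 1):
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk

adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
rng = random.Random(0)
walks = [random_walk(adjacency, node, 5, rng) for node in adjacency]
print(walks)
```

Each walk is then treated like a sentence: nodes that co-occur in walks end up with nearby embedding vectors.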
Not all graphs are created equal. Enterprise data involves multiple node types, temporal edges, directed relationships, and more.
Single node type, single edge type. The simplest graph structure.
Multiple node and edge types. How real enterprise data actually looks.
Edges and nodes that change over time. Critical for fraud and churn.
Two disjoint node sets with edges only between them. Users and products.
Entity-relation-entity triples encoding structured world knowledge.
Graphs where topology evolves. New nodes appear, edges form and dissolve.
Asymmetric edges where direction carries meaning. Payments, follows, citations.
Positive and negative edges representing trust/distrust or agree/disagree.
Edges that connect more than two nodes. Group interactions and co-authorship.
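One way to make the heterogeneous-plus-temporal idea concrete: group edges by (source type, relation, destination type) and stamp each with a time. The node types, relation names, and timestamps below are invented for illustration.

```python
from collections import defaultdict

# Minimal heterogeneous, temporal edge store:
# edges keyed by (source type, relation, destination type), each with a timestamp.

edges = defaultdict(list)

def add_edge(edges, src_type, relation, dst_type, src, dst, t):
    edges[(src_type, relation, dst_type)].append((src, dst, t))

add_edge(edges, "user", "purchased", "product", "u1", "p9", t=100)
add_edge(edges, "user", "purchased", "product", "u2", "p9", t=130)
add_edge(edges, "user", "follows", "user", "u1", "u2", t=90)

# Edges visible at time t=120 for one relation (temporal filtering):
visible = [(s, d) for s, d, t in edges[("user", "purchased", "product")] if t <= 120]
print(visible)  # [('u1', 'p9')]
```

Filtering by timestamp like this is exactly what temporal models must do at every step to avoid peeking at future edges.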
The prediction problems you can solve with graph ML. Each maps directly to business outcomes like fraud scores, product recommendations, and customer segments.
Predicting a label for each node. Fraud/not-fraud, churn/retain.
Predicting whether an edge will form. Recommendations, knowledge graph completion.
Classifying entire graphs. Molecular toxicity, document categorization.
Predicting a continuous value per node. Customer LTV, credit score.
Finding clusters of densely connected nodes. Customer segments, fraud rings.
Creating new graphs that match a learned distribution. Drug design, network synthesis.
Determining which nodes refer to the same real-world entity.
Testing whether two graphs have identical structure. Expressiveness benchmark.
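The isomorphism test above has a classic approximate form: 1-Weisfeiler-Lehman color refinement. A hedged sketch follows; if two graphs end with different color histograms they are provably non-isomorphic, while equal histograms do not prove isomorphism. Graphs and names are illustrative.

```python
# 1-WL color refinement, run jointly over several graphs so colors are comparable.

def wl_colors(graphs, rounds=3):
    # graphs: list of adjacency dicts with globally unique node names
    adj = {node: nbrs for g in graphs for node, nbrs in g.items()}
    colors = {node: 0 for node in adj}
    for _ in range(rounds):
        # each node's new color = hash of (own color, multiset of neighbor colors)
        sigs = {n: (colors[n], tuple(sorted(colors[m] for m in adj[n]))) for n in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {n: palette[sigs[n]] for n in adj}
    return colors

def histogram(g, colors):
    h = {}
    for n in g:
        h[colors[n]] = h.get(colors[n], 0) + 1
    return h

# Triangle vs. 3-node path: degree patterns differ, so WL tells them apart.
tri = {"t1": ["t2", "t3"], "t2": ["t1", "t3"], "t3": ["t1", "t2"]}
path = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2"]}
colors = wl_colors([tri, path])
print(histogram(tri, colors) != histogram(path, colors))  # True
```

This same refinement loop is the upper bound on what standard message-passing GNNs can distinguish, which is why it appears again in the challenges section.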
Model designs beyond basic message passing. Graph transformers, autoencoders, and specialized architectures for different data types.
Full attention over graph nodes. Long-range dependencies without over-squashing.
Transformer architecture designed for multi-relation heterogeneous graphs.
Learning neighbor importance through attention coefficients.
Separate encoders for users and items, combined via dot product. Scalable recommendations.
Encode graph structure into latent space and reconstruct. Unsupervised graph learning.
Learning mappings between function spaces on graphs for physics simulations.
Architectures that respect geometric symmetries. Rotations, translations, reflections.
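The two-tower pattern is simple enough to sketch directly: separate encoders map users and items into the same space, and a dot product scores the pair. The weights below are fixed toy values, not trained parameters, and all names are illustrative.

```python
# Two-tower scoring sketch: encode user and item separately, score via dot product.

def encode(features, weights):
    # one linear layer: out[j] = sum_i features[i] * weights[i][j]
    dim_out = len(weights[0])
    return [sum(f * weights[i][j] for i, f in enumerate(features))
            for j in range(dim_out)]

def score(user_feats, item_feats, user_w, item_w):
    u = encode(user_feats, user_w)
    v = encode(item_feats, item_w)
    return sum(a * b for a, b in zip(u, v))

user_w = [[1.0, 0.0], [0.0, 1.0]]   # identity weights, for illustration only
item_w = [[1.0, 0.0], [0.0, 1.0]]
print(score([1.0, 2.0], [3.0, 1.0], user_w, item_w))  # 1*3 + 2*1 = 5.0
```

The design choice that makes this scalable: item embeddings depend only on the item tower, so they can be precomputed and served from an approximate-nearest-neighbor index.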
How to train GNNs effectively. Self-supervised pre-training, contrastive methods, temporal sampling strategies, and avoiding data leakage.
Learning representations from graph structure without labeled data.
Learning by pulling similar pairs together and pushing dissimilar pairs apart.
Edge dropping, feature masking, and subgraph sampling for robust training.
Applying knowledge from one graph domain to another.
Large-scale unsupervised training before task-specific fine-tuning.
Adapting a pre-trained model to a specific downstream task with limited labels.
Making predictions with only a handful of labeled examples.
Predicting on unseen classes or domains without any task-specific training.
Conditioning predictions on examples provided at inference time.
Generating non-edges for link prediction training. Strategy matters enormously.
Masking node or edge features and predicting them. BERT-style pre-training for graphs.
Training on multiple objectives simultaneously for shared representations.
Training on easy examples first, gradually increasing difficulty.
Handling skewed label distributions. Critical for fraud where positives are 0.1%.
Sampling neighbors respecting time order to prevent information leakage.
When future information leaks into training. The silent killer of graph ML projects.
Train/val/test splits based on time, not random. Essential for production validity.
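A temporal split is mechanically trivial but easy to get wrong. A hedged sketch, with invented events and cutoffs: everything before the first cutoff trains the model, the next window validates it, and the rest tests it, so no future edge ever leaks backward.

```python
# Temporal train/val/test split: partition events strictly by timestamp.

def temporal_split(events, t_train, t_val):
    train = [e for e in events if e["t"] < t_train]
    val = [e for e in events if t_train <= e["t"] < t_val]
    test = [e for e in events if e["t"] >= t_val]
    return train, val, test

events = [{"edge": ("u1", "p1"), "t": 5},
          {"edge": ("u1", "p2"), "t": 12},
          {"edge": ("u2", "p1"), "t": 20},
          {"edge": ("u2", "p3"), "t": 31}]
train, val, test = temporal_split(events, t_train=10, t_val=25)
print(len(train), len(val), len(test))  # 1 2 1
```

A random split over the same events would let the model see edges from t=31 while predicting edges at t=5, the exact leakage pattern described above.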
The hard problems in graph ML. Understanding these separates production-ready practitioners from notebook experimenters.
Deep GNNs make all node representations converge. Why 2-3 layers is often optimal.
Exponential compression of long-range information through graph bottlenecks.
What structural patterns a GNN can and cannot distinguish.
The graph isomorphism test that defines the upper bound of GNN expressiveness.
Connected nodes tend to be similar. Most GNNs assume this.
Connected nodes are different. Requires specialized architectures.
New nodes with no edges. How to make predictions without graph context.
When edges are rare relative to possible connections. Most real graphs.
Power-law vs. uniform. Hub nodes dominate message passing.
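Over-smoothing can be demonstrated in miniature: repeatedly averaging each node with its neighbors (a crude stand-in for stacking many GNN layers) drives every representation toward the same value. The graph and feature values are toy examples.

```python
# Over-smoothing demo: 50 rounds of self-loop neighbor averaging on a path graph.

def smooth_step(features, adjacency):
    out = {}
    for node, neighbors in adjacency.items():
        group = [node] + neighbors            # include a self-loop
        out[node] = sum(features[n] for n in group) / len(group)
    return out

adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": 0.0, "b": 1.0, "c": 10.0}
for _ in range(50):                           # ~50 "layers" of smoothing
    features = smooth_step(features, adjacency)
spread = max(features.values()) - min(features.values())
print(round(spread, 6))  # 0.0 — all nodes have converged to the same value
```

The initial spread of 10.0 collapses to essentially zero, which is why depth in GNNs behaves so differently from depth in CNNs or transformers.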
Concepts specific to applying graph ML on enterprise relational databases. The bridge between academic GNNs and production data systems.
End-to-end learning directly on relational database schemas.
Every relational DB is a graph. Tables are node types, foreign keys are edges.
Automatically constructing graph topology from database foreign key constraints.
Why GNNs replace months of manual feature engineering with learned representations.
Making predictions that leverage information across all connected tables.
Combining multiple node/edge types with temporal dynamics. Enterprise reality.
Encoding any database schema without task-specific feature engineering.
SQL-like syntax for expressing prediction tasks on relational data.
When flat tables fail and graph structure provides the signal.
Large pre-trained models that generalize across tasks and domains.
Foundation models specifically designed for relational database prediction.
How GNN performance improves predictably with data, compute, and model size.
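The "tables are node types, foreign keys are edges" view can be sketched directly. Below, invented `customers` and `orders` rows are turned into graph edges, one per foreign-key reference; all table and column names are hypothetical.

```python
# Sketch: derive graph edges from rows plus a foreign-key column.

customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [{"order_id": 10, "customer_id": 1},
          {"order_id": 11, "customer_id": 1},
          {"order_id": 12, "customer_id": 2}]

def fk_edges(child_rows, child_key, fk_col, child_prefix, parent_prefix):
    """One edge per foreign-key reference: child row -> parent row."""
    return [(f"{child_prefix}{r[child_key]}", f"{parent_prefix}{r[fk_col]}")
            for r in child_rows]

edges = fk_edges(orders, "order_id", "customer_id", "order:", "customer:")
print(edges)
# [('order:10', 'customer:1'), ('order:11', 'customer:1'), ('order:12', 'customer:2')]
```

In a real system this mapping comes from the schema's declared foreign-key constraints rather than hard-coded column names, but the resulting structure is the same: a heterogeneous graph with one node type per table.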
Head-to-head comparisons that clarify when to use what. Cut through the confusion with direct, practical comparisons.
When graph structure helps vs. when full attention is better.
Grids vs. arbitrary topology. Why CNNs are a special case of GNNs.
Two paradigms for neighbor communication. Overlap and differences.
Frequency domain vs. direct neighbor operations. Theory and practice.
Graph convolution defined through eigenvalues of the graph Laplacian.
Graph convolution defined through direct neighbor aggregation.
Generalizing to unseen nodes and graphs. Required for production.
Predictions only on nodes seen during training. Simpler but limited.
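The spectral/spatial comparison has a concrete bridge: any polynomial filter of the Laplacian can be computed either way, with identical results. A numerical check on a 3-node path graph (the filter h(λ) = 1 − 0.5λ and the signal values are illustrative):

```python
import numpy as np

# Spectral filtering U h(Lambda) U^T x with h(lambda) = 1 - 0.5*lambda
# equals the spatial update y = x - 0.5 * L @ x, a one-hop neighbor operation.

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)       # path graph a - b - c
L = np.diag(A.sum(axis=1)) - A               # combinatorial Laplacian L = D - A
x = np.array([1.0, 2.0, 4.0])                # a signal on the nodes

evals, U = np.linalg.eigh(L)                 # eigendecomposition (L is symmetric)
y_spectral = U @ np.diag(1 - 0.5 * evals) @ U.T @ x
y_spatial = x - 0.5 * (L @ x)

print(np.allclose(y_spectral, y_spatial))    # True
```

This is the core of "theory and practice": spectral analysis explains what the filter does to graph frequencies, while the spatial form is what you actually run, since it never needs an eigendecomposition.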
Where graph ML creates real-world value. Each concept maps to specific industries and business outcomes.
Transaction networks that reveal fraud rings invisible to flat models.
User-item interaction graphs for collaborative filtering at scale.
Molecular graphs where atoms are nodes and bonds are edges.
Understanding influence, communities, and information flow in social graphs.
Road network graphs for real-time traffic flow forecasting.
Supplier-manufacturer-distributor networks for risk and optimization.
Predicting missing facts in knowledge bases using link prediction.
Finding unusual patterns in network structure and node behavior.
Predicting solubility, toxicity, and binding affinity from molecular graphs.
Residue contact graphs for protein folding and function prediction.
3D point clouds as dynamic graphs for autonomous driving and robotics.
Paper-cites-paper graphs for academic influence and topic modeling.
Friend, follow, and interaction networks for social platform features.
Financial transaction networks connecting accounts, merchants, and devices.
Products frequently bought together, forming recommendation signals.
How information, behavior, and trends spread through network connections.
Scaling, serving, and optimizing GNNs for real-world deployment. The engineering that turns research models into business systems.
Training on subsets of large graphs that don't fit in memory.
Sampling fixed-size neighborhoods for scalable training on large graphs.
Extracting meaningful subgraphs for distributed and parallel training.
Splitting large graphs across machines while minimizing cross-partition edges.
Adding or removing edges to improve information flow and reduce bottlenecks.
Creating smaller graphs that preserve structural properties of the original.
Reconstructing fine-grained graphs from coarsened representations.
Understanding why a GNN made a specific prediction. Subgraph explanations.
Standard datasets and metrics for comparing GNN performance fairly.
Serving GNN predictions at low latency in production environments.
Compressing large GNNs into smaller, faster models for deployment.
Normalizing node features between GNN layers for stable training.
Regularization strategies adapted for graph-structured data.
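Fixed-fanout neighbor sampling, the workhorse of scalable training above, fits in a few lines. A GraphSAGE-style sketch with an invented hub node; the point is that per-node cost is capped at the fanout, no matter how large the true neighborhood is.

```python
import random

# Fixed-fanout neighbor sampling: cap each node's sampled neighborhood
# so mini-batch cost stays bounded on huge graphs.

def sample_neighbors(adjacency, node, fanout, rng):
    neighbors = adjacency[node]
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)     # uniform sample without replacement

adjacency = {"hub": [f"n{i}" for i in range(1000)], "n1": ["hub"]}
rng = random.Random(42)
sampled = sample_neighbors(adjacency, "hub", fanout=10, rng=rng)
print(len(sampled))  # 10, even though the hub has 1000 neighbors
```

Stacking this per layer gives the familiar fanout schedule (e.g. 10 neighbors at hop one, 10 each at hop two), which also tames the hub-node degree-distribution problem noted earlier.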
KumoRFM applies these concepts automatically. You describe the prediction task in one line of PQL. The model handles the rest.