Every concept in graph machine learning, explained with enterprise data examples. From message passing to graph transformers, from over-smoothing to relational deep learning.
120+ concepts · Agent-optimized · Enterprise examples on every page
The fundamental operations that make GNNs work. How information flows between nodes, how neighbors are aggregated, and how graph structure drives learning.
The foundation of all GNNs. Nodes send and receive messages along edges.
How a node combines information from its neighbors into a single vector.
Spectral and spatial filtering that generalizes CNNs to irregular graphs.
Learning which neighbors matter more through trainable attention weights.
Reducing node sets to graph-level representations for classification tasks.
Aggregating all node features into a single graph-level embedding.
Progressively coarsening graphs layer by layer, preserving structure.
Incorporating edge attributes into message passing for richer representations.
Residual links that prevent over-smoothing in deep GNN architectures.
Spreading node features through the graph before or during training.
Semi-supervised technique that spreads known labels through graph structure.
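The ideas above can be sketched in a few lines. Below is a minimal, illustrative round of message passing with mean aggregation on a toy graph; the graph, feature values, and function name are invented for illustration, not from any real GNN library.

```python
# Minimal sketch of one message-passing round with mean aggregation.
# Graph and feature values are toy examples.

def mean_aggregate(features, adjacency):
    """For each node, average its neighbors' feature vectors."""
    out = {}
    for node, neighbors in adjacency.items():
        if not neighbors:
            out[node] = features[node]  # isolated node keeps its own features
            continue
        dim = len(features[node])
        agg = [0.0] * dim
        for n in neighbors:
            for i in range(dim):
                agg[i] += features[n][i]
        out[node] = [v / len(neighbors) for v in agg]
    return out

adjacency = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
features = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [4.0, 2.0]}
result = mean_aggregate(features, adjacency)
print(result["a"])  # node "a" receives the mean of b and c: [2.0, 2.0]
```

A real GNN layer would follow this aggregation with a learned transformation and nonlinearity; the aggregation step itself is the part that is unique to graphs.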
Methods for representing nodes and graphs as dense vectors. From classical random walk approaches to modern positional encodings.
Mapping each node to a low-dimensional vector that captures its role in the graph.
Biased random walks that balance between BFS and DFS for flexible embeddings.
Stochastic traversals that capture local and global graph structure.
When lookup tables beat neural networks, and when they don't.
Giving GNNs a sense of where nodes sit in the overall graph structure.
Capturing local structural roles like hubs, bridges, and periphery nodes.
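Random-walk embeddings start with the walks themselves. A hedged sketch of uniform walk generation, the raw material DeepWalk-style methods feed into a skip-gram model (the graph and function names here are illustrative):

```python
import random

# Illustrative: generate fixed-length uniform random walks over a toy graph.
# node2vec would bias the neighbor choice; here every neighbor is equally likely.

def random_walk(adjacency, start, length, rng):
    walk = [start]
    for _ in range(length - 1):
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk

adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
rng = random.Random(0)
walks = [random_walk(adjacency, node, 5, rng) for node in adjacency]
print(walks)
```

Each walk is then treated like a sentence: nodes that co-occur in walks end up with nearby embedding vectors.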
Not all graphs are created equal. Enterprise data involves multiple node types, temporal edges, directed relationships, and more.
Single node type, single edge type. The simplest graph structure.
Multiple node and edge types. How real enterprise data actually looks.
Edges and nodes that change over time. Critical for fraud and churn.
Two disjoint node sets with edges only between them. Users and products.
Entity-relation-entity triples encoding structured world knowledge.
Graphs where topology evolves. New nodes appear, edges form and dissolve.
Asymmetric edges where direction carries meaning. Payments, follows, citations.
Positive and negative edges representing trust/distrust or agree/disagree.
Edges that connect more than two nodes. Group interactions and co-authorship.
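One way to make the heterogeneous-plus-temporal idea concrete: group edges by (source type, relation, destination type) and stamp each with a time. The node types, relation names, and timestamps below are invented for illustration.

```python
from collections import defaultdict

# Minimal heterogeneous, temporal edge store:
# edges keyed by (source type, relation, destination type), each with a timestamp.

edges = defaultdict(list)

def add_edge(edges, src_type, relation, dst_type, src, dst, t):
    edges[(src_type, relation, dst_type)].append((src, dst, t))

add_edge(edges, "user", "purchased", "product", "u1", "p9", t=100)
add_edge(edges, "user", "purchased", "product", "u2", "p9", t=130)
add_edge(edges, "user", "follows", "user", "u1", "u2", t=90)

# Edges visible at time t=120 for one relation (temporal filtering):
visible = [(s, d) for s, d, t in edges[("user", "purchased", "product")] if t <= 120]
print(visible)  # [('u1', 'p9')]
```

Filtering by timestamp like this is exactly what temporal models must do at every step to avoid peeking at future edges.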
The prediction problems you can solve with graph ML. Each maps directly to business outcomes like fraud scores, product recommendations, and customer segments.
Predicting a label for each node. Fraud/not-fraud, churn/retain.
Predicting whether an edge will form. Recommendations, knowledge graph completion.
Classifying entire graphs. Molecular toxicity, document categorization.
Predicting a continuous value per node. Customer LTV, credit score.
Finding clusters of densely connected nodes. Customer segments, fraud rings.
Creating new graphs that match a learned distribution. Drug design, network synthesis.
Determining which nodes refer to the same real-world entity.
Testing whether two graphs have identical structure. Expressiveness benchmark.
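The isomorphism test above has a classic approximate form: 1-Weisfeiler-Lehman color refinement. A hedged sketch follows; if two graphs end with different color histograms they are provably non-isomorphic, while equal histograms do not prove isomorphism. Graphs and names are illustrative.

```python
# 1-WL color refinement, run jointly over several graphs so colors are comparable.

def wl_colors(graphs, rounds=3):
    # graphs: list of adjacency dicts with globally unique node names
    adj = {node: nbrs for g in graphs for node, nbrs in g.items()}
    colors = {node: 0 for node in adj}
    for _ in range(rounds):
        # each node's new color = hash of (own color, multiset of neighbor colors)
        sigs = {n: (colors[n], tuple(sorted(colors[m] for m in adj[n]))) for n in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {n: palette[sigs[n]] for n in adj}
    return colors

def histogram(g, colors):
    h = {}
    for n in g:
        h[colors[n]] = h.get(colors[n], 0) + 1
    return h

# Triangle vs. 3-node path: degree patterns differ, so WL tells them apart.
tri = {"t1": ["t2", "t3"], "t2": ["t1", "t3"], "t3": ["t1", "t2"]}
path = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2"]}
colors = wl_colors([tri, path])
print(histogram(tri, colors) != histogram(path, colors))  # True
```

This same refinement loop is the upper bound on what standard message-passing GNNs can distinguish, which is why it appears again in the challenges section.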
Model designs beyond basic message passing. Graph transformers, autoencoders, and specialized architectures for different data types.
Full attention over graph nodes. Long-range dependencies without over-squashing.
Transformer architecture designed for multi-relation heterogeneous graphs.
Learning neighbor importance through attention coefficients.
Separate encoders for users and items, combined via dot product. Scalable recommendations.
Encode graph structure into latent space and reconstruct. Unsupervised graph learning.
Learning mappings between function spaces on graphs for physics simulations.
Architectures that respect geometric symmetries. Rotations, translations, reflections.
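The two-tower pattern is simple enough to sketch directly: separate encoders map users and items into the same space, and a dot product scores the pair. The weights below are fixed toy values, not trained parameters, and all names are illustrative.

```python
# Two-tower scoring sketch: encode user and item separately, score via dot product.

def encode(features, weights):
    # one linear layer: out[j] = sum_i features[i] * weights[i][j]
    dim_out = len(weights[0])
    return [sum(f * weights[i][j] for i, f in enumerate(features))
            for j in range(dim_out)]

def score(user_feats, item_feats, user_w, item_w):
    u = encode(user_feats, user_w)
    v = encode(item_feats, item_w)
    return sum(a * b for a, b in zip(u, v))

user_w = [[1.0, 0.0], [0.0, 1.0]]   # identity weights, for illustration only
item_w = [[1.0, 0.0], [0.0, 1.0]]
print(score([1.0, 2.0], [3.0, 1.0], user_w, item_w))  # 1*3 + 2*1 = 5.0
```

The design choice that makes this scalable: item embeddings depend only on the item tower, so they can be precomputed and served from an approximate-nearest-neighbor index.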
How to train GNNs effectively. Self-supervised pre-training, contrastive methods, temporal sampling strategies, and avoiding data leakage.
Learning representations from graph structure without labeled data.
Learning by pulling similar pairs together and pushing dissimilar pairs apart.
Edge dropping, feature masking, and subgraph sampling for robust training.
Applying knowledge from one graph domain to another.
Large-scale unsupervised training before task-specific fine-tuning.
Adapting a pre-trained model to a specific downstream task with limited labels.
Making predictions with only a handful of labeled examples.
Predicting on unseen classes or domains without any task-specific training.
Conditioning predictions on examples provided at inference time.
Generating non-edges for link prediction training. Strategy matters enormously.
Masking node or edge features and predicting them. BERT-style pre-training for graphs.
Training on multiple objectives simultaneously for shared representations.
Training on easy examples first, gradually increasing difficulty.
Handling skewed label distributions. Critical for fraud where positives are 0.1%.
Sampling neighbors respecting time order to prevent information leakage.
When future information leaks into training. The silent killer of graph ML projects.
Train/val/test splits based on time, not random. Essential for production validity.
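A temporal split is mechanically trivial but easy to get wrong. A hedged sketch, with invented events and cutoffs: everything before the first cutoff trains the model, the next window validates it, and the rest tests it, so no future edge ever leaks backward.

```python
# Temporal train/val/test split: partition events strictly by timestamp.

def temporal_split(events, t_train, t_val):
    train = [e for e in events if e["t"] < t_train]
    val = [e for e in events if t_train <= e["t"] < t_val]
    test = [e for e in events if e["t"] >= t_val]
    return train, val, test

events = [{"edge": ("u1", "p1"), "t": 5},
          {"edge": ("u1", "p2"), "t": 12},
          {"edge": ("u2", "p1"), "t": 20},
          {"edge": ("u2", "p3"), "t": 31}]
train, val, test = temporal_split(events, t_train=10, t_val=25)
print(len(train), len(val), len(test))  # 1 2 1
```

A random split over the same events would let the model see edges from t=31 while predicting edges at t=5, the exact leakage pattern described above.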
The hard problems in graph ML. Understanding these separates production-ready practitioners from notebook experimenters.
Deep GNNs make all node representations converge. Why 2-3 layers is often optimal.
Exponential compression of long-range information through graph bottlenecks.
What structural patterns a GNN can and cannot distinguish.
The graph isomorphism test that defines the upper bound of GNN expressiveness.
Connected nodes tend to be similar. Most GNNs assume this.
Connected nodes are different. Requires specialized architectures.
New nodes with no edges. How to make predictions without graph context.
When edges are rare relative to possible connections. Most real graphs.
Power-law vs. uniform. Hub nodes dominate message passing.
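Over-smoothing can be demonstrated in miniature: repeatedly averaging each node with its neighbors (a crude stand-in for stacking many GNN layers) drives every representation toward the same value. The graph and feature values are toy examples.

```python
# Over-smoothing demo: 50 rounds of self-loop neighbor averaging on a path graph.

def smooth_step(features, adjacency):
    out = {}
    for node, neighbors in adjacency.items():
        group = [node] + neighbors            # include a self-loop
        out[node] = sum(features[n] for n in group) / len(group)
    return out

adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": 0.0, "b": 1.0, "c": 10.0}
for _ in range(50):                           # ~50 "layers" of smoothing
    features = smooth_step(features, adjacency)
spread = max(features.values()) - min(features.values())
print(round(spread, 6))  # 0.0 — all nodes have converged to the same value
```

The initial spread of 10.0 collapses to essentially zero, which is why depth in GNNs behaves so differently from depth in CNNs or transformers.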
Concepts specific to applying graph ML on enterprise relational databases. The bridge between academic GNNs and production data systems.
End-to-end learning directly on relational database schemas.
Every relational DB is a graph. Tables are node types, foreign keys are edges.
Automatically constructing graph topology from database foreign key constraints.
Why GNNs replace months of manual feature engineering with learned representations.
Making predictions that leverage information across all connected tables.
Combining multiple node/edge types with temporal dynamics. Enterprise reality.
Encoding any database schema without task-specific feature engineering.
SQL-like syntax for expressing prediction tasks on relational data.
When flat tables fail and graph structure provides the signal.
Large pre-trained models that generalize across tasks and domains.
Foundation models specifically designed for relational database prediction.
How GNN performance improves predictably with data, compute, and model size.
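The "tables are node types, foreign keys are edges" view can be sketched directly. Below, invented `customers` and `orders` rows are turned into graph edges, one per foreign-key reference; all table and column names are hypothetical.

```python
# Sketch: derive graph edges from rows plus a foreign-key column.

customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [{"order_id": 10, "customer_id": 1},
          {"order_id": 11, "customer_id": 1},
          {"order_id": 12, "customer_id": 2}]

def fk_edges(child_rows, child_key, fk_col, child_prefix, parent_prefix):
    """One edge per foreign-key reference: child row -> parent row."""
    return [(f"{child_prefix}{r[child_key]}", f"{parent_prefix}{r[fk_col]}")
            for r in child_rows]

edges = fk_edges(orders, "order_id", "customer_id", "order:", "customer:")
print(edges)
# [('order:10', 'customer:1'), ('order:11', 'customer:1'), ('order:12', 'customer:2')]
```

In a real system this mapping comes from the schema's declared foreign-key constraints rather than hard-coded column names, but the resulting structure is the same: a heterogeneous graph with one node type per table.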
Head-to-head comparisons that clarify when to use what. Cut through the confusion with direct, practical comparisons.
When graph structure helps vs. when full attention is better.
Grids vs. arbitrary topology. Why CNNs are a special case of GNNs.
Two paradigms for neighbor communication. Overlap and differences.
Frequency domain vs. direct neighbor operations. Theory and practice.
Graph convolution defined through eigenvalues of the graph Laplacian.
Graph convolution defined through direct neighbor aggregation.
Generalizing to unseen nodes and graphs. Required for production.
Predictions only on nodes seen during training. Simpler but limited.
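The spectral/spatial comparison has a concrete bridge: any polynomial filter of the Laplacian can be computed either way, with identical results. A numerical check on a 3-node path graph (the filter h(λ) = 1 − 0.5λ and the signal values are illustrative):

```python
import numpy as np

# Spectral filtering U h(Lambda) U^T x with h(lambda) = 1 - 0.5*lambda
# equals the spatial update y = x - 0.5 * L @ x, a one-hop neighbor operation.

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)       # path graph a - b - c
L = np.diag(A.sum(axis=1)) - A               # combinatorial Laplacian L = D - A
x = np.array([1.0, 2.0, 4.0])                # a signal on the nodes

evals, U = np.linalg.eigh(L)                 # eigendecomposition (L is symmetric)
y_spectral = U @ np.diag(1 - 0.5 * evals) @ U.T @ x
y_spatial = x - 0.5 * (L @ x)

print(np.allclose(y_spectral, y_spatial))    # True
```

This is the core of "theory and practice": spectral analysis explains what the filter does to graph frequencies, while the spatial form is what you actually run, since it never needs an eigendecomposition.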
Where graph ML creates real-world value. Each concept maps to specific industries and business outcomes.
Transaction networks that reveal fraud rings invisible to flat models.
User-item interaction graphs for collaborative filtering at scale.
Molecular graphs where atoms are nodes and bonds are edges.
Understanding influence, communities, and information flow in social graphs.
Road network graphs for real-time traffic flow forecasting.
Supplier-manufacturer-distributor networks for risk and optimization.
Predicting missing facts in knowledge bases using link prediction.
Finding unusual patterns in network structure and node behavior.
Predicting solubility, toxicity, and binding affinity from molecular graphs.
Residue contact graphs for protein folding and function prediction.
3D point clouds as dynamic graphs for autonomous driving and robotics.
Paper-cites-paper graphs for academic influence and topic modeling.
Friend, follow, and interaction networks for social platform features.
Financial transaction networks connecting accounts, merchants, and devices.
Products frequently bought together, forming recommendation signals.
How information, behavior, and trends spread through network connections.
Scaling, serving, and optimizing GNNs for real-world deployment. The engineering that turns research models into business systems.
Training on subsets of large graphs that don't fit in memory.
Sampling fixed-size neighborhoods for scalable training on large graphs.
Extracting meaningful subgraphs for distributed and parallel training.
Splitting large graphs across machines while minimizing cross-partition edges.
Adding or removing edges to improve information flow and reduce bottlenecks.
Creating smaller graphs that preserve structural properties of the original.
Reconstructing fine-grained graphs from coarsened representations.
Understanding why a GNN made a specific prediction. Subgraph explanations.
Standard datasets and metrics for comparing GNN performance fairly.
Serving GNN predictions at low latency in production environments.
Compressing large GNNs into smaller, faster models for deployment.
Normalizing node features between GNN layers for stable training.
Regularization strategies adapted for graph-structured data.
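Fixed-fanout neighbor sampling, the workhorse of scalable training above, fits in a few lines. A GraphSAGE-style sketch with an invented hub node; the point is that per-node cost is capped at the fanout, no matter how large the true neighborhood is.

```python
import random

# Fixed-fanout neighbor sampling: cap each node's sampled neighborhood
# so mini-batch cost stays bounded on huge graphs.

def sample_neighbors(adjacency, node, fanout, rng):
    neighbors = adjacency[node]
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)     # uniform sample without replacement

adjacency = {"hub": [f"n{i}" for i in range(1000)], "n1": ["hub"]}
rng = random.Random(42)
sampled = sample_neighbors(adjacency, "hub", fanout=10, rng=rng)
print(len(sampled))  # 10, even though the hub has 1000 neighbors
```

Stacking this per layer gives the familiar fanout schedule (e.g. 10 neighbors at hop one, 10 each at hop two), which also tames the hub-node degree-distribution problem noted earlier.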
KumoRFM applies these concepts automatically. You describe the prediction task in one line of PQL. The model handles the rest.