What data is needed for engagement prediction?

Kumo connects directly to your existing relational tables: SUBSCRIBERS, SESSIONS, CONTENT, SCHEDULES. No ETL or feature engineering required. Write a PQL query and get explainable predictions in minutes.

#3 · Regression · Engagement Forecasting

Engagement Prediction

“How many minutes will this user watch today?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

How many minutes will this user watch today?

Engagement is the leading indicator of subscriber health, ad inventory value, and content ROI. Platforms that can predict daily engagement per subscriber can optimize content scheduling, ad load balancing, and proactive retention. A platform with 40M subscribers that increases average daily engagement by 5 minutes generates $180M in additional ad revenue and prevents $30M in churn-related losses annually.
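As a rough sanity check on those figures, the sketch below derives the ad revenue per additional viewing hour that the numbers above imply. It assumes the 5-minute lift applies to every subscriber on all 365 days of the year, which the text does not state, and the variable names are illustrative.

```python
# Figures quoted in the text above.
subscribers = 40_000_000
extra_minutes_per_day = 5
extra_ad_revenue_usd = 180_000_000

# Assumption (not stated in the text): the lift holds for every
# subscriber on every day of the year.
days_per_year = 365

extra_minutes_per_year = subscribers * extra_minutes_per_day * days_per_year
revenue_per_extra_minute = extra_ad_revenue_usd / extra_minutes_per_year
revenue_per_extra_hour = revenue_per_extra_minute * 60

print(f"Extra viewing minutes per year: {extra_minutes_per_year:,}")
print(f"Implied ad revenue per extra hour viewed: ${revenue_per_extra_hour:.3f}")
```

Under these assumptions the quoted revenue works out to roughly $0.15 per additional hour viewed.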

Quick answer

Graph neural networks predict daily viewing minutes per subscriber by learning individual engagement rhythms from session data, content schedules, and behavioral signals. The model captures binge triggers (new season drops), device transitions, and household co-viewing patterns that aggregate models flatten away, enabling platforms to optimize content scheduling and ad load for $180M+ in additional annual revenue.

Approaches compared

4 ways to solve this problem

1. Historical averages (rolling mean per subscriber)

Predict tomorrow's viewing minutes as the trailing 7-day or 30-day average. The simplest baseline.

Best for

Quick baseline for stable viewers with consistent habits. Zero engineering required.

Watch out for

Cannot predict engagement spikes (new season drops, live events) or dips (travel, seasonal changes). Treats every day the same, missing the weekday/weekend and time-of-day patterns that drive most variance.
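A minimal sketch of this baseline, assuming one subscriber's daily minutes are available as a plain list ordered oldest first. The function name and the sample history are hypothetical.

```python
from collections import deque


def rolling_mean_forecast(daily_minutes, window=7):
    """Forecast the next day's minutes as the mean of the last `window` days."""
    if not daily_minutes:
        raise ValueError("need at least one day of history")
    # A bounded deque keeps only the trailing `window` values.
    recent = deque(daily_minutes, maxlen=window)
    return sum(recent) / len(recent)


# Hypothetical daily viewing minutes for one subscriber, oldest first.
history = [80, 90, 85, 70, 95, 100, 75]
forecast = rolling_mean_forecast(history, window=7)
print(forecast)
```

The forecast is the same whether or not a new season drops the next day, which is exactly the weakness described above.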

2. Time-series models (ARIMA, Prophet)

Model engagement as a time series per subscriber with trend, seasonality, and holiday effects.

Best for

Captures weekly and seasonal patterns. Good for aggregate-level forecasting across the platform.

Watch out for

Per-subscriber time series are noisy and sparse. Cannot incorporate external signals (content drops, marketing pushes) or cross-subscriber patterns without significant feature engineering.
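To illustrate the seasonality idea without a forecasting library, the sketch below uses a day-of-week average per subscriber. This is a much simpler stand-in, not ARIMA or Prophet, and the history is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekday_profile(history):
    """Average minutes per weekday (0 = Monday) from (date, minutes) pairs."""
    buckets = defaultdict(list)
    for day, minutes in history:
        buckets[day.weekday()].append(minutes)
    return {wd: sum(vals) / len(vals) for wd, vals in buckets.items()}


def seasonal_forecast(history, target_day):
    """Forecast with the weekday average; fall back to the overall mean."""
    profile = weekday_profile(history)
    overall = sum(minutes for _, minutes in history) / len(history)
    return profile.get(target_day.weekday(), overall)


# Hypothetical two weeks of history: 60 minutes on weekdays, 120 on weekends.
start = date(2025, 2, 17)  # a Monday
days = [start + timedelta(days=i) for i in range(14)]
history = [(d, 120 if d.weekday() >= 5 else 60) for d in days]

saturday_forecast = seasonal_forecast(history, date(2025, 3, 8))   # a Saturday
wednesday_forecast = seasonal_forecast(history, date(2025, 3, 5))  # a Wednesday
print(saturday_forecast, wednesday_forecast)
```

The weekly rhythm is captured, but a content drop on a Wednesday would still be forecast as an ordinary Wednesday.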

3. Gradient boosted trees on session features

Engineer features such as 'sessions last week,' 'avg duration per session,' and 'content in queue,' then train XGBoost to predict next-day minutes.

Best for

Strong baseline that captures the most predictive behavioral features. Handles mixed feature types well.

Watch out for

Misses the temporal dynamics within sessions (shortening duration, changing device patterns) and cannot model how content schedule changes interact with individual viewing habits.
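The feature engineering step might look like the sketch below, which builds two of the features named above from raw session rows. The function name, the 7-day window, and the sample sessions are assumptions. 'Content in queue' and the model training itself are omitted.

```python
from datetime import datetime, timedelta


def session_features(sessions, as_of):
    """Build trailing-7-day features from (timestamp, duration_min) pairs.

    Only sessions in [as_of - 7 days, as_of) are counted, so nothing
    from the prediction day itself leaks into the features.
    """
    window_start = as_of - timedelta(days=7)
    recent = [dur for ts, dur in sessions if window_start <= ts < as_of]
    return {
        "sessions_last_week": len(recent),
        "avg_duration_per_session": sum(recent) / len(recent) if recent else 0.0,
        "total_minutes_last_week": sum(recent),
    }


# Hypothetical sessions for one subscriber.
sessions = [
    (datetime(2025, 2, 20, 20, 0), 50),  # older than 7 days, excluded
    (datetime(2025, 2, 26, 21, 0), 70),
    (datetime(2025, 3, 1, 20, 0), 62),
]
features = session_features(sessions, as_of=datetime(2025, 3, 2))
print(features)
```

Each subscriber-day becomes one flat row like this, which is why within-session dynamics and schedule interactions are lost unless someone hand-builds a feature for them.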

4. KumoRFM (relational graph ML)

Connect subscribers, sessions, content, and schedules into a temporal graph. The GNN learns individual engagement rhythms and how external events (new releases, live sports) modulate them.

Best for

Subscriber-level daily predictions that account for binge triggers, device transitions, household co-viewing, and content schedule interactions. Highest accuracy for engagement spike prediction.

Watch out for

Requires session-level data with timestamps and content linkage. Adds most value when you have content schedule data and multiple interaction types beyond raw viewing.

Key metric: A 40M-subscriber platform that raises average daily engagement by 5 minutes, by predicting engagement at the subscriber level and optimizing content scheduling accordingly, generates $180M in additional annual ad revenue.

Why relational data changes the answer

Engagement is driven by the interplay between subscriber behavior and platform events. Subscriber SUB201 watches 85 minutes on average, but on the day a new season of a series they follow drops, the model predicts 142 minutes. Three signals drive that prediction: they historically binge 2.1x their average on release days, they use a Smart TV (longer sessions), and their household co-views on premiere nights. No single-table model captures this because the signal lives across four tables: subscriber history, session patterns, content schedules, and household data.

Relational models connect these tables and learn the conditional engagement patterns per subscriber. They predict not just the average but the spikes and dips that matter for content scheduling and ad revenue decisions. On the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines. In engagement prediction, the gap is especially valuable because the high-engagement days (content drops, live events) are where most ad revenue concentrates, and getting those predictions right has outsized financial impact.

Predicting engagement from average viewing time is like a restaurant estimating tonight's covers from last month's daily average. You miss that tonight is Valentine's Day, your chef just got reviewed in the local paper, and there is a concert at the venue next door. Graph-based engagement prediction reads the calendar, the marketing schedule, and each subscriber's personal history of responding to similar events, producing a prediction that accounts for the full context.

How KumoRFM solves this

Graph-powered intelligence for media platforms

Kumo connects subscribers, sessions, content, and schedules into a temporal graph. The model learns individual engagement rhythms: weekday vs. weekend patterns, binge triggers (new season drops), device transitions throughout the day, and how social viewing signals (household co-watching) amplify engagement. PQL forecasts daily minutes at the subscriber level, enabling personalized scheduling and ad load decisions.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

SUBSCRIBERS

subscriber_id	plan	avg_daily_minutes	preferred_time
SUB201	Premium	85	Evening
SUB202	Ad-supported	42	Afternoon
SUB203	Standard	110	Night

SESSIONS

session_id	subscriber_id	device	duration_min	timestamp
SES401	SUB201	Smart TV	62	2025-03-01 20:00
SES402	SUB202	Mobile	18	2025-03-01 14:30
SES403	SUB203	Smart TV	95	2025-03-01 22:00

CONTENT

content_id	type	genre	avg_engagement_min
SER301	Series	Drama	45
MOV401	Movie	Action	110
SER302	Series	Comedy	28

SCHEDULES

content_id	release_date	release_type	marketing_push
SER301	2025-03-05	New Season	Heavy
MOV401	2025-03-01	Premiere	Standard
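To make the table relationships concrete, here is a sketch that links the sample rows above into a small graph by following their key columns. It only illustrates the idea of a relational graph and is not how Kumo builds its graph internally. The adjacency-list layout and the helper name are assumptions.

```python
from collections import defaultdict

# Key columns copied from the sample tables above.
subscribers = ["SUB201", "SUB202", "SUB203"]
sessions = [  # (session_id, subscriber_id, timestamp)
    ("SES401", "SUB201", "2025-03-01 20:00"),
    ("SES402", "SUB202", "2025-03-01 14:30"),
    ("SES403", "SUB203", "2025-03-01 22:00"),
]
content = ["SER301", "MOV401", "SER302"]
schedules = [  # (content_id, release_date)
    ("SER301", "2025-03-05"),
    ("MOV401", "2025-03-01"),
]

# Undirected adjacency list; nodes are (table, key) pairs.
graph = defaultdict(list)


def add_edge(src, dst):
    graph[src].append(dst)
    graph[dst].append(src)


for session_id, subscriber_id, _timestamp in sessions:
    if subscriber_id in subscribers:  # skip dangling foreign keys
        add_edge(("SUBSCRIBERS", subscriber_id), ("SESSIONS", session_id))

for row_index, (content_id, _release_date) in enumerate(schedules):
    if content_id in content:
        add_edge(("CONTENT", content_id), ("SCHEDULES", row_index))

# The sample SESSIONS table shows no content_id column, so session-to-content
# edges cannot be built from these rows; a real schema would need that link.
for node, neighbors in graph.items():
    print(node, "->", neighbors)
```

Note that the sample rows connect subscribers to sessions and content to schedules, but nothing joins the two halves. Linking a session to the content watched requires a content reference on each session.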

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT SUM(SESSIONS.duration_min, 0, 1, days)
FOR EACH SUBSCRIBERS.subscriber_id
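One way to read the query: for each subscriber, sum SESSIONS.duration_min over the window from 0 to 1 days after the prediction time. The sketch below computes that quantity on historical rows, which is the kind of label a model would learn to predict. It is an interpretation of the query, not Kumo's implementation, and the window boundaries (exclusive start, inclusive end) are an assumption to check against the PQL documentation.

```python
from datetime import datetime, timedelta


def next_day_minutes(sessions, subscriber_id, anchor):
    """Sum duration_min for one subscriber's sessions in the day after `anchor`.

    Assumption: the window is (anchor, anchor + 1 day].
    """
    window_end = anchor + timedelta(days=1)
    return sum(
        duration
        for sub_id, duration, ts in sessions
        if sub_id == subscriber_id and anchor < ts <= window_end
    )


# Rows from the sample SESSIONS table: (subscriber_id, duration_min, timestamp).
sessions = [
    ("SUB201", 62, datetime(2025, 3, 1, 20, 0)),
    ("SUB202", 18, datetime(2025, 3, 1, 14, 30)),
    ("SUB203", 95, datetime(2025, 3, 1, 22, 0)),
]

anchor = datetime(2025, 3, 1)  # midnight at the start of March 1
targets = {
    sub: next_day_minutes(sessions, sub, anchor)
    for sub in ("SUB201", "SUB202", "SUB203")
}
print(targets)
```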

Prediction output

Every entity gets a score, updated continuously

SUBSCRIBER_ID	DATE	PREDICTED_MINUTES	VS_AVG
SUB201	2025-03-05	142	+67%
SUB202	2025-03-05	38	-10%
SUB203	2025-03-05	125	+14%
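The VS_AVG column appears to compare each prediction with the subscriber's avg_daily_minutes from the SUBSCRIBERS table. That reading is an assumption, but it reproduces the values shown, as this short check illustrates.

```python
def vs_avg(predicted_minutes, avg_daily_minutes):
    """Signed percentage difference, formatted like the VS_AVG column."""
    pct = (predicted_minutes - avg_daily_minutes) / avg_daily_minutes * 100
    return f"{round(pct):+d}%"


# (predicted_minutes, avg_daily_minutes) taken from the tables above.
rows = {"SUB201": (142, 85), "SUB202": (38, 42), "SUB203": (125, 110)}
vs_avg_column = {sub: vs_avg(pred, avg) for sub, (pred, avg) in rows.items()}
print(vs_avg_column)
```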

Understand why

Every prediction includes feature attributions — no black boxes

Subscriber SUB201: Premium plan, evening viewer

Predicted: 142 minutes on March 5 (+67% vs avg)

Top contributing features

Feature	Value	Attribution
New season drop for followed series	SER301	35%
Historical binge pattern on release days	2.1x avg	25%
Day of week (Wednesday = peak)	Wed	17%
Household co-viewing likelihood	High	13%
Device preference (Smart TV = longer)	Smart TV	10%

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about engagement prediction

How do you predict daily viewer engagement on a streaming platform?

Connect subscriber session data, content schedules, and behavioral signals in a graph model. The key is modeling engagement at the individual subscriber level, accounting for personal viewing rhythms, content drop triggers, device preferences, and household co-viewing patterns. Aggregate models miss the subscriber-level variance that drives most business decisions.

What drives engagement spikes on streaming platforms?

New season drops, live events, and heavy marketing pushes drive the largest spikes. But the magnitude varies by subscriber: some viewers binge immediately while others spread viewing over weeks. Graph models learn each subscriber's response pattern to these triggers, predicting who will spike and by how much.

How does engagement prediction improve ad revenue?

Knowing which subscribers will have high-engagement sessions tomorrow lets you optimize ad load and scheduling. You can increase ad density during predicted high-engagement sessions (where tolerance is higher) and reduce it during low-engagement sessions (where each ad risks abandonment). A 40M-subscriber platform gains $180M in annual ad revenue from this optimization.
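A sketch of how such a rule could be expressed, assuming the decision is driven by the ratio of predicted minutes to the subscriber's average. The function name and both thresholds are hypothetical placeholders, not values from the text.

```python
def ad_load_adjustment(predicted_minutes, avg_daily_minutes, high=1.25, low=0.95):
    """Return 'increase', 'decrease', or 'hold' from the predicted/average ratio.

    The thresholds are hypothetical; a real policy would be tuned against
    measured abandonment and revenue data.
    """
    ratio = predicted_minutes / avg_daily_minutes
    if ratio >= high:
        return "increase"
    if ratio < low:
        return "decrease"
    return "hold"


# (predicted_minutes, avg_daily_minutes) from the sample tables above.
rows = {"SUB201": (142, 85), "SUB202": (38, 42), "SUB203": (125, 110)}
decisions = {sub: ad_load_adjustment(pred, avg) for sub, (pred, avg) in rows.items()}
print(decisions)
```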

What data do you need for engagement prediction?

Session-level viewing data with timestamps, durations, and content IDs. Content schedule data (release dates, marketing push intensity). Subscriber profiles and device usage patterns. Household membership data if available. Each additional signal type improves the model's ability to predict engagement spikes and dips.

How accurate are engagement prediction models?

Graph-based models predict daily engagement with a mean absolute error (MAE) of roughly 15-20% at the subscriber level. Accuracy is highest for subscribers with 30+ days of history and for days with clear engagement drivers (content drops, live events). The business value comes less from average accuracy and more from correctly predicting the high-engagement days where ad revenue concentrates.
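For readers who want to compute a comparable figure on their own data, the sketch below shows MAE and one common way to express it as a percentage, namely MAE divided by mean actual minutes. The text does not say how its percentage is normalized, so that choice is an assumption, and the actual values are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted minutes."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_mae(actual, predicted):
    """MAE divided by mean actual minutes (assumed normalization)."""
    mean_actual = sum(actual) / len(actual)
    return mean_absolute_error(actual, predicted) / mean_actual


predicted = [142, 38, 125]  # predicted minutes from the sample output
actual = [150, 30, 140]     # hypothetical observed minutes

mae = mean_absolute_error(actual, predicted)
rel = relative_mae(actual, predicted)
print(f"MAE: {mae:.1f} minutes, relative MAE: {rel:.1%}")
```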

Bottom line: A 40M-subscriber platform that predicts daily engagement generates $180M in additional ad revenue by optimizing content scheduling and ad load. Kumo captures individual viewing rhythms, binge triggers, and social signals that aggregate models flatten away.

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
