Regression · Engagement Forecasting

Engagement Prediction

How many minutes will this user watch today?

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

How many minutes will this user watch today?

Engagement is the leading indicator of subscriber health, ad inventory value, and content ROI. Platforms that can predict daily engagement per subscriber can optimize content scheduling, ad load balancing, and proactive retention. A platform with 40M subscribers that increases average daily engagement by 5 minutes generates $180M in additional ad revenue and prevents $30M in churn-related losses annually.
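To put the headline figures on a per-subscriber footing, the arithmetic can be sketched as follows. The dollar amounts, subscriber count, and 5-minute uplift come from the paragraph above; the assumption that the uplift applies to every subscriber on all 365 days is ours, not the source's.

```python
# Figures quoted in the text above.
subscribers = 40_000_000
extra_minutes_per_day = 5
ad_revenue_gain = 180_000_000      # USD per year
churn_loss_avoided = 30_000_000    # USD per year

# Assumption (not stated in the source): the uplift applies to every
# subscriber on all 365 days of the year.
extra_minutes_per_year = subscribers * extra_minutes_per_day * 365

revenue_per_subscriber = ad_revenue_gain / subscribers
revenue_per_1000_minutes = ad_revenue_gain / extra_minutes_per_year * 1000
total_per_subscriber = (ad_revenue_gain + churn_loss_avoided) / subscribers

print(f"Extra minutes per year:        {extra_minutes_per_year:,}")
print(f"Ad revenue per subscriber:     ${revenue_per_subscriber:.2f}")
print(f"Ad revenue per 1,000 minutes:  ${revenue_per_1000_minutes:.2f}")
print(f"Total benefit per subscriber:  ${total_per_subscriber:.2f}")
```

Under that assumption the claim works out to about $4.50 in ad revenue per subscriber per year, or roughly $2.47 per 1,000 incremental viewing minutes.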

Quick answer

Graph neural networks predict daily viewing minutes per subscriber by learning individual engagement rhythms from session data, content schedules, and behavioral signals. The model captures binge triggers (new season drops), device transitions, and household co-viewing patterns that aggregate models flatten away, enabling platforms to optimize content scheduling and ad load for $180M+ in additional annual revenue.

Approaches compared

4 ways to solve this problem

1. Historical averages (rolling mean per subscriber)

Predict tomorrow's viewing minutes as the trailing 7-day or 30-day average. The simplest baseline.

Best for

Quick baseline for stable viewers with consistent habits. Zero engineering required.

Watch out for

Cannot predict engagement spikes (new season drops, live events) or dips (travel, seasonal changes). Treats every day the same, missing the weekday/weekend and time-of-day patterns that drive most variance.
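A minimal sketch of this baseline, using only the Python standard library. The history values and the function name `rolling_mean_forecast` are hypothetical and chosen for illustration.

```python
from statistics import mean

def rolling_mean_forecast(daily_minutes, window=7):
    """Forecast the next day's minutes as the mean of the last `window` days."""
    if not daily_minutes:
        raise ValueError("need at least one day of history")
    return mean(daily_minutes[-window:])

# Hypothetical history for one subscriber (minutes per day, oldest first).
history = [80, 90, 85, 70, 95, 100, 75]

forecast = rolling_mean_forecast(history, window=7)
print(forecast)

# On a release day the subscriber watches far more, and the baseline has
# no input that could anticipate it.
actual_on_release_day = 142
miss = actual_on_release_day - forecast
print(miss)
```

The forecast only moves after the spike has already entered the trailing window, which is why this approach lags every event it is supposed to anticipate.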

2. Time-series models (ARIMA, Prophet)

Model engagement as a time series per subscriber with trend, seasonality, and holiday effects.

Best for

Captures weekly and seasonal patterns. Good for aggregate-level forecasting across the platform.

Watch out for

Per-subscriber time series are noisy and sparse. Cannot incorporate external signals (content drops, marketing pushes) or cross-subscriber patterns without significant feature engineering.
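The weekly seasonality these models capture can be illustrated with a much simpler stand-in: a day-of-week average. This is not ARIMA or Prophet, only a sketch of the seasonal idea, and the history below is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

def weekday_profile(history):
    """Average minutes per weekday (0 = Monday) from (date, minutes) pairs."""
    buckets = defaultdict(list)
    for day, minutes in history:
        buckets[day.weekday()].append(minutes)
    return {weekday: mean(values) for weekday, values in buckets.items()}

def seasonal_forecast(history, target_day):
    """Predict with the target weekday's average, else the overall mean."""
    profile = weekday_profile(history)
    overall = mean([minutes for _, minutes in history])
    return profile.get(target_day.weekday(), overall)

# Hypothetical two weeks of history: 60 minutes on weekdays, 120 on weekends.
start = date(2025, 2, 17)  # a Monday
history = []
for offset in range(14):
    day = start + timedelta(days=offset)
    history.append((day, 120 if day.weekday() >= 5 else 60))

saturday_forecast = seasonal_forecast(history, date(2025, 3, 8))
wednesday_forecast = seasonal_forecast(history, date(2025, 3, 5))
print(saturday_forecast, wednesday_forecast)
```

The profile knows that Wednesdays are quieter than Saturdays, but nothing in it knows that a new season drops on Wednesday, March 5. That is the external signal the text says these models cannot use without extra feature engineering.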

3. Gradient boosted trees on session features

Engineer features like 'sessions last week,' 'avg duration per session,' 'content in queue,' and train XGBoost to predict next-day minutes.

Best for

Strong baseline that captures the most predictive behavioral features. Handles mixed feature types well.

Watch out for

Misses the temporal dynamics within sessions (shortening duration, changing device patterns) and cannot model how content schedule changes interact with individual viewing habits.
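The feature engineering step might look like the sketch below, which computes two of the features named above from a session log. The session rows, the `session_features` helper, and the 7-day lookback are assumptions for illustration; the resulting dictionary would be one training row for a model such as XGBoost.

```python
from datetime import datetime, timedelta

# Hypothetical session log for one subscriber: (start time, duration in minutes).
sessions = [
    (datetime(2025, 2, 20, 20, 0), 50),   # older than 7 days, so excluded
    (datetime(2025, 2, 26, 21, 0), 70),
    (datetime(2025, 2, 28, 20, 30), 55),
    (datetime(2025, 3, 1, 20, 0), 62),
]

def session_features(sessions, as_of, lookback_days=7):
    """Aggregate sessions that started in [as_of - lookback, as_of)."""
    window_start = as_of - timedelta(days=lookback_days)
    recent = [minutes for start, minutes in sessions if window_start <= start < as_of]
    count = len(recent)
    return {
        "sessions_last_week": count,
        "total_minutes_last_week": sum(recent),
        "avg_duration_per_session": sum(recent) / count if count else 0.0,
    }

features = session_features(sessions, as_of=datetime(2025, 3, 2))
print(features)
```

Only sessions that started before `as_of` are used, so the features do not leak information from the day being predicted. Each aggregate collapses the session sequence into one number, which is where the within-session dynamics mentioned above are lost.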

4. KumoRFM (relational graph ML)

Connect subscribers, sessions, content, and schedules into a temporal graph. The GNN learns individual engagement rhythms and how external events (new releases, live sports) modulate them.

Best for

Subscriber-level daily predictions that account for binge triggers, device transitions, household co-viewing, and content schedule interactions. Highest accuracy for engagement spike prediction.

Watch out for

Requires session-level data with timestamps and content linkage. Adds most value when you have content schedule data and multiple interaction types beyond raw viewing.

Key metric: A 40M-subscriber platform generates $180M in additional annual ad revenue by predicting daily engagement at the subscriber level and optimizing content scheduling accordingly.

Why relational data changes the answer

Engagement is driven by the interplay between subscriber behavior and platform events. Subscriber SUB201 watches 85 minutes on an average day. On the day a new season of a series they follow drops, the prediction rises to 142 minutes, because they historically binge at 2.1x their average on release days, they watch on a Smart TV (longer sessions), and their household co-views on premiere nights. No single-table model captures this because the signal lives across four tables: subscriber history, session patterns, content schedules, and household data.

Relational models connect these tables and learn the conditional engagement patterns per subscriber. They predict not just the average but the spikes and dips that matter for content scheduling and ad revenue decisions. On the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines. In engagement prediction, the gap is especially valuable because the high-engagement days (content drops, live events) are where most ad revenue concentrates, and getting those predictions right has outsized financial impact.

Predicting engagement from average viewing time is like a restaurant estimating tonight's covers from last month's daily average. You miss that tonight is Valentine's Day, your chef just got reviewed in the local paper, and there is a concert at the venue next door. Graph-based engagement prediction reads the calendar, the marketing schedule, and each subscriber's personal history of responding to similar events, producing a prediction that accounts for the full context.

How KumoRFM solves this

Graph-powered intelligence for media platforms

Kumo connects subscribers, sessions, content, and schedules into a temporal graph. The model learns individual engagement rhythms: weekday vs. weekend patterns, binge triggers (new season drops), device transitions throughout the day, and how social viewing signals (household co-watching) amplify engagement. PQL forecasts daily minutes at the subscriber level, enabling personalized scheduling and ad load decisions.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

1. Your data

The relational tables Kumo learns from

SUBSCRIBERS

| subscriber_id | plan | avg_daily_minutes | preferred_time |
|---|---|---|---|
| SUB201 | Premium | 85 | Evening |
| SUB202 | Ad-supported | 42 | Afternoon |
| SUB203 | Standard | 110 | Night |

SESSIONS

| session_id | subscriber_id | device | duration_min | timestamp |
|---|---|---|---|---|
| SES401 | SUB201 | Smart TV | 62 | 2025-03-01 20:00 |
| SES402 | SUB202 | Mobile | 18 | 2025-03-01 14:30 |
| SES403 | SUB203 | Smart TV | 95 | 2025-03-01 22:00 |

CONTENT

| content_id | type | genre | avg_engagement_min |
|---|---|---|---|
| SER301 | Series | Drama | 45 |
| MOV401 | Movie | Action | 110 |
| SER302 | Series | Comedy | 28 |

SCHEDULES

| content_id | release_date | release_type | marketing_push |
|---|---|---|---|
| SER301 | 2025-03-05 | New Season | Heavy |
| MOV401 | 2025-03-01 | Premiere | Standard |
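To make "connecting the tables into a temporal graph" concrete, here is a rough sketch that links rows from the sample tables into a timestamped adjacency list. It illustrates the data structure only and is not a description of how Kumo builds its graph internally; the node naming scheme and the `add_edge` helper are hypothetical.

```python
from collections import defaultdict

# Rows copied from the sample SUBSCRIBERS, SESSIONS, and SCHEDULES tables.
subscribers = ["SUB201", "SUB202", "SUB203"]
sessions = [
    ("SES401", "SUB201", "2025-03-01 20:00"),
    ("SES402", "SUB202", "2025-03-01 14:30"),
    ("SES403", "SUB203", "2025-03-01 22:00"),
]
schedules = [
    ("SER301", "2025-03-05"),
    ("MOV401", "2025-03-01"),
]

# Adjacency list keyed by (node_type, node_id). Every edge carries a timestamp
# so a model can be restricted to facts known before the prediction time.
graph = defaultdict(list)

def add_edge(src, dst, timestamp):
    graph[src].append((dst, timestamp))
    graph[dst].append((src, timestamp))

for subscriber_id in subscribers:
    graph.setdefault(("subscriber", subscriber_id), [])

for session_id, subscriber_id, started_at in sessions:
    add_edge(("subscriber", subscriber_id), ("session", session_id), started_at)

for content_id, release_date in schedules:
    schedule_node = ("schedule", f"{content_id}@{release_date}")
    add_edge(("content", content_id), schedule_node, release_date)

# The sample SESSIONS table shows no content_id column, so session-to-content
# edges are omitted; with that column they would be added the same way.

print(graph[("subscriber", "SUB201")])
print(len(graph), "nodes")
```

Without a session-to-content link, the subscriber side and the content side of this graph stay disconnected, which is why content linkage is listed as a data requirement.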

2. Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT SUM(SESSIONS.duration_min, 0, 1, days)
FOR EACH SUBSCRIBERS.subscriber_id
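Read as a regression target, the query asks for the sum of `duration_min` over each subscriber's sessions in the one-day window after the prediction time. The sketch below computes that quantity from historical rows, which is one way to picture the label a model would be evaluated against. The window boundaries (exclusive start, inclusive end), the attribution of a session to its start time, and the extra session rows are assumptions, not documented PQL behavior.

```python
from datetime import datetime, timedelta

# Sessions as (subscriber_id, start time, duration_min). The first three rows
# come from the sample SESSIONS table; the rest are hypothetical.
sessions = [
    ("SUB201", datetime(2025, 3, 1, 20, 0), 62),
    ("SUB202", datetime(2025, 3, 1, 14, 30), 18),
    ("SUB203", datetime(2025, 3, 1, 22, 0), 95),
    ("SUB201", datetime(2025, 3, 5, 19, 0), 88),   # hypothetical
    ("SUB201", datetime(2025, 3, 5, 21, 30), 47),  # hypothetical
    ("SUB202", datetime(2025, 3, 5, 14, 0), 41),   # hypothetical
]

def next_day_minutes(sessions, subscriber_ids, anchor):
    """Sum duration_min per subscriber for sessions starting in (anchor, anchor + 1 day]."""
    window_end = anchor + timedelta(days=1)
    totals = {subscriber_id: 0 for subscriber_id in subscriber_ids}
    for subscriber_id, start, minutes in sessions:
        if subscriber_id in totals and anchor < start <= window_end:
            totals[subscriber_id] += minutes
    return totals

labels = next_day_minutes(
    sessions,
    ["SUB201", "SUB202", "SUB203"],
    anchor=datetime(2025, 3, 5),
)
print(labels)
```

A subscriber with no sessions in the window gets a label of 0 rather than being dropped, so inactive days still count toward the forecast error.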

3. Prediction output

Every entity gets a score, updated continuously

| SUBSCRIBER_ID | DATE | PREDICTED_MINUTES | VS_AVG |
|---|---|---|---|
| SUB201 | 2025-03-05 | 142 | +67% |
| SUB202 | 2025-03-05 | 38 | -10% |
| SUB203 | 2025-03-05 | 125 | +14% |
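The VS_AVG column is not defined on the page, but its values match the relative change of PREDICTED_MINUTES against each subscriber's `avg_daily_minutes` from the SUBSCRIBERS table. A short check under that assumption:

```python
# avg_daily_minutes from the SUBSCRIBERS table and PREDICTED_MINUTES from the
# prediction output.
avg_daily_minutes = {"SUB201": 85, "SUB202": 42, "SUB203": 110}
predicted_minutes = {"SUB201": 142, "SUB202": 38, "SUB203": 125}

def vs_avg(predicted, average):
    """Relative change of the prediction against the average, in percent."""
    return (predicted - average) / average * 100

for subscriber_id, predicted in predicted_minutes.items():
    change = vs_avg(predicted, avg_daily_minutes[subscriber_id])
    print(f"{subscriber_id}: {change:+.0f}%")
# SUB201: +67%
# SUB202: -10%
# SUB203: +14%
```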

4. Understand why

Every prediction includes feature attributions — no black boxes

Subscriber SUB201 -- Premium plan, evening viewer

Predicted: 142 minutes on March 5 (+67% vs avg)

Top contributing features

| Feature | Value | Attribution |
|---|---|---|
| New season drop for followed series | SER301 | 35% |
| Historical binge pattern on release days | 2.1x avg | 25% |
| Day of week (Wednesday = peak) | Wed | 17% |
| Household co-viewing likelihood | High | 13% |
| Device preference (Smart TV = longer) | Smart TV | 10% |

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about engagement prediction

How do you predict daily viewer engagement on a streaming platform?

Connect subscriber session data, content schedules, and behavioral signals in a graph model. The key is modeling engagement at the individual subscriber level, accounting for personal viewing rhythms, content drop triggers, device preferences, and household co-viewing patterns. Aggregate models miss the subscriber-level variance that drives most business decisions.

What drives engagement spikes on streaming platforms?

New season drops, live events, and heavy marketing pushes drive the largest spikes. But the magnitude varies by subscriber: some viewers binge immediately while others spread viewing over weeks. Graph models learn each subscriber's response pattern to these triggers, predicting who will spike and by how much.

How does engagement prediction improve ad revenue?

Knowing which subscribers will have high-engagement sessions tomorrow lets you optimize ad load and scheduling. You can increase ad density during predicted high-engagement sessions (where tolerance is higher) and reduce it during low-engagement sessions (where each ad risks abandonment). A 40M-subscriber platform gains $180M in annual ad revenue from this optimization.

What data do you need for engagement prediction?

Session-level viewing data with timestamps, durations, and content IDs. Content schedule data (release dates, marketing push intensity). Subscriber profiles and device usage patterns. Household membership data if available. Each additional signal type improves the model's ability to predict engagement spikes and dips.

How accurate are engagement prediction models?

Graph-based models predict daily engagement within 15-20% MAE at the subscriber level. Accuracy is highest for subscribers with 30+ days of history and for days with clear engagement drivers (content drops, live events). The business value comes less from average accuracy and more from correctly predicting the high-engagement days where ad revenue concentrates.
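MAE is normally expressed in the units of the target (here, minutes), so a percentage figure suggests a normalized variant. The page does not say which normalization is used. The sketch below shows one plausible reading, MAE divided by mean actual minutes; the `actual` values are hypothetical and the helper names are ours.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same units as the inputs (minutes)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def relative_mae(actual, predicted):
    """MAE divided by the mean actual value, as a fraction."""
    mean_actual = sum(actual) / len(actual)
    return mean_absolute_error(actual, predicted) / mean_actual

predicted = [142, 38, 125]   # PREDICTED_MINUTES from the prediction output
actual = [170, 30, 110]      # hypothetical observed minutes for the same day

mae = mean_absolute_error(actual, predicted)
rel = relative_mae(actual, predicted)
print(f"MAE: {mae:.1f} minutes")
print(f"Relative MAE: {rel:.1%}")
```

Because the business value concentrates on high-engagement days, it can also be useful to report the same metric separately for release days and ordinary days rather than as a single average.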

Bottom line: A 40M-subscriber platform that predicts daily engagement generates $180M in additional ad revenue by optimizing content scheduling and ad load. Kumo captures individual viewing rhythms, binge triggers, and social signals that aggregate models flatten away.

Topics covered

viewer engagement prediction, streaming engagement AI, watch time forecasting, daily engagement model, media engagement ML, KumoRFM engagement, session duration prediction, viewer behavior forecasting

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
