
Enrollment Forecasting

How many students will enroll next semester?

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example

How many students will enroll next semester?

Enrollment forecasting errors of 5-10% force universities to either over-hire faculty and over-allocate housing (wasting $3-5M) or under-prepare and deliver poor student experiences. Traditional funnel models treat each applicant independently, missing the network effects: when competing institutions change aid packages, when a program's reputation shifts in applicant peer groups, and when marketing campaigns reach connected prospective students. For a university with 5,000 incoming students, a 5% improvement in yield prediction saves $2-4M in misallocated resources.

Quick answer

Enrollment forecasting AI predicts how many admitted students will enroll (yield) per program by analyzing the applicant network: peer group decisions, financial aid competitiveness, marketing engagement, and competitor institution behavior. Traditional funnel models treat each applicant independently and miss network effects that drive yield. Graph-based models predict enrollment within 2% error per program, saving universities $2-4M annually in misallocated faculty, housing, and financial aid resources.

Approaches compared

4 ways to solve this problem

1. Historical Yield Rates

Apply last year's yield percentage to this year's admitted class. The simplest approach and the baseline every institution starts with.

Best for

Stable programs with consistent applicant profiles and minimal competitive pressure year-over-year.

Watch out for

Cannot adapt to changing conditions: new competitor programs, shifts in financial aid strategy, or demographic changes in the applicant pool. Errors of 5-10% are common and compound across programs.
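The baseline can be sketched in a few lines. The admitted counts and yield figures below echo the sample tables later on this page and are illustrative only:

```python
# Baseline: apply last year's yield rate to this year's admitted pool.
# Program names and numbers are illustrative, not real institutional data.

def project_enrollment(admitted: int, historical_yield: float) -> int:
    """Project enrolled headcount as admitted x last year's yield."""
    return round(admitted * historical_yield)

programs = {
    "Computer Science": (1050, 0.38),  # (admitted this cycle, last year's yield)
    "Nursing": (380, 0.52),
    "Business": (860, 0.41),
}

projections = {name: project_enrollment(a, y) for name, (a, y) in programs.items()}
# Note the fragility: a 5% yield miss on 1,050 admits is ~50 students of error
# for a single program, before errors compound across programs.
```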

2. Funnel Conversion Models

Track applicants through stages (inquiry, application, admission, deposit, enrollment) and model conversion rates at each stage. More granular than aggregate yield rates.

Best for

Understanding where in the funnel applicants drop off and which stages need attention.

Watch out for

Treats each applicant independently. Cannot capture that when one strong applicant in a peer group commits, it influences others in that group. Also misses how competitor aid offers affect yield across segments.
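A minimal funnel model multiplies stage-to-stage conversion rates. The stage rates here are hypothetical; real rates would come from historical CRM data:

```python
# Funnel model: chain conversion rates across stages to project enrollment.
# All rates below are hypothetical placeholders.

STAGE_RATES = {                      # conversion rate into the next stage
    "inquiry->application": 0.30,
    "application->admission": 0.45,
    "admission->deposit": 0.50,
    "deposit->enrollment": 0.95,
}

def project_from_inquiries(inquiries: int) -> int:
    """Expected enrollment = inquiries x product of all stage rates."""
    expected = float(inquiries)
    for rate in STAGE_RATES.values():
        expected *= rate
    return round(expected)
```

Each rate is estimated independently per stage, which is exactly the limitation noted above: nothing in this structure can express one applicant's decision influencing another's.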

3. Logistic Regression on Applicant Features

Build a statistical model predicting individual enrollment probability from applicant attributes (GPA, test scores, distance, aid offer). Straightforward and interpretable.

Best for

Programs with well-understood yield drivers and stable applicant demographics.

Watch out for

Static features miss the behavioral signals that drive enrollment decisions: campus visit engagement, email response patterns, peer group momentum. Cannot represent the competitive dynamics between institutions for the same student.
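Scoring one admitted student with a fitted logistic model might look like the sketch below. The coefficients, intercept, and feature names are made-up placeholders, not a real fitted model:

```python
import math

# Score an admitted student's enrollment probability with a logistic model.
# COEF and INTERCEPT are invented for illustration; a real model would be
# fit on past admission cycles.

COEF = {"gpa": 0.8, "test_score_z": 0.3, "distance_100mi": -0.25, "aid_10k": 0.6}
INTERCEPT = -3.5

def enroll_probability(features: dict) -> float:
    """Sigmoid of the linear score: P(enroll) = 1 / (1 + exp(-z))."""
    z = INTERCEPT + sum(COEF[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# A 3.8-GPA applicant, strong test score, ~400 miles away, $15K merit aid:
p = enroll_probability(
    {"gpa": 3.8, "test_score_z": 1.1, "distance_100mi": 4.0, "aid_10k": 1.5}
)
```

Every input is a static attribute of one applicant, which is the limitation described above: there is no term for a peer's deposit or a competitor's counter-offer.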

4. Graph Neural Networks (Kumo's Approach)

Connect applicants, programs, demographics, financial aid offers, and marketing touchpoints into an enrollment graph. GNNs learn yield patterns from the applicant network, including peer effects and competitive dynamics.

Best for

Large universities with diverse programs, competitive admissions, and complex financial aid strategies where network effects drive enrollment decisions.

Watch out for

Requires integrated data across admissions CRM, financial aid, and marketing systems. Less value-add for small, non-selective programs where yield is consistently high.
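To see why the graph view changes predictions, here is a deliberately toy (non-GNN) sketch of peer-effect propagation: each committed peer bumps an applicant's base probability by a fixed lift. The applicant IDs echo the sample data; the base probabilities and the 0.10 lift are invented constants, where a real GNN would learn these effects from the data:

```python
# Toy peer-effect propagation over an applicant graph. A GNN learns the
# edge weights and lifts; here they are hard-coded for illustration.

base_prob = {"APP001": 0.40, "APP002": 0.55, "APP003": 0.45}
committed = {"APP002"}                      # peers who already deposited
peers = {                                   # high-school / visit-day peer edges
    "APP001": ["APP002", "APP003"],
    "APP002": ["APP001"],
    "APP003": ["APP001", "APP002"],
}
PEER_LIFT = 0.10                            # made-up per-committed-peer bump

def adjusted_prob(applicant: str) -> float:
    """Base probability plus a lift for each committed peer, capped at 1.0."""
    lift = PEER_LIFT * sum(1 for p in peers[applicant] if p in committed)
    return min(1.0, base_prob[applicant] + lift)
```

APP001's probability rises from 0.40 to 0.50 once APP002 deposits, which is the kind of signal invisible to the independent-applicant models above.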

Key metric: Graph-based enrollment models predict yield within 2% per program versus 5-10% for funnel models. The accuracy gap comes from peer group effects and competitive dynamics that account for 30-40% of individual enrollment decisions.

Why relational data changes the answer

Enrollment decisions are social decisions. When a prospective student's high school friend commits to your CS program, the probability that the prospective student also commits increases 35%. When your financial aid offer is $5K below a competitor's for a specific demographic segment, yield drops 12% for that entire segment, not just the students who received competitor offers. When a campus visit day creates strong connections between admitted students, the group's yield rate lifts 20%. None of these dynamics are visible in individual applicant data.

Graph-based enrollment models capture these network effects by representing applicants, their peer groups, their aid offers relative to competitors, and their engagement patterns as a connected system. SAP's SALT benchmark shows graph models achieving 91% accuracy vs 63% for gradient-boosted trees on relational prediction tasks. RelBench shows GNNs at 76.71 vs 62.44 for tree-based models. In enrollment forecasting, this translates to predicting yield within 2% per program versus the 5-10% errors common with funnel models. For a university managing 5,000 incoming students, that accuracy difference is $2-4M in resources allocated to the right programs at the right time.

Predicting enrollment with individual applicant models is like predicting whether someone will attend a party by looking at their invitation status alone. You miss the social dynamics: their best friend is going, the competing event across town got cancelled, and three people from their study group already committed. Enrollment decisions ripple through peer networks. Graph-based models capture these ripples rather than treating each RSVP as an independent event.

How KumoRFM solves this

Graph-powered intelligence for education

Kumo connects applicants, programs, demographics, financial aid offers, and marketing touchpoints into an enrollment graph. The GNN learns yield patterns from the applicant network: how peer group decisions correlate, how aid package competitiveness affects yield by demographic segment, and which marketing sequences drive deposits. PQL predicts enrollment counts per program per semester, with enough lead time for resource planning.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Step 1: Your data

The relational tables Kumo learns from

APPLICANTS

applicant_id | program          | status   | gpa | test_score
APP001       | Computer Science | Admitted | 3.8 | 1420
APP002       | Nursing          | Admitted | 3.5 | 1280
APP003       | Business         | Admitted | 3.6 | 1350

PROGRAMS

program_id | name             | capacity | historical_yield
PROG01     | Computer Science | 400      | 38%
PROG02     | Nursing          | 200      | 52%
PROG03     | Business         | 350      | 41%

DEMOGRAPHICS

applicant_id | state      | income_tier | first_gen
APP001       | California | High        | No
APP002       | Texas      | Medium      | Yes
APP003       | New York   | High        | No

FINANCIAL_AID

applicant_id | merit_aid | need_aid | total_package
APP001       | $15,000   | $0       | $15,000
APP002       | $8,000    | $12,000  | $20,000
APP003       | $10,000   | $0       | $10,000

MARKETING

applicant_id | campus_visit | email_opens | event_attended
APP001       | Yes          | 12          | Open House
APP002       | No           | 4           | None
APP003       | Yes          | 8           | Admitted Students Day
Step 2: Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT COUNT(APPLICANTS.applicant_id, 0, 90, days)
FOR EACH PROGRAMS.program_id
WHERE APPLICANTS.status = 'Admitted'
Step 3: Prediction output

Every entity gets a score, updated continuously

PROGRAM          | ADMITTED | PREDICTED_ENROLLED | YIELD_RATE
Computer Science | 1,050    | 415                | 39.5%
Nursing          | 380      | 205                | 53.9%
Business         | 860      | 362                | 42.1%
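For reference, the YIELD_RATE column is simply PREDICTED_ENROLLED divided by ADMITTED, which you can verify from the output above:

```python
# Recompute yield rates from the prediction output shown above.
predictions = {
    "Computer Science": (1050, 415),   # (admitted, predicted_enrolled)
    "Nursing": (380, 205),
    "Business": (860, 362),
}

yield_rates = {
    name: round(100 * enrolled / admitted, 1)
    for name, (admitted, enrolled) in predictions.items()
}
```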
Step 4: Understand why

Every prediction includes feature attributions — no black boxes

Program: Computer Science -- Fall 2025 enrollment

Predicted: 415 enrolled (39.5% yield, +1.5% vs historical)

Top contributing features

Campus visit rate above average

42% visited

28% attribution

Aid competitiveness vs peer institutions

Above median

24% attribution

Applicant peer group deposit signals

Strong

20% attribution

Program ranking improvement

+5 spots

16% attribution

Marketing engagement (email + events)

High

12% attribution

Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability

Frequently asked questions

Common questions about enrollment forecasting

How accurate can enrollment forecasting AI be?

Graph-based models typically predict total enrollment within 2-3% error per program, and achieve 70-80% accuracy for individual applicant yield prediction. This compares to 5-10% program-level error for historical yield rates and 4-6% for regression-based models. The accuracy advantage comes from capturing peer group effects and competitive dynamics that individual-level models miss.

When should enrollment forecasting start each cycle?

Model predictions become useful 2-3 months before the deposit deadline, when behavioral signals (campus visits, email engagement, aid offer responses) start accumulating. The model updates continuously as new signals arrive, with accuracy improving each week. The most critical window is 4-6 weeks before the deposit deadline, when final enrollment counts for resource planning need to be locked in.

Can enrollment AI help optimize financial aid allocation?

Yes, and this is often the highest-ROI application. The model predicts yield at different aid levels per applicant segment, enabling aid offices to allocate merit and need-based aid to maximize yield within budget. Many institutions find they can achieve the same or higher yield with 5-10% less aid spend by targeting aid to the segments where it has the greatest impact on enrollment decisions.
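One way to picture aid targeting is a greedy sketch: spend each marginal $1K of a fixed merit-aid budget on the segment with the highest predicted yield lift per dollar. The segment names, lift figures, and caps below are illustrative assumptions, not outputs of any real model:

```python
import heapq

# Greedy aid allocation: fund segments in order of predicted yield lift
# per extra $1K of aid, up to each segment's cap. All figures are invented.

# (segment, yield lift per extra $1K of aid, max extra aid in $1K units)
segments = [
    ("in-state, high-need", 0.008, 5),
    ("out-of-state, mid-need", 0.004, 8),
    ("international, low-need", 0.001, 10),
]

def allocate(budget_k: int) -> dict:
    """Spend the budget where marginal yield lift is highest first."""
    heap = [(-lift, name, cap) for name, lift, cap in segments]
    heapq.heapify(heap)                      # max-lift segment pops first
    spend = {name: 0 for name, _, _ in segments}
    while budget_k > 0 and heap:
        _neg_lift, name, cap = heapq.heappop(heap)
        take = min(cap, budget_k)
        spend[name] += take
        budget_k -= take
    return spend
```

A real aid optimizer would use per-segment yield curves (lift is rarely constant in aid level) and budget constraints per aid type, but the greedy intuition is the same: move dollars to where the model says they change decisions.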

How does enrollment forecasting handle new programs with no history?

Graph-based models handle new programs better than historical approaches because they transfer knowledge from similar programs at the same institution and comparable programs at peer institutions. A new data science program inherits yield patterns from existing CS and statistics programs, adjusted for the new program's positioning. Accuracy is lower than mature programs initially (4-5% error vs 2-3%) but reaches full accuracy within 2-3 recruitment cycles.

Does enrollment forecasting AI work for graduate programs?

Yes, though the dynamics differ. Graduate enrollment is less driven by peer effects and more by research fit, funding packages, and faculty reputation. Graph-based models capture faculty-applicant affinity, research area matching, and funding competitiveness signals. For MBA and professional programs, the dynamics are closer to undergraduate (peer effects, brand competition, aid sensitivity).

Bottom line: A university with 5,000 incoming students saves $2-4M by predicting enrollment within 2% error per program. Kumo's enrollment graph captures peer group effects, aid competitiveness, and marketing attribution that funnel models miss.

Topics covered

enrollment forecasting AI, admissions yield prediction, student enrollment model, higher education forecasting, enrollment management ML, KumoRFM enrollment, yield rate prediction, admissions funnel model

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
