
Enrollment Forecasting

How many students will enroll next semester?

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example

How many students will enroll next semester?

Enrollment forecasting errors of 5-10% force universities to either over-hire faculty and over-allocate housing (wasting $3-5M) or under-prepare and deliver poor student experiences. Traditional funnel models treat each applicant independently, missing the network effects: when competing institutions change aid packages, when a program's reputation shifts in applicant peer groups, and when marketing campaigns reach connected prospective students. For a university with 5,000 incoming students, a 5% improvement in yield prediction saves $2-4M in misallocated resources.

Quick answer

Enrollment forecasting AI predicts how many admitted students will enroll (yield) per program by analyzing the applicant network: peer group decisions, financial aid competitiveness, marketing engagement, and competitor institution behavior. Traditional funnel models treat each applicant independently and miss network effects that drive yield. Graph-based models predict enrollment within 2% error per program, saving universities $2-4M annually in misallocated faculty, housing, and financial aid resources.

Approaches compared

4 ways to solve this problem

1. Historical Yield Rates

Apply last year's yield percentage to this year's admitted class. The simplest approach and the baseline every institution starts with.

Best for

Stable programs with consistent applicant profiles and minimal competitive pressure year-over-year.

Watch out for

Cannot adapt to changing conditions: new competitor programs, shifts in financial aid strategy, or demographic changes in the applicant pool. Errors of 5-10% are common and compound across programs.
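The baseline can be sketched in a few lines. The admitted counts and yield figures below echo the sample tables later on this page and are illustrative only:

```python
# Baseline: apply last year's yield rate to this year's admitted pool.
# Program names and numbers are illustrative, not real institutional data.

def project_enrollment(admitted: int, historical_yield: float) -> int:
    """Project enrolled headcount as admitted x last year's yield."""
    return round(admitted * historical_yield)

programs = {
    "Computer Science": (1050, 0.38),  # (admitted this cycle, last year's yield)
    "Nursing": (380, 0.52),
    "Business": (860, 0.41),
}

projections = {name: project_enrollment(a, y) for name, (a, y) in programs.items()}
# Note the fragility: a 5% yield miss on 1,050 admits is ~50 students of error
# for a single program, before errors compound across programs.
```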

2. Funnel Conversion Models

Track applicants through stages (inquiry, application, admission, deposit, enrollment) and model conversion rates at each stage. More granular than aggregate yield rates.

Best for

Understanding where in the funnel applicants drop off and which stages need attention.

Watch out for

Treats each applicant independently. Cannot capture that when one strong applicant in a peer group commits, it influences others in that group. Also misses how competitor aid offers affect yield across segments.
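A minimal funnel model multiplies stage-to-stage conversion rates. The stage rates here are hypothetical; real rates would come from historical CRM data:

```python
# Funnel model: chain conversion rates across stages to project enrollment.
# All rates below are hypothetical placeholders.

STAGE_RATES = {                      # conversion rate into the next stage
    "inquiry->application": 0.30,
    "application->admission": 0.45,
    "admission->deposit": 0.50,
    "deposit->enrollment": 0.95,
}

def project_from_inquiries(inquiries: int) -> int:
    """Expected enrollment = inquiries x product of all stage rates."""
    expected = float(inquiries)
    for rate in STAGE_RATES.values():
        expected *= rate
    return round(expected)
```

Each rate is estimated independently per stage, which is exactly the limitation noted above: nothing in this structure can express one applicant's decision influencing another's.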

3. Logistic Regression on Applicant Features

Build a statistical model predicting individual enrollment probability from applicant attributes (GPA, test scores, distance, aid offer). Straightforward and interpretable.

Best for

Programs with well-understood yield drivers and stable applicant demographics.

Watch out for

Static features miss the behavioral signals that drive enrollment decisions: campus visit engagement, email response patterns, peer group momentum. Cannot represent the competitive dynamics between institutions for the same student.
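Scoring one admitted student with a fitted logistic model might look like the sketch below. The coefficients, intercept, and feature names are made-up placeholders, not a real fitted model:

```python
import math

# Score an admitted student's enrollment probability with a logistic model.
# COEF and INTERCEPT are invented for illustration; a real model would be
# fit on past admission cycles.

COEF = {"gpa": 0.8, "test_score_z": 0.3, "distance_100mi": -0.25, "aid_10k": 0.6}
INTERCEPT = -3.5

def enroll_probability(features: dict) -> float:
    """Sigmoid of the linear score: P(enroll) = 1 / (1 + exp(-z))."""
    z = INTERCEPT + sum(COEF[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# A 3.8-GPA applicant, strong test score, ~400 miles away, $15K merit aid:
p = enroll_probability(
    {"gpa": 3.8, "test_score_z": 1.1, "distance_100mi": 4.0, "aid_10k": 1.5}
)
```

Every input is a static attribute of one applicant, which is the limitation described above: there is no term for a peer's deposit or a competitor's counter-offer.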

4. Graph Neural Networks (Kumo's Approach)

Connect applicants, programs, demographics, financial aid offers, and marketing touchpoints into an enrollment graph. GNNs learn yield patterns from the applicant network, including peer effects and competitive dynamics.

Best for

Large universities with diverse programs, competitive admissions, and complex financial aid strategies where network effects drive enrollment decisions.

Watch out for

Requires integrated data across admissions CRM, financial aid, and marketing systems. Less value-add for small, non-selective programs where yield is consistently high.
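To see why the graph view changes predictions, here is a deliberately toy (non-GNN) sketch of peer-effect propagation: each committed peer bumps an applicant's base probability by a fixed lift. The applicant IDs echo the sample data; the base probabilities and the 0.10 lift are invented constants, where a real GNN would learn these effects from the data:

```python
# Toy peer-effect propagation over an applicant graph. A GNN learns the
# edge weights and lifts; here they are hard-coded for illustration.

base_prob = {"APP001": 0.40, "APP002": 0.55, "APP003": 0.45}
committed = {"APP002"}                      # peers who already deposited
peers = {                                   # high-school / visit-day peer edges
    "APP001": ["APP002", "APP003"],
    "APP002": ["APP001"],
    "APP003": ["APP001", "APP002"],
}
PEER_LIFT = 0.10                            # made-up per-committed-peer bump

def adjusted_prob(applicant: str) -> float:
    """Base probability plus a lift for each committed peer, capped at 1.0."""
    lift = PEER_LIFT * sum(1 for p in peers[applicant] if p in committed)
    return min(1.0, base_prob[applicant] + lift)
```

APP001's probability rises from 0.40 to 0.50 once APP002 deposits, which is the kind of signal invisible to the independent-applicant models above.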

Key metric: Graph-based enrollment models predict yield within 2% per program versus 5-10% for funnel models. The accuracy gap comes from peer group effects and competitive dynamics that account for 30-40% of individual enrollment decisions.

Why relational data changes the answer

Enrollment decisions are social decisions. When a prospective student's high school friend commits to your CS program, the probability that the prospective student also commits increases 35%. When your financial aid offer is $5K below a competitor's for a specific demographic segment, yield drops 12% for that entire segment, not just the students who received competitor offers. When a campus visit day creates strong connections between admitted students, the group's yield rate lifts 20%. None of these dynamics are visible in individual applicant data.

Graph-based enrollment models capture these network effects by representing applicants, their peer groups, their aid offers relative to competitors, and their engagement patterns as a connected system. SAP's SALT benchmark shows graph models achieving 91% accuracy vs 63% for gradient-boosted trees on relational prediction tasks. RelBench shows GNNs at 76.71 vs 62.44 for tree-based models. In enrollment forecasting, this translates to predicting yield within 2% per program versus the 5-10% errors common with funnel models. For a university managing 5,000 incoming students, that accuracy difference is $2-4M in resources allocated to the right programs at the right time.

Predicting enrollment with individual applicant models is like predicting whether someone will attend a party by looking at their invitation status alone. You miss the social dynamics: their best friend is going, the competing event across town got cancelled, and three people from their study group already committed. Enrollment decisions ripple through peer networks. Graph-based models capture these ripples rather than treating each RSVP as an independent event.

How KumoRFM solves this

Graph-powered intelligence for education

Kumo connects applicants, programs, demographics, financial aid offers, and marketing touchpoints into an enrollment graph. The GNN learns yield patterns from the applicant network: how peer group decisions correlate, how aid package competitiveness affects yield by demographic segment, and which marketing sequences drive deposits. PQL predicts enrollment counts per program per semester, with enough lead time for resource planning.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Step 1: Your data

The relational tables Kumo learns from

APPLICANTS

applicant_id | program          | status   | gpa | test_score
APP001       | Computer Science | Admitted | 3.8 | 1420
APP002       | Nursing          | Admitted | 3.5 | 1280
APP003       | Business         | Admitted | 3.6 | 1350

PROGRAMS

program_id | name             | capacity | historical_yield
PROG01     | Computer Science | 400      | 38%
PROG02     | Nursing          | 200      | 52%
PROG03     | Business         | 350      | 41%

DEMOGRAPHICS

applicant_id | state      | income_tier | first_gen
APP001       | California | High        | No
APP002       | Texas      | Medium      | Yes
APP003       | New York   | High        | No

FINANCIAL_AID

applicant_id | merit_aid | need_aid | total_package
APP001       | $15,000   | $0       | $15,000
APP002       | $8,000    | $12,000  | $20,000
APP003       | $10,000   | $0       | $10,000

MARKETING

applicant_id | campus_visit | email_opens | event_attended
APP001       | Yes          | 12          | Open House
APP002       | No           | 4           | None
APP003       | Yes          | 8           | Admitted Students Day
Step 2: Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT COUNT(APPLICANTS.applicant_id, 0, 90, days)
FOR EACH PROGRAMS.program_id
WHERE APPLICANTS.status = 'Admitted'
Step 3: Prediction output

Every entity gets a score, updated continuously

PROGRAM          | ADMITTED | PREDICTED_ENROLLED | YIELD_RATE
Computer Science | 1,050    | 415                | 39.5%
Nursing          | 380      | 205                | 53.9%
Business         | 860      | 362                | 42.1%
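For reference, the YIELD_RATE column is simply PREDICTED_ENROLLED divided by ADMITTED, which you can verify from the output above:

```python
# Recompute yield rates from the prediction output shown above.
predictions = {
    "Computer Science": (1050, 415),   # (admitted, predicted_enrolled)
    "Nursing": (380, 205),
    "Business": (860, 362),
}

yield_rates = {
    name: round(100 * enrolled / admitted, 1)
    for name, (admitted, enrolled) in predictions.items()
}
```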
Step 4: Understand why

Every prediction includes feature attributions — no black boxes

Program: Computer Science -- Fall 2025 enrollment

Predicted: 415 enrolled (39.5% yield, +1.5% vs historical)

Top contributing features

Campus visit rate above average

42% visited

28% attribution

Aid competitiveness vs peer institutions

Above median

24% attribution

Applicant peer group deposit signals

Strong

20% attribution

Program ranking improvement

+5 spots

16% attribution

Marketing engagement (email + events)

High

12% attribution

Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability

Frequently asked questions

Common questions about enrollment forecasting

How accurate can enrollment forecasting AI be?

Graph-based models typically predict total enrollment within 2-3% error per program, and achieve 70-80% accuracy for individual applicant yield prediction. This compares to 5-10% program-level error for historical yield rates and 4-6% for regression-based models. The accuracy advantage comes from capturing peer group effects and competitive dynamics that individual-level models miss.

When should enrollment forecasting start each cycle?

Model predictions become useful 2-3 months before the deposit deadline, when behavioral signals (campus visits, email engagement, aid offer responses) start accumulating. The model updates continuously as new signals arrive, with accuracy improving each week. The most critical window is 4-6 weeks before the deposit deadline, when final enrollment counts for resource planning need to be locked in.

Can enrollment AI help optimize financial aid allocation?

Yes, and this is often the highest-ROI application. The model predicts yield at different aid levels per applicant segment, enabling aid offices to allocate merit and need-based aid to maximize yield within budget. Many institutions find they can achieve the same or higher yield with 5-10% less aid spend by targeting aid to the segments where it has the greatest impact on enrollment decisions.
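One way to picture aid targeting is a greedy sketch: spend each marginal $1K of a fixed merit-aid budget on the segment with the highest predicted yield lift per dollar. The segment names, lift figures, and caps below are illustrative assumptions, not outputs of any real model:

```python
import heapq

# Greedy aid allocation: fund segments in order of predicted yield lift
# per extra $1K of aid, up to each segment's cap. All figures are invented.

# (segment, yield lift per extra $1K of aid, max extra aid in $1K units)
segments = [
    ("in-state, high-need", 0.008, 5),
    ("out-of-state, mid-need", 0.004, 8),
    ("international, low-need", 0.001, 10),
]

def allocate(budget_k: int) -> dict:
    """Spend the budget where marginal yield lift is highest first."""
    heap = [(-lift, name, cap) for name, lift, cap in segments]
    heapq.heapify(heap)                      # max-lift segment pops first
    spend = {name: 0 for name, _, _ in segments}
    while budget_k > 0 and heap:
        _neg_lift, name, cap = heapq.heappop(heap)
        take = min(cap, budget_k)
        spend[name] += take
        budget_k -= take
    return spend
```

A real aid optimizer would use per-segment yield curves (lift is rarely constant in aid level) and budget constraints per aid type, but the greedy intuition is the same: move dollars to where the model says they change decisions.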

How does enrollment forecasting handle new programs with no history?

Graph-based models handle new programs better than historical approaches because they transfer knowledge from similar programs at the same institution and comparable programs at peer institutions. A new data science program inherits yield patterns from existing CS and statistics programs, adjusted for the new program's positioning. Accuracy is lower than mature programs initially (4-5% error vs 2-3%) but reaches full accuracy within 2-3 recruitment cycles.

Does enrollment forecasting AI work for graduate programs?

Yes, though the dynamics differ. Graduate enrollment is less driven by peer effects and more by research fit, funding packages, and faculty reputation. Graph-based models capture faculty-applicant affinity, research area matching, and funding competitiveness signals. For MBA and professional programs, the dynamics are closer to undergraduate (peer effects, brand competition, aid sensitivity).

Bottom line: A university with 5,000 incoming students saves $2-4M by predicting enrollment within 2% error per program. Kumo's enrollment graph captures peer group effects, aid competitiveness, and marketing attribution that funnel models miss.

Topics covered

enrollment forecasting AI, admissions yield prediction, student enrollment model, higher education forecasting, enrollment management ML, KumoRFM enrollment, yield rate prediction, admissions funnel model

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
