What data is needed for booking prediction?

Kumo connects directly to your existing relational tables: USERS, SEARCHES, VIEWS, BOOKINGS, PROPERTIES. No ETL or feature engineering required. Write a PQL query and get explainable predictions in minutes.

Use Case #2 · Binary Classification · Booking Prediction

Booking Prediction

“Will this browsing session result in a booking?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

Will this browsing session result in a booking?

Online travel platforms convert only 2-4% of sessions into bookings. The 96-98% that don't convert represent a massive opportunity: even moving conversion from 3% to 3.5% is a 17% relative increase in bookings and revenue. Traditional models use session-level features but miss the user journey graph: how search patterns evolve across sessions, how price sensitivity varies by trip context, and how property-user affinity signals predict intent. For an OTA with $5B in gross bookings, a 0.5 percentage point conversion lift generates $83M in incremental revenue, a figure consistent with the platform keeping about 10% of gross bookings as revenue.

Quick answer

Booking prediction AI identifies which browsing sessions have high conversion intent by analyzing the full user journey graph: search refinement patterns, engagement depth with property pages, user-property affinity, and loyalty history. Traditional conversion models use session-level features and miss the cross-session journey signals that predict intent. Graph-based models detect high-intent sessions in real time, enabling targeted interventions (urgency messaging, personalized offers) that lift conversion 0.5+ percentage points, generating $83M for an OTA with $5B in gross bookings.

Approaches compared

4 ways to solve this problem

1. Session-Level Heuristics

Flag sessions based on simple rules: viewed 3+ properties, spent 5+ minutes, reached the checkout page. Basic intent scoring available in most analytics platforms.

Best for

Quick-win conversion optimization when you have no ML infrastructure and need immediate signals.

Watch out for

Captures only the most obvious high-intent signals. Many high-intent users browse efficiently (few page views but high engagement per page), while low-intent users browse extensively (many views, low engagement each). Rules cannot distinguish these patterns.
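The rules above can be sketched in a few lines of Python. This is a minimal illustration, not any analytics platform's actual scoring: the `SessionStats` fields are hypothetical names, and treating the three rules as "any one fires" is an assumption, since the text does not say how they combine.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Per-session counters; field names are hypothetical."""
    properties_viewed: int
    minutes_on_site: float
    reached_checkout: bool


def is_high_intent(s: SessionStats) -> bool:
    # Assumption: the session is flagged if ANY of the three rules fires.
    return (
        s.properties_viewed >= 3
        or s.minutes_on_site >= 5
        or s.reached_checkout
    )


# An efficient browser: one property, short visit, not at checkout yet.
efficient = SessionStats(properties_viewed=1, minutes_on_site=3.0, reached_checkout=False)
# An extensive browser: many properties, long visit, never near checkout.
window_shopper = SessionStats(properties_viewed=12, minutes_on_site=25.0, reached_checkout=False)

print(is_high_intent(efficient))       # False
print(is_high_intent(window_shopper))  # True
```

The two example sessions show the failure mode described above: the rules have no notion of engagement per page, so the efficient browser is missed and the window shopper is flagged.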

2. Logistic Regression on Session Features

Build a conversion model using session features: page views, time on site, search filters used, device type. The standard approach for conversion optimization.

Best for

Platforms with clean session tracking and well-defined conversion funnels.

Watch out for

Session-level features miss the user journey context. A returning Gold loyalty member browsing Miami hotels for the 3rd time in a week has very different intent than a first-time visitor doing the same search. Also cannot capture user-property affinity: this user always books 4-star beach resorts at $250-350.
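To make the limitation concrete, here is a minimal sketch of how a logistic regression scores a session. The coefficients and feature names are hypothetical values chosen for illustration; a real model would learn them from labeled sessions.

```python
import math

# Hypothetical coefficients for illustration only; a real model would
# learn these from labeled sessions.
WEIGHTS = {
    "page_views": 0.08,
    "minutes_on_site": 0.05,
    "filters_used": 0.30,
    "is_mobile": -0.40,
}
INTERCEPT = -4.0


def booking_probability(features: dict) -> float:
    """Logistic regression score: sigmoid of a weighted sum of features."""
    z = INTERCEPT + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# The same session features, produced by two very different users.
session = {"page_views": 6, "minutes_on_site": 9, "filters_used": 3, "is_mobile": 0}
gold_member_third_visit = booking_probability(session)
first_time_visitor = booking_probability(session)

# Identical session features give identical scores, whoever the user is.
print(gold_member_third_visit == first_time_visitor)  # True
```

Because the model only sees the session's feature vector, loyalty tier and prior visits can influence the score only if someone engineers them into that vector by hand.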

3. Deep Learning on Click Sequences (LSTM/Transformer)

Model the click sequence within a session as a time series and predict conversion from the evolving interaction pattern. Captures temporal dynamics better than flat features.

Best for

Platforms with rich clickstream data where the sequence of actions (search, filter, view, compare, view again) carries intent information.

Watch out for

Processes each session independently without user history context. Cannot represent that this user has been researching this trip across 5 sessions over 2 weeks, narrowing from 'beach vacation' to 'Miami, March 14-17, 4-star, ocean view.' The cross-session journey is the strongest intent signal, and sequence models within a single session miss it.

4. Graph Neural Networks (Kumo's Approach)

Connect users, searches, property views, bookings, and property attributes into a travel graph. GNNs learn booking intent from the full user journey, including cross-session patterns, user-property affinity, and loyalty context.

Best for

OTAs and hotel booking platforms with returning users, loyalty programs, and rich property catalog data where user-property affinity drives conversion.

Watch out for

Requires user identity across sessions (login or cookie matching). Anonymous first-time visitors with no history benefit less from the graph approach. Best value for platforms with 30%+ returning user traffic.

Key metric: Graph-based booking prediction identifies 2-3x more high-intent sessions than session-level models. A 0.5 percentage point conversion lift generates $83M for an OTA with $5B in gross bookings, a 17% relative revenue increase.

Why relational data changes the answer

Booking intent is built across a journey, not within a single session. User USR001 (Gold loyalty, 12 past bookings, $340 average) searching Miami hotels and spending 180 seconds on Ocean Breeze Resort viewing 8 photos is a very different signal than User USR002 (no loyalty, 0 bookings) spending 22 seconds on one property. The intent signal comes from the relational context: this user's booking history, their loyalty status, how this search compares to their typical booking pattern, and how deeply they engage with this specific property relative to their usual behavior.

Session-level models see two browsing sessions. Graph-based models see two nodes in a user-property-booking network, each with rich relational context that predicts intent with 80%+ accuracy for high-confidence segments. SAP's SALT benchmark shows graph models at 91% accuracy vs 63% for gradient-boosted trees on relational tasks. RelBench shows a similar gap: 76.71 for GNNs vs 62.44 for the baseline. In booking prediction, this translates to identifying 2-3x more high-intent sessions, enabling targeted interventions (best-match recommendations, urgency messaging, loyalty-exclusive rates) that convert sessions that would otherwise bounce. A 0.5 percentage point lift on a 3% baseline is a 17% relative improvement in revenue.

Predicting booking intent from a single session is like a car salesperson judging a customer's intent from one showroom visit. They would miss that this customer has visited 4 dealerships this week, test-drove the same model at two competitors, and already has financing pre-approved. The purchase intent is built across the full journey, not visible in any single interaction. Graph-based booking prediction sees the full journey.

How KumoRFM solves this

Graph-powered intelligence for travel and hospitality

Kumo connects users, searches, property views, bookings, and property attributes into a travel graph. The GNN learns booking intent from the full user journey: how search refinement patterns signal high intent, how price sensitivity interacts with property attributes, and which user-property pairings have the highest conversion probability. PQL predicts booking probability per session, enabling real-time personalization and targeted incentives for high-intent sessions.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

USERS

user_id	loyalty_tier	past_bookings	avg_booking_value
USR001	Gold	12	$340
USR002	None	0	N/A
USR003	Silver	5	$220

SEARCHES

search_id	user_id	destination	dates	guests	timestamp
SRC401	USR001	Miami	Mar 14-17	2	2025-03-01 10:00
SRC402	USR002	NYC	Apr 5-7	1	2025-03-01 11:30
SRC403	USR003	Miami	Mar 14-16	2	2025-03-01 14:00

VIEWS

view_id	search_id	property_id	time_on_page_s	photos_viewed
VW601	SRC401	HTL001	180	8
VW602	SRC401	HTL002	45	2
VW603	SRC402	HTL003	22	1

BOOKINGS

booking_id	user_id	property_id	total	timestamp
BK6001	USR001	HTL001	$1,020	2025-03-01 10:25

PROPERTIES

property_id	name	star_rating	avg_rate	review_score
HTL001	Ocean Breeze Resort	4-star	$295	4.6
HTL002	City Center Hotel	3-star	$185	4.2
HTL003	Manhattan Suites	4-star	$380	4.4
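As a rough illustration of how these tables form a graph, the sketch below turns each foreign key in the sample rows into an edge and walks outward from a user. It is a plain-Python toy, not Kumo's internal representation; the helper names `add_edge` and `properties_within_hops` are hypothetical.

```python
from collections import defaultdict

# Rows copied from the sample tables above (key columns only).
searches = [("SRC401", "USR001"), ("SRC402", "USR002"), ("SRC403", "USR003")]
views = [
    ("VW601", "SRC401", "HTL001"),
    ("VW602", "SRC401", "HTL002"),
    ("VW603", "SRC402", "HTL003"),
]
bookings = [("BK6001", "USR001", "HTL001")]

# Adjacency list: each foreign key becomes an undirected edge.
graph = defaultdict(set)


def add_edge(a, b):
    graph[a].add(b)
    graph[b].add(a)


for search_id, user_id in searches:
    add_edge(search_id, user_id)
for view_id, search_id, property_id in views:
    add_edge(view_id, search_id)
    add_edge(view_id, property_id)
for booking_id, user_id, property_id in bookings:
    add_edge(booking_id, user_id)
    add_edge(booking_id, property_id)


def properties_within_hops(start, max_hops):
    """Breadth-first walk that collects property nodes near `start`."""
    seen, frontier = {start}, {start}
    for _ in range(max_hops):
        frontier = {n for node in frontier for n in graph[node]} - seen
        seen |= frontier
    return sorted(n for n in seen if n.startswith("HTL"))


print(properties_within_hops("USR001", 3))  # ['HTL001', 'HTL002']
print(properties_within_hops("USR003", 3))  # []
```

USR001 reaches HTL001 through both a booking and a view, while USR003 has searched but viewed nothing, so no property is reachable. A session-level feature table does not carry this kind of connectivity.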

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(BOOKINGS.booking_id, 0, 1, hours)
FOR EACH SEARCHES.search_id
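One plausible reading of this query is "for each search, did a booking occur within the following hour?". The sketch below computes that label on the sample rows. It rests on two assumptions that the query itself does not spell out: bookings are matched to searches through `user_id`, and the window excludes the search instant but includes the one-hour mark. The helper `booked_within` is hypothetical and is not part of PQL or the Kumo SDK.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"

# Rows copied from the sample SEARCHES and BOOKINGS tables.
searches = [
    ("SRC401", "USR001", "2025-03-01 10:00"),
    ("SRC402", "USR002", "2025-03-01 11:30"),
    ("SRC403", "USR003", "2025-03-01 14:00"),
]
bookings = [("BK6001", "USR001", "2025-03-01 10:25")]


def booked_within(search, bookings, window=timedelta(hours=1)):
    """True if the searching user booked inside the window after the search."""
    _, user_id, searched_at = search
    start = datetime.strptime(searched_at, FMT)
    # Assumption: window is (start, start + window], matched on user_id.
    return any(
        b_user == user_id
        and start < datetime.strptime(b_time, FMT) <= start + window
        for _, b_user, b_time in bookings
    )


labels = {s[0]: booked_within(s, bookings) for s in searches}
print(labels)
# {'SRC401': True, 'SRC402': False, 'SRC403': False}
```

On the sample data only SRC401 gets a positive label, because USR001 booked 25 minutes after searching.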

Prediction output

Every entity gets a score, updated continuously

SEARCH_ID	USER_ID	DESTINATION	BOOKING_PROB	RECOMMENDED_ACTION
SRC401	USR001	Miami	0.82	Show best match
SRC402	USR002	NYC	0.09	Offer discount
SRC403	USR003	Miami	0.44	Show urgency

Understand why

Every prediction includes feature attributions — no black boxes

Search SRC401 -- User USR001 searching Miami hotels

Predicted: 82% booking probability

Top contributing features

Detailed property review (180s + 8 photos)

High engagement

30% attribution

Loyalty tier and booking history

Gold, 12 past bookings

24% attribution

Date proximity (13 days out = committed)

Mar 14-17

19% attribution

Price alignment with avg booking value

$295 vs $340 avg

16% attribution

Search refinement pattern (narrowing)

2 destination searches

11% attribution

Feature attributions are computed automatically for every prediction. No separate tooling required.

PQL Documentation

Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.

Python SDK

Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.

Explainability Docs

Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.

Frequently asked questions

Common questions about booking prediction

What is a good conversion rate for travel booking platforms?

Industry averages are 2-4% for OTAs, 3-6% for hotel direct booking sites, and 1-2% for metasearch. These rates have been remarkably stable for a decade despite massive investment in UX optimization, because the bottleneck is not the funnel but intent detection. Most platforms treat all sessions equally rather than concentrating conversion efforts on the 15-20% of sessions with genuine booking intent. Graph-based models identify these high-intent sessions and enable differentiated treatment.

How does booking prediction improve revenue without offering discounts?

The primary levers are: showing the best-matching property first (reducing search friction for high-intent users), displaying urgency signals ('only 2 rooms left at this rate') for users near the booking threshold, and reducing friction (pre-filling forms, offering one-click booking for loyalty members). These interventions do not reduce price. They reduce the effort and uncertainty that prevent high-intent users from completing their booking. Discounts are a last resort for moderate-intent users where a small price incentive tips the balance.

Can booking prediction work for first-time visitors with no history?

Partially. For anonymous first-time visitors, the model relies on within-session signals: search specificity (exact dates vs. flexible), engagement depth (time on page, photos viewed), and property-level conversion patterns (this hotel converts 12% of viewers vs. 3% for that hotel). Accuracy for first-time visitors is lower (55-65% vs 80%+ for returning users), but still valuable for the 70%+ of sessions that are anonymous. The gap narrows if you can match the user to a known device or email.

How does booking prediction integrate with real-time personalization?

The model outputs a booking probability score per session that updates in real time as the user interacts. This score drives personalization rules: sessions above 70% probability see best-match recommendations and streamlined checkout. Sessions at 30-50% see social proof ('42 people booked this hotel today') and urgency signals. Sessions below 20% see broader discovery experiences to build intent. The score is consumed by the personalization engine via API with sub-100ms latency.
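The tiering described here could look like the following sketch. The function and tier names are hypothetical, and the text leaves the 20-30% and 50-70% bands unspecified, so the fallback for those bands is an assumption.

```python
def treatment_for(booking_prob: float) -> str:
    """Map a session's booking probability to a treatment tier."""
    if booking_prob > 0.70:
        return "best_match_and_streamlined_checkout"
    if 0.30 <= booking_prob <= 0.50:
        return "social_proof_and_urgency"
    if booking_prob < 0.20:
        return "broader_discovery"
    # Assumption: the text does not define 20-30% or 50-70%,
    # so those sessions fall through to a default experience.
    return "default_experience"


for prob in (0.85, 0.40, 0.10, 0.60):
    print(prob, treatment_for(prob))
```

In practice the personalization engine would call something like this each time the score updates during the session.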

What is the ROI of improving travel conversion rates?

For an OTA with $5B in gross bookings at 3% conversion, each 0.1 percentage point improvement generates $16.7M in incremental revenue. Graph-based models typically deliver 0.3-0.7 percentage point improvement, or $50-117M. Implementation costs are $1-3M including data integration and model development. The ROI is 20-100x, making conversion optimization one of the highest-return AI investments in travel. The revenue scales linearly with platform size.
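The revenue figures above can be reproduced with a short calculation. The 10% take rate is an assumption: it is not stated in the text but is the value implied by the $16.7M figure. The sketch also assumes that bookings scale linearly with conversion at constant traffic and booking value.

```python
GROSS_BOOKINGS = 5_000_000_000   # $5B, from the text
BASELINE_CONVERSION = 0.03       # 3%, from the text
TAKE_RATE = 0.10                 # assumption: implied by the figures, not stated


def incremental_revenue(lift_pp: float) -> float:
    """Revenue gained from a conversion lift given in percentage points."""
    relative_lift = (lift_pp / 100) / BASELINE_CONVERSION
    return GROSS_BOOKINGS * relative_lift * TAKE_RATE


for lift in (0.1, 0.3, 0.5, 0.7):
    print(f"{lift} pp -> ${incremental_revenue(lift) / 1e6:.1f}M")
# 0.1 pp -> $16.7M
# 0.3 pp -> $50.0M
# 0.5 pp -> $83.3M
# 0.7 pp -> $116.7M
```

Dividing the $50-117M range by the $1-3M implementation cost gives roughly 17x to 117x, in line with the rounded 20-100x quoted above.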

Bottom line: An OTA with $5B in gross bookings generates $83M in incremental revenue by improving conversion 0.5 percentage points. Kumo's travel graph detects high-intent sessions from engagement depth, search refinement patterns, and user-property affinity signals.

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
