What data is needed for conversion attribution?

Kumo connects directly to your existing relational tables: USERS, TOUCHPOINTS, CONVERSIONS, CAMPAIGNS, CHANNELS. No ETL or feature engineering required. Write a PQL query and get explainable predictions in minutes.

Use Case #2 · Multi-label Classification · Attribution

Conversion Attribution

“Which touchpoints drove this conversion?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example

Which touchpoints drove this conversion?

Last-click attribution overvalues bottom-funnel channels and undervalues awareness campaigns, leading to systematic misallocation of ad spend. A brand spending $50M per year on digital ads misallocates 30-40% of budget when relying on last-click, representing $15-20M in wasted or suboptimal spend annually.

Quick answer

Graph-based attribution models the full user journey across channels by connecting touchpoints, conversions, campaigns, and user segments in a relational graph. Instead of over-crediting the last click, the GNN learns how different touchpoint sequences contribute to conversion, assigning credit proportional to each channel's true causal impact.

Approaches compared

4 ways to solve this problem

1. Last-click / first-click attribution

Assign 100% of conversion credit to the final (or first) touchpoint in the user journey. The default in most analytics platforms.

Best for

Simplicity. Works when you need a quick directional read and your conversion paths are short (1-2 touchpoints).

Watch out for

Systematically overvalues bottom-funnel channels and undervalues awareness. Brands that optimize on last-click consistently overspend on search and underspend on display and email.
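A minimal sketch of the rule, using the three touchpoints for user U201 from the sample TOUCHPOINTS table on this page. The function name single_touch_credit is hypothetical and not part of any analytics platform.

```python
from datetime import datetime

# Touchpoints for user U201, copied from the sample TOUCHPOINTS table.
touchpoints = [
    {"touchpoint_id": "TP401", "channel": "Display", "timestamp": datetime(2025, 2, 15, 9, 0)},
    {"touchpoint_id": "TP402", "channel": "Email", "timestamp": datetime(2025, 2, 18, 14, 30)},
    {"touchpoint_id": "TP403", "channel": "Paid Search", "timestamp": datetime(2025, 2, 20, 11, 0)},
]


def single_touch_credit(journey, rule="last"):
    """Give 100% of the credit to the first or last touchpoint by timestamp."""
    if rule not in ("first", "last"):
        raise ValueError("rule must be 'first' or 'last'")
    ordered = sorted(journey, key=lambda tp: tp["timestamp"])
    winner = ordered[0] if rule == "first" else ordered[-1]
    # Every other touchpoint receives zero credit, however it shaped the journey.
    return {tp["channel"]: (1.0 if tp is winner else 0.0) for tp in ordered}


last_click = single_touch_credit(touchpoints, "last")
first_click = single_touch_credit(touchpoints, "first")
```

Under last-click, Paid Search takes all the credit and the Display and Email touches that preceded it get none, which is the bias described above.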

2. Shapley value / data-driven attribution

Use game-theoretic methods to distribute credit across touchpoints based on their marginal contribution. Google Analytics 4 uses a version of this.

Best for

Better than rule-based models. Averages each channel's marginal contribution over every order in which the channels could be added to the mix.

Watch out for

Treats a journey as an unordered set of touchpoints, so the sequence is ignored. In reality, seeing a display ad before an email changes the email's conversion impact. Also computationally expensive for long journeys, because the number of channel subsets to evaluate grows exponentially.
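The sketch below illustrates the Shapley calculation on three channels. The conversion rates in coalition_value are hypothetical numbers chosen for illustration, and the helper shapley_values is not taken from any attribution product. Note that the value table is keyed by the set of channels seen, so a Display-then-Email journey and an Email-then-Display journey look identical to it.

```python
from itertools import permutations

channels = ["Display", "Email", "Paid Search"]

# Hypothetical conversion rate for users exposed to each SET of channels.
# Keys are frozensets, so the order of exposure cannot be represented.
coalition_value = {
    frozenset(): 0.00,
    frozenset({"Display"}): 0.01,
    frozenset({"Email"}): 0.02,
    frozenset({"Paid Search"}): 0.03,
    frozenset({"Display", "Email"}): 0.05,
    frozenset({"Display", "Paid Search"}): 0.05,
    frozenset({"Email", "Paid Search"}): 0.06,
    frozenset({"Display", "Email", "Paid Search"}): 0.10,
}


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for ordering in orderings:
        seen = frozenset()
        for p in ordering:
            with_p = seen | {p}
            totals[p] += value[with_p] - value[seen]
            seen = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


credit = shapley_values(channels, coalition_value)
```

The credits always sum to the value of the full channel set (0.10 here). Enumerating every ordering is also why the exact calculation becomes expensive as the number of channels grows.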

3. Marketing mix modeling (MMM)

Regress aggregate sales on channel-level spend over time. Top-down approach that works with aggregated data, no user-level tracking needed.

Best for

Privacy-compliant environments where user-level tracking is limited. Good for budget allocation across broad channels.

Watch out for

Cannot attribute at the user or conversion level. Misses within-channel dynamics and individual journey patterns. Too coarse for tactical optimization.
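As a rough illustration of the regression idea, the sketch below fits ordinary least squares to hypothetical weekly aggregates. The figures are invented so that sales equal 50 + 2 x display spend + 3 x search spend, and the helper names are assumptions. Production MMM typically adds adstock and saturation transforms, which are omitted here.

```python
# Hypothetical weekly aggregates in $K: (display_spend, search_spend, sales).
weekly = [
    (10.0, 20.0, 130.0),
    (12.0, 18.0, 128.0),
    (8.0, 25.0, 141.0),
    (15.0, 22.0, 146.0),
    (11.0, 30.0, 162.0),
    (9.0, 19.0, 125.0),
]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_mmm(rows):
    """Least-squares fit of sales = baseline + b1*display + b2*search."""
    design = [[1.0, display, search] for display, search, _ in rows]
    sales = [s for _, _, s in rows]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * s for r, s in zip(design, sales)) for i in range(3)]
    return solve_linear_system(xtx, xty)


baseline, display_coef, search_coef = fit_mmm(weekly)
```

The fit returns one coefficient per channel and nothing about individual users or conversions, which is the coarseness noted above.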

4. KumoRFM (relational graph ML)

Connect users, touchpoints, conversions, campaigns, and channels into a single graph. The GNN learns how touchpoint sequences and channel interactions drive conversion at the individual level.

Best for

Highest accuracy for multi-touch attribution. Captures interaction effects between channels (display priming email, email nurturing search) that linear and Shapley models miss.

Watch out for

Requires user-level journey data with timestamps and clear conversion events. Not suited for environments with only aggregate channel data.

Key metric: Brands using graph-based multi-touch attribution recover the 30-40% of ad spend that last-click models misallocate.

Why relational data changes the answer

Attribution is a path problem, not a point problem. The value of an email touchpoint depends on whether a display ad preceded it, which publisher served that display ad, what campaign objective drove the email, and how many days passed between them. Flat attribution models treat each touchpoint independently or assume fixed position-based weights, collapsing this rich sequential structure into a single credit-assignment rule.
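To make the sequential structure concrete, the sketch below orders the U201 touchpoints from the sample data on this page and derives the channel-to-channel transitions and the days between them. The function path_features is a hypothetical helper, not a Kumo API.

```python
from datetime import datetime

# U201's touchpoints from the sample TOUCHPOINTS table, deliberately unsorted.
journey = [
    ("Paid Search", datetime(2025, 2, 20, 11, 0)),
    ("Display", datetime(2025, 2, 15, 9, 0)),
    ("Email", datetime(2025, 2, 18, 14, 30)),
]


def path_features(touches):
    """Return (previous_channel, channel, gap_in_days) for each step of the path."""
    ordered = sorted(touches, key=lambda t: t[1])
    steps = []
    for (prev_channel, prev_ts), (channel, ts) in zip(ordered, ordered[1:]):
        gap_days = (ts - prev_ts).total_seconds() / 86400
        steps.append((prev_channel, channel, round(gap_days, 2)))
    return steps


steps = path_features(journey)
# steps == [("Display", "Email", 3.23), ("Email", "Paid Search", 1.85)]
```

A rule that scores each touchpoint independently never sees these pairs or gaps, so it cannot tell a Display-then-Email journey from an Email-then-Display one.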

Relational models preserve the full journey graph. They learn that for Enterprise users acquired through organic search, a display-then-email-then-search sequence converts at 3x the rate of search alone, and that the display ad on TechNews contributes more than the same creative on a generic news site. On the RelBench benchmark, relational approaches score 76.71 vs 62.44 for single-table baselines -- a gap that in attribution terms means the difference between correctly reallocating $15M and continuing to waste it on over-credited channels.

Last-click attribution is like crediting a basket entirely to whoever made the final pass before the layup. You miss the point guard who ran the play, the screen-setter who created the opening, and the ball handler who drove the defense to collapse. Graph-based attribution watches the entire play from start to finish and credits every player for the role they actually played in the score.

How KumoRFM solves this

Graph-powered intelligence for advertising

Kumo connects users, touchpoints, conversions, campaigns, and channels into a unified graph. The GNN learns how different touchpoint sequences contribute to conversion, capturing interaction effects between channels that linear models and even Shapley-based approaches miss. Each conversion gets a per-touchpoint attribution score grounded in the full user journey.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

USERS

user_id	segment	acquisition_source	ltv_tier
U201	Enterprise	Organic	High
U202	SMB	Paid Search	Medium
U203	Enterprise	Referral	High

TOUCHPOINTS

touchpoint_id	user_id	channel	campaign_id	timestamp
TP401	U201	Display	CMP10	2025-02-15 09:00
TP402	U201	Email	CMP11	2025-02-18 14:30
TP403	U201	Paid Search	CMP12	2025-02-20 11:00

CONVERSIONS

conversion_id	user_id	value	timestamp
CVR101	U201	$12,500	2025-02-20 11:15
CVR102	U203	$8,200	2025-02-22 16:00

CAMPAIGNS

campaign_id	channel	spend	objective
CMP10	Display	$120K	Awareness
CMP11	Email	$15K	Nurture
CMP12	Paid Search	$80K	Conversion

CHANNELS

channel	avg_cpa	avg_roas	attribution_window
Display	$45	3.2x	30 days
Email	$12	8.1x	7 days
Paid Search	$28	5.5x	14 days
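The five tables link to each other through user_id, campaign_id, and channel. The sketch below shows one way to picture those links as a graph, using the sample rows above. It is an illustrative adjacency structure, not Kumo's internal representation, and the helpers link and neighbors_within are hypothetical.

```python
from collections import defaultdict

# Key columns from the sample tables above.
touchpoints = [
    ("TP401", "U201", "CMP10"),
    ("TP402", "U201", "CMP11"),
    ("TP403", "U201", "CMP12"),
]
conversions = [("CVR101", "U201"), ("CVR102", "U203")]
campaigns = [("CMP10", "Display"), ("CMP11", "Email"), ("CMP12", "Paid Search")]

graph = defaultdict(set)  # node -> neighboring nodes; a node is (table, key)


def link(a, b):
    """Add an undirected edge for one foreign-key reference."""
    graph[a].add(b)
    graph[b].add(a)


for tp_id, user_id, campaign_id in touchpoints:
    link(("TOUCHPOINTS", tp_id), ("USERS", user_id))
    link(("TOUCHPOINTS", tp_id), ("CAMPAIGNS", campaign_id))
for cvr_id, user_id in conversions:
    link(("CONVERSIONS", cvr_id), ("USERS", user_id))
for campaign_id, channel in campaigns:
    link(("CAMPAIGNS", campaign_id), ("CHANNELS", channel))


def neighbors_within(start, hops):
    """Collect every node reachable from start in at most `hops` edges."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        frontier = {n for node in frontier for n in graph[node]} - seen
        seen |= frontier
    return seen - {start}


two_hop = neighbors_within(("CONVERSIONS", "CVR101"), 2)
```

Two hops from conversion CVR101 reach user U201 and all three of that user's touchpoints. Two more hops reach the campaigns and channels behind them.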

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT LIST_DISTINCT(TOUCHPOINTS.channel, -30, 0, days)
FOR EACH CONVERSIONS.conversion_id
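One way to read the query's target, sketched in plain Python rather than through Kumo's engine: for each conversion, collect the distinct channels the same user touched in the 30 days before it. The join through user_id and the inclusive window bounds are assumptions made for this illustration, and channels_in_lookback is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Sample rows from the TOUCHPOINTS and CONVERSIONS tables above.
touchpoints = [
    {"user_id": "U201", "channel": "Display", "timestamp": datetime(2025, 2, 15, 9, 0)},
    {"user_id": "U201", "channel": "Email", "timestamp": datetime(2025, 2, 18, 14, 30)},
    {"user_id": "U201", "channel": "Paid Search", "timestamp": datetime(2025, 2, 20, 11, 0)},
]
conversions = [
    {"conversion_id": "CVR101", "user_id": "U201", "timestamp": datetime(2025, 2, 20, 11, 15)},
    {"conversion_id": "CVR102", "user_id": "U203", "timestamp": datetime(2025, 2, 22, 16, 0)},
]


def channels_in_lookback(conversion, touches, days=30):
    """Distinct channels the converting user touched in the lookback window."""
    end = conversion["timestamp"]
    start = end - timedelta(days=days)
    return sorted({
        tp["channel"]
        for tp in touches
        if tp["user_id"] == conversion["user_id"] and start <= tp["timestamp"] <= end
    })


labels = {c["conversion_id"]: channels_in_lookback(c, touchpoints) for c in conversions}
```

CVR102 comes back empty only because the sample TOUCHPOINTS rows shown on this page contain no entries for U203.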

Prediction output

Every entity gets a score, updated continuously

CONVERSION_ID	USER_ID	VALUE	DISPLAY_ATTR	EMAIL_ATTR	SEARCH_ATTR
CVR101	U201	$12,500	0.28	0.35	0.37
CVR102	U203	$8,200	0.15	0.52	0.33
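The per-conversion scores can be turned into credited revenue per channel by multiplying each share by the conversion value. The sketch below does this for the two rows above. It assumes SEARCH_ATTR refers to the Paid Search channel, and credited_revenue is a hypothetical helper.

```python
from decimal import Decimal

# The two prediction rows above, with dollar values as exact decimals.
predictions = [
    {
        "conversion_id": "CVR101",
        "value": Decimal("12500"),
        "shares": {"Display": Decimal("0.28"), "Email": Decimal("0.35"), "Paid Search": Decimal("0.37")},
    },
    {
        "conversion_id": "CVR102",
        "value": Decimal("8200"),
        "shares": {"Display": Decimal("0.15"), "Email": Decimal("0.52"), "Paid Search": Decimal("0.33")},
    },
]


def credited_revenue(rows):
    """Sum value * share per channel, checking that shares total 100%."""
    totals = {}
    for row in rows:
        if sum(row["shares"].values()) != Decimal("1"):
            raise ValueError(f"shares for {row['conversion_id']} do not sum to 1")
        for channel, share in row["shares"].items():
            totals[channel] = totals.get(channel, Decimal("0")) + row["value"] * share
    return totals


revenue_by_channel = credited_revenue(predictions)
# Display: $4,730, Email: $8,639, Paid Search: $7,331 (total $20,700)
```

Dividing these totals by campaign spend gives a multi-touch view of return per channel that can be compared with the last-click figures.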

Understand why

Every prediction includes feature attributions — no black boxes

Conversion CVR101 -- User U201

Predicted: Multi-touch: Display 28%, Email 35%, Search 37%

Top contributing features

Feature	Value	Attribution
Email open-to-conversion time	2 days	35%
Search keyword intent score	High	25%
Display ad first-touch awareness	5 days prior	20%
Cross-channel journey length	3 touchpoints	12%
Similar user conversion paths	68% match	8%

Feature attributions are computed automatically for every prediction. No separate tooling required.


Frequently asked questions

Common questions about conversion attribution

What is the best multi-touch attribution model?

Graph-based attribution models outperform both rule-based (last-click, linear) and statistical (Shapley, Markov chain) approaches because they capture interaction effects between channels and temporal sequencing. The key advantage is learning that a display-then-email sequence has a different conversion impact than email-then-display, something position-based and Shapley models cannot express.

How do you replace last-click attribution?

Start by collecting user-level journey data with timestamps across all channels. Then move to a model that respects the sequential, multi-entity nature of the data. Graph-based models are the most accurate option because they connect users, touchpoints, campaigns, and channels into a single structure. Expect to recover the 30-40% of ad spend that last-click misallocates.

What data do you need for multi-touch attribution?

User-level touchpoint data with channel, campaign, and timestamp for each interaction, plus clear conversion events with value. For best results, add channel metadata (average CPA, ROAS), campaign objectives, and user segments. More connected tables means more signal for the model to learn causal contribution patterns.

How does attribution work without third-party cookies?

First-party data becomes critical. Graph models excel here because they extract more signal from the data you do have -- connecting CRM records, email engagement, on-site behavior, and conversion events through natural foreign keys. You lose some cross-site tracking, but the relational structure of your first-party data compensates significantly.

What is the ROI of better attribution modeling?

Brands spending $50M+ on digital ads typically misallocate 30-40% of budget under last-click. Switching to graph-based attribution recovers $15-20M in wasted spend by shifting budget from over-credited bottom-funnel channels to under-credited awareness and nurture campaigns that actually drive incremental conversions.

Bottom line: A brand spending $50M on digital ads recovers $15-20M in misallocated budget by replacing last-click with Kumo's graph-based multi-touch attribution. Every channel gets credit proportional to its true causal contribution.


Topics covered

conversion attribution AI, multi-touch attribution model, ad attribution ML, marketing mix modeling, graph-based attribution, KumoRFM attribution, cross-channel attribution, data-driven attribution

From a leadership team with proven experience

Vanja Josifovski

CEO and Co-Founder, ex-CTO Airbnb, ex-CTO Pinterest

Jure Leskovec

Co-Founder & Chief Scientist, Stanford Professor

Hema Raghavan

Co-Founder & Head of Engineering, ex-AI Lead, LinkedIn

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
