Your sales team has 500 leads in the pipeline. They can call 50 this week. Which 50?
Most companies answer this question with a point system invented in 2008: +10 for VP title, +5 for visiting the pricing page, +3 for opening an email. The system was designed for a world where the best signal was "they downloaded a whitepaper." In 2026, your data knows far more than that. The question is whether your scoring model can read it.
The cost of getting this wrong is not abstract. Sales reps spend roughly 60% of their time on leads that never convert. That is not a productivity problem. That is a structural failure in how you allocate your most expensive resource: human attention. A rep spending Tuesday afternoon calling a lead who was never going to buy is a rep who did not call the lead who was ready to sign.
This guide covers every approach to lead scoring worth knowing, from the manual spreadsheet to graph-based ML, with honest trade-offs on each. No "it depends" without telling you what it depends on.
Why lead scoring matters (and the math that proves it)
Here is the math that should keep every sales leader up at night. If your overall lead-to-close rate is 5%, and your sales team works leads in roughly random order (which most do, despite having a scoring system), then every rep call has a 5% chance of hitting a buyer. That is 19 wasted conversations for every deal.
Now imagine your scoring model can identify the top 20% of leads that contain 80% of the conversions. Your reps only call that top bucket. The conversion rate in that group is not 5%. It is 20-25%. Same leads, same product, same reps. The only difference is the order in which they make calls.
This is not hypothetical. Forrester found that companies with mature lead scoring generate 50% more sales-ready leads at 33% lower cost per lead. The gap between good scoring and no scoring is often larger than the gap between a good product and a mediocre one. You can have the best product in the market and still lose if your sales team is spending their finite hours on the wrong people.
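The arithmetic is easy to sanity-check. Here is a quick sketch using the hypothetical numbers above (500 leads, a 5% base rate, and a model that concentrates 80% of conversions in the top 20% of leads):

```python
# Back-of-envelope check of the prioritization math above.
# All numbers are the hypothetical figures from the text.
leads = 500
base_rate = 0.05                            # 1 conversion per 20 calls
expected_buyers = leads * base_rate         # 25 buyers hiding in the pool

# Random ordering: every call is a 5% shot -> 19 misses per deal.
misses_per_deal = round(1 / base_rate) - 1  # 19

# Prioritized ordering: top 20% of leads hold 80% of the buyers.
top_bucket = int(leads * 0.20)              # 100 highest-scored leads
top_buyers = expected_buyers * 0.80         # 20 of the 25 buyers
top_rate = top_buyers / top_bucket          # 0.20 -> a 20% hit rate

print(misses_per_deal, top_rate)            # 19 0.2
```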
Types of lead scoring
| Scoring type | What it captures | Example signals | Typical lift |
|---|---|---|---|
| Demographic | Who the lead is (firmographic and role fit) | Job title, company size, industry, geography | 1.5-2x over random |
| Behavioral | What the lead does (engagement signals) | Pages visited, emails opened, content downloaded, webinars attended | 2-3x over random |
| Predictive | ML-derived probability of conversion | Weighted combination of all available features, trained on historical conversions | 3-5x over random |
| Relational | How the lead connects to other converting entities | Colleague conversions, account-level buying patterns, content progression sequences | 5-8x over random |
Each scoring type builds on the previous. Demographic scoring is the floor. Relational scoring is the ceiling. Most companies are stuck at behavioral.
The problem is not that companies do not score leads. Most do. The problem is that they score them badly and do not realize it because they never measure lift against actual conversions. A scoring system that gives your sales team the same conversion rate as random ordering is not a scoring system. It is a random number generator with a nice UI.
The 5 approaches to lead scoring (an honest comparison)
Every lead scoring approach makes a trade-off between simplicity and signal. Here is the honest rundown, including the things vendors will not tell you.
Lead scoring approaches compared
| Approach | The honest take | Data required | Who builds it | Ceiling |
|---|---|---|---|---|
| Manual point systems | The spreadsheet your marketing team built in 2019. Still used. Still wrong. Points are assigned by gut feeling, never validated against outcomes, and never decayed. | CRM fields + opinions | Marketing ops | Low. The weights are guesses. Nobody validates them against actual close rates. |
| Rules-based (CRM native) | Better than manual, but rules cannot find patterns they were not programmed to look for. Every rule is a hypothesis. Most go untested. | CRM engagement data | RevOps / marketing ops | Moderate. Captures known signals well. Misses everything you have not explicitly coded. |
| Predictive ML (on CRM data) | Real ML on real data. Learns weights from outcomes instead of guessing. The standard for sophisticated teams. | CRM + historical conversion labels | Data science team or AutoML tool | Moderate-High. Good at finding signal in CRM data. Blind to everything outside CRM. |
| Intent-based (6sense, Bombora, ZoomInfo) | Knows what companies are researching across the web. Powerful for identifying in-market accounts. Does not know what they are doing inside your product. | Third-party intent signals + account matching | RevOps + vendor | Moderate-High. Strong for top-of-funnel account identification. Weak for individual lead prioritization. |
| Relational ML | Reads CRM + product usage + support + billing + content interactions as connected data. Finds the colleague signal, content progression, and cross-table patterns that flat models miss. | Multiple connected data sources | Relational ML platform (e.g., Kumo.ai) | Highest. Sees signals that literally cannot exist in a flat table. |
Highlighted: Relational ML is the only approach that reads connected multi-table data natively. But if your data is a single CRM export, predictive ML on that flat table is the pragmatic choice.
Notice the progression. Each approach adds a new category of signal that the previous one could not access. Manual systems guess weights. Rules-based systems hardcode known patterns. Predictive ML learns weights from data. Intent providers add external signals. Relational ML connects all of it.
The uncomfortable truth: most companies are still on approach 1 or 2. They have a point system that nobody has validated in years, running inside a CRM that their sales team has learned to ignore. The score exists. It just does not influence behavior because everyone knows it does not work.
The signals that actually predict conversion
Not all signals are created equal. Some are table stakes that every model includes. Others are genuinely predictive but hidden in tables that most scoring systems never read. Here is the honest breakdown.
Lead scoring signals
| Signal type | Where it lives | Predictive power | Can flat models see it? |
|---|---|---|---|
| Firmographics (company size, industry, geography) | CRM | Moderate. Filters the pool but does not rank within it. | Yes |
| Behavioral (pages viewed, emails opened, forms submitted) | Marketing automation / CRM | Moderate-High. Volume of activity correlates with interest. | Yes (as counts) |
| Content progression (blog > case study > API docs > demo request) | Marketing automation + product analytics | High. The sequence reveals intent. Someone reading API docs after case studies is buying, not browsing. | NO. Flattened to page_count, the sequence is destroyed. |
| Colleague signal (others at same company purchased) | CRM + billing (cross-referenced) | Very High. 3-5x baseline conversion when a colleague has already bought. | NO. Each lead is an independent row. The connection does not exist. |
| Product usage patterns (feature activation, session depth, invites) | Product analytics | High. The strongest signal for PLG companies. A free user who invited 3 teammates is ready to buy. | Only if someone built the integration and flattened the data into the CRM. |
| Support inquiry topics (pre-sale questions, integration questions) | Support / helpdesk system | High. A lead asking about SSO and compliance is further along than a lead asking what the product does. | Only if someone built the integration. Almost nobody does. |
The highest-predictive signals (content progression, colleague signal) are invisible to flat-table models. They require either manual engineering or a model that reads relational data natively.
Scoring leads from CRM data alone is like judging a restaurant by its menu. You can make reasonable guesses. The prices suggest quality. The cuisine type narrows the field. But you are missing everything that matters: what the regulars order, how long people stay, whether the chef just won an award, and whether the restaurant next door (your competitor) just closed.
Scoring from relational data is like reading every review, seeing the reservation patterns, knowing that three food critics visited last week, and noticing that the same customers who loved Restaurant A also love Restaurant B. The signal is in the connections, not the attributes.
8 methods to improve your lead scoring (ordered by impact)
These are ordered from quickest wins to the most transformative changes. Methods 1-7 work within existing systems. Method 8 changes the architecture entirely.
1. Move beyond firmographics (add behavioral signals)
If your scoring model is still primarily weighting job title, company size, and industry, you are sorting by ICP fit, not by purchase intent. A perfect-fit lead who has never engaged with your content is less likely to close this quarter than an imperfect-fit lead who just spent 45 minutes on your pricing page.
Add behavioral signals: pages visited (especially pricing, integration, and case study pages), email click-throughs (not just opens), content downloads, webinar attendance, and demo requests. These signals capture active interest, not passive fit.
Typical improvement: 2-3x lift over firmographic-only scoring. This is the single biggest jump most teams will see from a single change.
2. Weight recency (a page view yesterday beats a page view last month)
Most point systems treat all activity equally. A pricing page visit 6 months ago gets the same +5 points as a pricing page visit yesterday. This is obviously wrong, but fixing it requires time-decay logic that most CRM-native scoring tools do not support natively.
Apply exponential decay to all behavioral signals. A common formula: score = raw_points * 0.5 ^ (days_since_activity / half_life) where half_life is typically 14-30 days for B2B. An engagement from yesterday retains nearly full value. An engagement from 3 months ago contributes almost nothing.
Typical improvement: 15-25% improvement in precision at the top decile. Cheap to implement, surprisingly impactful.
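A minimal sketch of that decay formula. The 30-day half-life is an assumption to tune, not a fixed constant; the text's 14-30 day range is the usual B2B window:

```python
def decayed_score(raw_points: float, days_since_activity: float,
                  half_life: float = 30.0) -> float:
    """Halve an engagement's value every `half_life` days:
    score = raw_points * 0.5 ** (days_since_activity / half_life)."""
    return raw_points * 0.5 ** (days_since_activity / half_life)

# A pricing-page visit worth 5 raw points:
print(decayed_score(5, 0))              # today: full value, 5.0
print(decayed_score(5, 30))             # one half-life ago: 2.5
print(round(decayed_score(5, 90), 3))   # three half-lives ago: 0.625
```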
3. Score engagement depth, not volume
Five minutes on the pricing page is worth more than 50 blog post visits. Three clicks on the API documentation outweigh 100 email opens. Most scoring systems count events. Better systems weight by intent signal strength.
Create a signal hierarchy: demo requests and free trial signups at the top, pricing and integration page visits in the middle, blog views and email opens at the bottom. Then weight accordingly. A lead with 3 high-intent signals should score above a lead with 30 low-intent signals.
Typical improvement: 20-30% improvement in conversion rate for the top-scored bucket. The lift comes from demoting false positives (high volume, low intent) and promoting true positives (low volume, high intent).
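A toy sketch of a signal hierarchy. The tier weights below are illustrative assumptions, not calibrated values; the point is that three high-intent events outrank thirty low-intent ones:

```python
# Hypothetical intent-weighted scoring: weights chosen so that
# high-intent signals dominate raw activity volume.
SIGNAL_WEIGHTS = {
    "demo_request": 50, "trial_signup": 40,      # high intent
    "pricing_view": 15, "integration_view": 10,  # mid intent
    "blog_view": 1, "email_open": 0.5,           # low intent
}

def intent_score(events: list[str]) -> float:
    """Sum intent-weighted points; unknown events score zero."""
    return sum(SIGNAL_WEIGHTS.get(e, 0) for e in events)

lead_a = ["demo_request", "pricing_view", "integration_view"]  # 3 events
lead_b = ["email_open"] * 20 + ["blog_view"] * 10              # 30 events
print(intent_score(lead_a), intent_score(lead_b))              # 75 20.0
```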
4. Add product usage signals (for PLG companies)
If you have a free tier or free trial, your product usage data is the most predictive signal you own and it is probably not in your scoring model. A free user who created a project, invited two teammates, and connected an integration is a better lead than a VP who downloaded your whitepaper.
Key product signals: feature activation breadth (how many features used), depth (how much time in each), social actions (invites, shares, collaborations), and integration connections. These signals are stronger than any marketing engagement metric because they measure actual product value realization, not hypothetical interest.
Typical improvement: 2-4x lift for PLG companies that add product usage to their scoring. This is the most underutilized signal in B2B SaaS.
5. Track content progression (sequence matters, not just count)
The order in which a lead consumes content reveals their stage in the buying journey. Blog post then case study then pricing page then demo request is a textbook buying progression. Blog post then blog post then blog post then blog post is a reader, not a buyer.
Map your content to funnel stages (awareness, consideration, decision) and score progression through stages, not just total consumption. A lead who has touched all three stages in the last 30 days is significantly more likely to convert than a lead with 10x the page views but all in the awareness stage.
Typical improvement: 30-50% improvement in identifying leads in the "decision" stage. Hard to implement in traditional point systems. Natural for ML models that can read sequences.
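One way to sketch stage-based scoring, with a hypothetical page-to-stage mapping. A real implementation would also restrict events to a recent window (per the 30-day note above); this version scores distinct stages touched rather than pages counted:

```python
# Hypothetical mapping from content type to funnel stage.
STAGE_OF = {
    "blog": "awareness", "webinar": "awareness",
    "case_study": "consideration", "api_docs": "consideration",
    "pricing": "decision", "demo_request": "decision",
}
STAGE_POINTS = {"awareness": 1, "consideration": 3, "decision": 6}

def progression_score(pages: list[str]) -> int:
    """Score the distinct funnel stages touched, not the page count."""
    stages = {STAGE_OF[p] for p in pages if p in STAGE_OF}
    return sum(STAGE_POINTS[s] for s in stages)

lead_a = ["blog", "case_study", "api_docs", "pricing"]  # full progression
lead_b = ["blog"] * 40                                  # awareness only
print(progression_score(lead_a), progression_score(lead_b))  # 10 1
```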
6. Use conversion velocity (how fast they move through stages)
Two leads both progressed from first touch to demo request. One took 3 days. The other took 6 months. They are not the same. Fast movers typically have an active budget, an urgent problem, or both. Slow movers are often in research mode with no timeline.
Measure the time between stage transitions and use it as a scoring factor. Days from first visit to pricing page. Days from pricing page to demo request. Days from demo to proposal review. Shorter intervals signal urgency and higher close probability.
Typical improvement: 15-20% improvement in identifying deals that will close this quarter vs. next year. Sales teams love this signal because it directly maps to pipeline velocity.
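A rough sketch of a velocity factor. The 14-day scale and the example dates are assumptions; the shape of the function (shorter average gaps score closer to 1.0) is the idea from the text:

```python
from datetime import date

def velocity_factor(transitions: list[date], scale_days: float = 14.0) -> float:
    """Map the average gap between stage transitions into (0, 1]:
    0 days apart -> 1.0, `scale_days` apart -> 0.5, and so on."""
    if len(transitions) < 2:
        return 0.0
    gaps = [(b - a).days for a, b in zip(transitions, transitions[1:])]
    avg_gap = sum(gaps) / len(gaps)
    return 1.0 / (1.0 + avg_gap / scale_days)

# First touch -> pricing -> demo request, fast vs. slow:
fast = [date(2026, 1, 1), date(2026, 1, 2), date(2026, 1, 4)]  # days apart
slow = [date(2026, 1, 1), date(2026, 4, 1), date(2026, 7, 1)]  # months apart
print(round(velocity_factor(fast), 2))  # 0.9
print(round(velocity_factor(slow), 2))  # 0.13
```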
7. Add negative signals (what most scoring systems ignore)
Most scoring systems only add points. They never subtract them. This means a lead who attended a webinar (+10), then unsubscribed from emails (-???), then bounced from the free trial after 2 minutes (-???) still carries their webinar points into perpetuity.
Build in explicit negative signals: email unsubscribes, bounced trial signups, support complaints, meeting no-shows, proposal rejections, and competitor technology detected on their website. These are not just "absence of positive signal." They are active indicators of disinterest or poor fit.
Typical improvement: 10-20% reduction in false positives in the top-scored bucket. Your sales team will notice the difference immediately because they stop getting burned by leads that look active but have already disengaged.
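A minimal sketch of explicit deductions layered on top of positive points. All values are illustrative; the mechanism is simply that a disengagement event subtracts instead of being ignored:

```python
# Hypothetical positive and negative point values.
POSITIVE = {"webinar_attended": 10, "pricing_view": 15}
NEGATIVE = {
    "email_unsubscribe": -15,
    "trial_bounce": -20,          # signed up, left within minutes
    "meeting_no_show": -10,
    "marked_not_interested": -40,
}

def net_score(events: list[str]) -> int:
    """Sum both positive and negative signals for a lead."""
    weights = {**POSITIVE, **NEGATIVE}
    return sum(weights.get(e, 0) for e in events)

# Attended a webinar, then unsubscribed and bounced from the trial:
print(net_score(["webinar_attended", "email_unsubscribe", "trial_bounce"]))
# -> -25, vs. +10 for a lead who only attended the webinar.
```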
8. Connect your tables (the relational unlock)
Methods 1-7 optimize within a flat table. Method 8 replaces the flat table entirely. Instead of flattening your CRM, product analytics, support tickets, billing records, and content engagement into a single row per lead, a relational model reads all of those tables in their connected structure.
Why does this matter? Because the strongest signals in lead scoring are relational. The colleague signal (someone at the same company converted). The content progression (a specific sequence of pages, not just a count). The product usage pattern (feature A then feature B then integration C, in that order, within 7 days). These signals exist in the connections between tables. Flatten the tables and the connections disappear.
Typical improvement: 3-5x conversion lift over point-based systems. 30-50% improvement over flat-table ML models. This is not an incremental optimization. It is a different category of model reading a different category of data.
The relational advantage: why connected data changes everything
Traditional lead scoring looks at each lead in isolation. Relational scoring sees the web of connections between leads, accounts, products, and behaviors. Here is what that means concretely.
The colleague signal (3-5x conversion lift)
Sarah is a VP of Data Science at a Fortune 500 bank. She visited your pricing page once and downloaded a whitepaper. In a flat-table model, she scores moderately. There are thousands of leads with similar engagement profiles.
But in the relational graph, a critical signal appears: Sarah's colleague in the data engineering department signed a $200K annual contract with you 3 months ago. Two other people from the same company attended your last webinar. And the company's Snowflake usage (visible through a data partnership) has tripled in the last quarter.
Each signal alone is weak. Together, in the graph, they converge to a high-confidence prediction: this account is expanding, internal champions exist, and Sarah is likely the next buyer. The colleague signal alone lifts conversion probability by 3-5x over baseline. And it exists only in the connection between Sarah's record and her colleague's closed-won deal. No amount of feature engineering on Sarah's individual CRM row will find it.
Content progression as a sequence, not a count
Consider two leads. Lead A visited 4 pages: your blog, a case study about their industry, your API documentation, and your pricing page. Lead B visited 40 pages: 40 different blog posts over 6 months.
A flat-table model sees page_count = 4 vs. page_count = 40 and scores Lead B higher. A relational model sees the progression: Lead A moved from awareness (blog) to validation (case study) to technical evaluation (API docs) to commercial evaluation (pricing). That is a buying sequence. Lead B is a blog reader.
The sequence is the signal. The count is noise. But sequences live in event tables with timestamps and page attributes, connected to the lead record through a join. Flatten them and you get a count. Keep the relational structure and you get a trajectory.
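The difference can be sketched as an in-order subsequence check, which finds Lead A's trajectory where a `page_count` feature cannot. The stage mapping is illustrative:

```python
# Hypothetical buying progression and page-to-stage mapping.
BUYING_SEQUENCE = ["awareness", "consideration", "technical", "commercial"]
STAGE_OF = {"blog": "awareness", "case_study": "consideration",
            "api_docs": "technical", "pricing": "commercial"}

def follows_buying_sequence(pages: list[str]) -> bool:
    """True if the visit stream contains the buying stages in order
    (other visits may be interleaved between them)."""
    want = iter(BUYING_SEQUENCE)
    target = next(want)
    for page in pages:
        if STAGE_OF.get(page) == target:
            target = next(want, None)
            if target is None:
                return True
    return False

lead_a = ["blog", "case_study", "api_docs", "pricing"]  # 4 pages, buying
lead_b = ["blog"] * 40                                  # 40 pages, reading
print(follows_buying_sequence(lead_a), follows_buying_sequence(lead_b))
# -> True False
```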
PQL scoring with a backward window
Product-qualified leads are best identified by looking at what they did in the product, not what they did on your website. But "what they did" is not a single number. It is a sequence of actions across multiple product tables: feature usage, session logs, invite actions, integration connections, billing events.
PQL Query
PREDICT conversion_30d FOR EACH leads.lead_id WHERE leads.last_active > now() - 60d AND leads.status = 'open'
This query predicts 30-day conversion for active open leads. It reads across CRM data, product usage logs, content engagement, and account-level signals without requiring any manual feature engineering.
Output
| lead_id | conversion_probability | top_driver | recommended_action |
|---|---|---|---|
| L-7201 | 0.89 | Colleague at same company closed $150K deal last month | Executive outreach with case study from their company |
| L-7202 | 0.74 | Progressed blog > case study > API docs > pricing in 5 days | Sales call with technical demo |
| L-7203 | 0.61 | Free trial: activated 4 features, invited 2 teammates | Sales-assisted onboarding offer |
| L-7204 | 0.15 | Opened 3 emails, no clicks, no site visits in 45 days | Nurture sequence (do not call) |
The benchmark: flat scoring vs. relational approach
When teams compare flat-table lead scoring against relational approaches on the same data, the pattern is consistent. The relational model finds signal that the flat model structurally cannot represent.
Lead scoring: flat vs. relational
| Approach | Precision at top 10% | Feature engineering required | What it captures |
|---|---|---|---|
| Manual point system | 10-15% (barely above base rate) | None (but requires ongoing manual tuning) | Gut-feel weights on demographic and behavioral counts |
| ML on flat CRM table | 20-30% | Yes (extensive joins, aggregations, time windows) | Optimized weights on flattened features. Better than manual, same data. |
| Relational approach | 35-50% | No (reads raw relational tables directly) | Colleague signals, content sequences, cross-table patterns, temporal dynamics |
The relational approach achieves 35-50% precision at the top decile vs. 20-30% for flat ML and 10-15% for manual scoring. The difference is not a better algorithm. It is richer data.
Flat-table lead scoring
- One row per lead with aggregated features
- Requires manual SQL joins and feature engineering
- Cannot see colleague conversions or account-level buying patterns
- Cannot capture content progression (flattened to page_count)
- Typical lift: 2-4x over random at the top decile
Relational lead scoring
- Reads all connected tables directly as a graph
- No manual feature engineering required
- Captures colleague signal and account expansion patterns natively
- Reads content sequences as trajectories, not counts
- Typical lift: 5-8x over random at the top decile
Lead scoring tools: an honest comparison
The right tool depends on your data maturity, team size, and sales motion. A PLG startup with 10 reps needs something different from an enterprise with 500 reps and a Salesforce instance that has been customized for 8 years. Here is the honest breakdown.
Lead scoring tools compared
| Tool | Type | Price | Best for | Honest limitation |
|---|---|---|---|---|
| Salesforce Einstein | CRM-native ML | Included in Enterprise+ ($165+/user/mo) | Salesforce-heavy orgs that want scoring without a separate tool. Zero setup if your data is already in SFDC. | Only sees what is in Salesforce. If your best signals are in product analytics or support tools, Einstein is blind to them. |
| HubSpot | CRM-native scoring | Included in Professional+ ($800+/mo) | SMB and mid-market teams already on HubSpot. Manual and predictive scoring built in. | Predictive scoring is a black box. Manual scoring requires ongoing tuning that nobody does after the first month. |
| 6sense | Intent + ABM platform | Enterprise pricing ($50K+/yr typical) | Enterprise ABM motions where identifying in-market accounts is the bottleneck. Strong intent data. | Scores accounts, not individual leads. The intent signal tells you who is researching, not who will buy from you specifically. |
| ZoomInfo | Data enrichment + intent | Enterprise pricing ($15K+/yr typical) | Enriching lead records with firmographic data and tracking buyer intent signals at the account level. | Intent data is directional, not precise. 'Company X is researching data platforms' does not mean they will buy yours. |
| MadKudu | Predictive scoring for PLG | Growth pricing ($20K+/yr typical) | PLG companies that need to score free users for sales-assist. Built for the product-to-sales handoff. | Designed for PLG. If your motion is outbound enterprise sales without a free tier, most of MadKudu's value proposition does not apply. |
| DataRobot | AutoML platform | Enterprise pricing | Data science teams that want automated model selection on a pre-built feature table. | Automates model selection, not feature engineering. You still build the flat table, which is where 80% of the work lives. |
| Kumo.ai | Relational foundation model | Free tier / Enterprise | Multi-table predictions without feature engineering. Reads CRM + product + support + billing data as a connected graph. | Requires relational data across multiple systems. If everything lives in a single CRM table, Salesforce Einstein is simpler. |
Highlighted: Kumo.ai is the only tool that reads multi-table relational data natively, capturing colleague signals and content progressions without feature engineering. But if your data lives in a single CRM, start with your CRM's native scoring.
Picking the right tool for your situation
- No data team, need scoring now: HubSpot or Salesforce Einstein. Turn on the built-in scoring. It will be better than nothing and better than the manual point system nobody has updated in two years.
- Enterprise ABM motion, need account-level intent: 6sense or ZoomInfo for identifying in-market accounts. Layer CRM-native scoring on top for individual lead prioritization within those accounts.
- PLG company, need to identify sales-ready free users: MadKudu for the product-to-sales handoff. Or connect your product analytics to your CRM and let Salesforce Einstein score the enriched records.
- Multi-table data, want maximum accuracy without months of feature engineering: Kumo.ai reads your connected data sources directly and captures cross-table signals that flat-table tools miss. The colleague signal alone is worth the integration effort.
- Data science team available, flat table ready: XGBoost or LightGBM for maximum control. DataRobot if you want automated model selection on top.
6 lead scoring mistakes that are costing you deals
These mistakes are not edge cases. They are the default. Most scoring systems in production today have at least three of these problems. Some have all six.
1. Scoring activity volume, not intent quality
A lead who opened 50 emails and clicked on none of them is not an engaged lead. They are a lead with a habit of scanning subject lines. A lead who opened 2 emails, clicked through to the pricing page, and then visited the API docs has 1/25th the email opens and 100x the purchase intent.
If your scoring system gives higher scores to the first lead, it is optimizing for activity, not intent. Weight by signal quality, not signal volume. One pricing page visit is worth more than 50 email opens.
2. Never decaying old scores
A lead who was red-hot 6 months ago and has gone completely silent since still carries those points in most scoring systems. They show up in the "hot leads" list alongside leads who engaged yesterday. Your sales rep calls them and gets voicemail or "we went with a competitor in Q3."
Every signal should decay over time. A 30-day half-life is a reasonable default for B2B: an engagement from a month ago is worth half its original points. An engagement from 3 months ago is worth one-eighth. If your scoring tool does not support decay natively, build a scheduled job that reduces scores by 50% monthly.
3. Ignoring negative signals
Most scoring systems only add points. They never subtract. A lead who unsubscribed from your emails, bounced from the free trial after one session, and then marked your outreach as spam still carries their original webinar attendance points. Build in explicit deductions: unsubscribes, meeting no-shows, trial abandonment, competitor mentions in support conversations, and direct "not interested" responses.
4. Using the same model for all segments
The signals that predict conversion for a PLG self-serve customer are completely different from the signals that predict conversion for an enterprise deal. Product usage dominates for PLG. Executive engagement and procurement signals dominate for enterprise. Scoring both with the same model forces a compromise that works poorly for everyone.
Build separate models for distinct buyer segments, or at minimum, ensure your model includes segment as a feature so it can learn different weights for different populations.
5. Not validating against actual conversions
This is the most damaging mistake and it is shockingly common. Teams build a scoring model, deploy it, and never check whether high-scored leads actually convert at higher rates. The score becomes organizational furniture: everyone sees it, nobody trusts it, and sales develops their own informal prioritization that may or may not be better.
Every month, pull conversion rates by score decile. If the top decile is not converting at 3x+ the base rate, your model is not working. Fix it or remove it. A bad scoring model is worse than no scoring model because it gives a false sense of precision.
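The monthly check fits in a few lines. The data below is synthetic, built so conversions cluster at high scores; on real data you would feed in each lead's score and whether it converted:

```python
def lift_by_decile(scored: list[tuple[float, bool]]) -> list[float]:
    """scored = [(score, converted), ...]. Returns lift vs. base rate
    per decile; index 0 is the highest-scored 10% of leads."""
    ranked = sorted(scored, key=lambda x: -x[0])
    base = sum(c for _, c in ranked) / len(ranked)
    n = len(ranked) // 10
    return [sum(c for _, c in ranked[i * n:(i + 1) * n]) / n / base
            for i in range(10)]

# Synthetic example: 100 leads scored 100..1; the 10 conversions
# sit mostly at high scores.
converted_scores = {100, 98, 95, 91, 88, 84, 77, 69, 42, 13}
leads = [(s, s in converted_scores) for s in range(100, 0, -1)]
lifts = lift_by_decile(leads)
print(round(lifts[0], 2))  # top-decile lift: 4.0 -> passes the 3x bar
```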
6. Only using CRM data
Your CRM contains demographics, deal stages, and activity timestamps. It does not contain what the lead did inside your product, what they asked your support team, how they progressed through your content, or whether their colleagues have already purchased. The richest signals live outside the CRM, in product analytics, support systems, billing platforms, and the relational connections between all of them.
A scoring model that only reads CRM data is like a doctor who only checks your blood pressure. It is a real measurement. It is just not enough information to make a diagnosis.