What data is needed for consumption anomaly detection?

Kumo connects directly to your existing relational tables: METERS, READINGS, CUSTOMERS, WEATHER, TARIFFS. No ETL or feature engineering required. Write a PQL query and get explainable predictions in minutes.

Use Case #3 · Binary Classification · Anomaly Detection

Consumption Anomaly Detection

“Which meters show abnormal consumption?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

Which meters show abnormal consumption?

Non-technical losses (energy theft, meter tampering, billing errors) cost utilities $96B globally per year. In developed markets, non-technical losses average 1-3% of revenue. Traditional threshold-based detection catches obvious anomalies but misses sophisticated theft patterns where consumption is gradually reduced or shifted. For a utility with $4B in annual revenue, a 1% non-technical loss rate means $40M in recoverable revenue.

Quick answer

Consumption anomaly detection AI identifies energy theft, meter tampering, and billing errors by comparing each meter's consumption against a graph-learned baseline of its neighborhood, customer type, and weather conditions. Unlike threshold-based systems that only catch obvious anomalies, graph-based models detect gradual theft (consumption slowly reduced over months) and sophisticated tampering (consumption shifted between meters). Utilities with $4B in annual revenue typically recover $20-30M in non-technical losses that traditional detection misses.

Approaches compared

4 ways to solve this problem

1. Threshold-Based Alerts

Flag meters whose consumption drops below a fixed threshold or changes by more than a set percentage. The baseline approach available in most billing systems.

Best for

Catching obvious anomalies like meter disconnections, dramatic consumption drops, and billing system errors.

Watch out for

Cannot detect sophisticated theft where consumption is gradually reduced 10-15% over months. Generates high false positive rates (30-50%) because consumption legitimately varies with weather, vacations, and lifestyle changes. Field crews waste time investigating false alarms.
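A minimal Python sketch of why a fixed rule misses gradual theft. The function name `threshold_alert`, the 5 kWh floor, and the 30% drop limit are assumptions chosen for illustration, not values from any billing system.

```python
# Hypothetical rule: flag a meter when its reading falls below a fixed floor
# or drops by more than `max_drop` relative to the previous reading.
def threshold_alert(previous_kwh, current_kwh, floor_kwh=5.0, max_drop=0.30):
    if current_kwh < floor_kwh:
        return True
    if previous_kwh <= 0:
        return False
    drop = (previous_kwh - current_kwh) / previous_kwh
    return drop > max_drop

# A sudden 50% drop trips the rule.
sudden = threshold_alert(28.0, 14.0)

# A 2.5% reduction per week for six weeks never does.
readings = [28.0]
for _ in range(6):
    readings.append(readings[-1] * 0.975)
gradual_flags = [threshold_alert(a, b) for a, b in zip(readings, readings[1:])]
total_drop = 1 - readings[-1] / readings[0]
```

In this sketch the weekly reductions add up to a drop of about 14% over six weeks, inside the 10-15% range described above, without a single alert.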

2. Statistical Profiling

Build consumption profiles per customer segment and flag meters deviating from their segment average. More nuanced than fixed thresholds.

Best for

Detecting meters that deviate significantly from their customer class (residential, commercial, industrial) norms.

Watch out for

Segment averages are too coarse. A 2,400 sqft home with 3 occupants on a standard tariff might be expected to use about 28 kWh/day, but the segment average also includes 1,200 sqft apartments and 4,000 sqft homes. False positive rates remain high (20-30%) because legitimate variation within segments is large.
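The coarseness problem can be sketched with a simple z-score against the segment. The consumption values below are invented to represent one mixed residential segment, and `segment_z` is a hypothetical helper, not part of any product.

```python
from statistics import mean, stdev

# Hypothetical daily kWh for one residential segment that mixes
# small apartments, mid-size homes, and large homes.
segment_kwh = [13, 14, 15, 27, 28, 29, 45, 46, 47]

def segment_z(value, segment):
    """Z-score of one meter's consumption against its segment."""
    return (value - mean(segment)) / stdev(segment)

# A mid-size home using 12 kWh/day (expected about 28) is not extreme
# relative to the whole segment, so a 2-sigma rule misses it.
suspect_z = segment_z(12, segment_kwh)
missed = abs(suspect_z) < 2.0

# Tightening the rule to 1 sigma would catch it, but it also flags every
# legitimate small apartment and large home in the segment.
false_alarms = [v for v in segment_kwh if abs(segment_z(v, segment_kwh)) > 1.0]
```

With these assumed numbers, the rule either misses the suspect meter or flags six of the nine legitimate customers, which is the trade-off that keeps false positive rates high.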

3. Single-Table ML (Isolation Forest/Autoencoders)

Train anomaly detection models on individual meter time series to learn normal consumption patterns and flag deviations. Captures complex temporal patterns better than statistics.

Best for

Detecting anomalies in individual meter time series where the meter's own history provides a strong baseline.

Watch out for

Learns each meter's pattern independently, so it cannot tell that meter MTR102 using 12 kWh/day while its neighbors all use 28 kWh/day is suspicious. If a thief gradually reduces consumption, the model's baseline adjusts to the lower level and stops flagging it. No neighborhood context means no comparative baseline.
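The baseline-drift problem can be shown with a much simpler stand-in than an Isolation Forest or autoencoder: a rolling per-meter average. The 14-day window, 20% tolerance, and the smooth 60-day decline from 28 to 12 kWh are assumptions for illustration only.

```python
# Stand-in for a per-meter model: compare each reading to the mean of the
# meter's own recent history (hypothetical window and tolerance).
def rolling_self_flags(series, window=14, tolerance=0.20):
    flags = []
    for t in range(1, len(series)):
        history = series[max(0, t - window):t]
        baseline = sum(history) / len(history)
        flags.append(series[t] < (1 - tolerance) * baseline)
    return flags

# Simulated gradual theft: 28 kWh/day falling smoothly to 12 kWh/day in 60 days.
daily_decay = (12 / 28) ** (1 / 60)
meter = [28 * daily_decay ** day for day in range(61)]

self_flags = rolling_self_flags(meter)

# Neighbors are assumed to stay at 28 kWh/day throughout.
neighbor_level = 28.0
neighbor_gap = 1 - meter[-1] / neighbor_level
```

Because the meter's own baseline follows the decline, each reading sits only about 10% below its recent average and nothing is flagged, while the gap to the unchanged neighbors grows to roughly 57%.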

4. Graph Neural Networks (Kumo's Approach)

Connect meters, readings, customers, weather, and tariffs into a consumption graph. GNNs learn normal consumption per meter relative to its neighborhood, customer type, and conditions, detecting anomalies by neighborhood deviation.

Best for

Catching sophisticated theft, gradual tampering, and meter degradation by comparing each meter to its graph-learned neighborhood baseline.

Watch out for

Requires AMI data with meter-level readings and geographic/circuit topology linking meters to neighborhoods. Less effective in areas with very low meter density where neighborhood baselines are sparse.

Key metric: Graph-based anomaly detection catches 2-3x more non-technical losses than threshold-based systems while reducing false positives by 60-70%. For a $4B utility, this means recovering $20-30M annually in energy theft and meter fraud.

Why relational data changes the answer

Energy theft is relational by nature. A sophisticated thief does not reduce consumption to zero (that would trigger threshold alerts). Instead, they gradually reduce consumption by 15-20% over several months by tampering with the meter or partially bypassing it. The meter's own time series adapts to the lower level, making the theft invisible to per-meter anomaly detection. Compared to the neighborhood baseline, however, the theft is obvious: meter MTR102 uses 12 kWh/day while every similar home within 500 meters uses 25-30 kWh/day under the same weather. The anomaly is visible only in the relational context.

Graph-based models represent this context directly. Each meter's expected consumption is conditioned on its neighbors, customer type, weather, and tariff structure. When a meter deviates from its graph-learned baseline, the model flags it as anomalous with confidence proportional to the deviation and the strength of the neighborhood signal. SAP's SALT benchmark shows graph models at 91% accuracy vs 63% for gradient-boosted trees on relational tasks. RelBench confirms at 76.71 vs 62.44. In theft detection, this translates to catching 2-3x more non-technical losses while reducing false positives by 60-70%. For a utility with $4B in annual revenue, that means recovering $20-30M vs $8-12M with traditional methods.
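The neighborhood comparison described above can be approximated, very roughly, with a median over similar nearby homes. This is a plain statistical sketch, not the graph model itself. The peer meter IDs, their readings, and the 40% cutoff are invented for illustration.

```python
from statistics import median

# Hypothetical same-day readings (kWh) for similar homes near MTR102.
# These peer meter IDs do not appear in the sample tables.
peer_kwh = {"MTR104": 25, "MTR105": 27, "MTR106": 28, "MTR107": 29, "MTR108": 30}

def neighborhood_shortfall(meter_kwh, peers):
    """Fraction by which a meter falls below the median of its peers."""
    baseline = median(peers)
    return (baseline - meter_kwh) / baseline

shortfall = neighborhood_shortfall(12, peer_kwh.values())
is_suspicious = shortfall > 0.40  # assumed cutoff, not a product setting
```

With a peer median of 28 kWh, MTR102's 12 kWh reading is about 57% below its neighborhood, regardless of how slowly it got there.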

Detecting energy theft by monitoring individual meters is like trying to catch a student cheating by looking only at their own test scores. A gradual change looks like normal performance variation. But when you compare their scores to their study group's, the pattern is obvious: the rest of the group moved together, while this student's performance diverged. Graph-based anomaly detection compares each meter to its 'study group' of neighbors, making subtle theft patterns visible.

How KumoRFM solves this

Graph-powered intelligence for energy and utilities

Kumo connects meters, readings, customers, weather, and tariffs into a consumption graph. The GNN learns normal consumption patterns per meter relative to its neighbors, customer type, weather, and tariff structure. Anomalies are detected not by absolute thresholds but by deviation from the graph-learned baseline: when a meter's consumption diverges from its neighborhood while weather and tariff conditions remain similar. This catches gradual theft and meter degradation that threshold-based systems miss.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

METERS

meter_id	customer_id	type	install_date	zone_id
MTR101	CUST01	Smart	2021-03-15	ZONE-A
MTR102	CUST02	Smart	2020-08-20	ZONE-A
MTR103	CUST03	Legacy	2015-01-10	ZONE-B

READINGS

meter_id	date	daily_kwh	peak_kw	power_factor
MTR101	2025-03-01	32	4.2	0.95
MTR102	2025-03-01	12	2.1	0.72
MTR103	2025-03-01	85	12.5	0.88

CUSTOMERS

customer_id	type	sqft	occupants	tariff
CUST01	Residential	2,200	4	Standard
CUST02	Residential	2,400	3	Standard
CUST03	Commercial	8,500	N/A	Commercial

WEATHER

zone_id	date	avg_temp_f	heating_degree_days
ZONE-A	2025-03-01	55	10
ZONE-B	2025-03-01	52	13

TARIFFS

tariff_id	name	rate_per_kwh	peak_rate
T01	Standard	$0.12	$0.18
T02	Commercial	$0.09	$0.14
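To make the idea of a consumption graph concrete, here is a small sketch of how the foreign keys in the sample tables imply edges between entities. It uses plain dictionaries and only the sample rows above. The helper `zone_peers` is hypothetical, and this is not a description of Kumo's internal representation.

```python
# Sample rows from the tables above, keyed by their primary keys.
meters = {
    "MTR101": {"customer_id": "CUST01", "zone_id": "ZONE-A"},
    "MTR102": {"customer_id": "CUST02", "zone_id": "ZONE-A"},
    "MTR103": {"customer_id": "CUST03", "zone_id": "ZONE-B"},
}
readings = {"MTR101": 32, "MTR102": 12, "MTR103": 85}  # daily_kwh on 2025-03-01
weather = {"ZONE-A": 55, "ZONE-B": 52}                 # avg_temp_f on 2025-03-01

# Each foreign key becomes an edge between two nodes.
edges = []
for meter_id, row in meters.items():
    edges.append((meter_id, row["customer_id"]))
    edges.append((meter_id, row["zone_id"]))

def zone_peers(meter_id):
    """Other meters reachable through the shared zone node."""
    zone = meters[meter_id]["zone_id"]
    return [m for m, row in meters.items() if row["zone_id"] == zone and m != meter_id]

# Context available for MTR102 by following edges, with no manual feature table.
peers_of_102 = zone_peers("MTR102")
context = {
    "daily_kwh": readings["MTR102"],
    "peer_kwh": [readings[m] for m in peers_of_102],
    "avg_temp_f": weather[meters["MTR102"]["zone_id"]],
}
```

Following the zone edge from MTR102 reaches MTR101 and the ZONE-A weather row, which is the kind of context a per-meter model never sees.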

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(READINGS.is_anomaly, 0, 1, days)
FOR EACH METERS.meter_id

Prediction output

Every entity gets a score, updated continuously

METER_ID	CUSTOMER	DAILY_KWH	EXPECTED_KWH	ANOMALY_PROB
MTR101	CUST01	32	30	0.08
MTR102	CUST02	12	28	0.91
MTR103	CUST03	85	82	0.12
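One way the scores might be consumed downstream is to build a ranked queue for field investigation. The sketch below uses the three rows from the output table. The `investigation_queue` helper and the 0.5 cutoff are assumptions, not part of the platform.

```python
# Rows copied from the prediction output table above.
predictions = [
    {"meter_id": "MTR101", "daily_kwh": 32, "expected_kwh": 30, "anomaly_prob": 0.08},
    {"meter_id": "MTR102", "daily_kwh": 12, "expected_kwh": 28, "anomaly_prob": 0.91},
    {"meter_id": "MTR103", "daily_kwh": 85, "expected_kwh": 82, "anomaly_prob": 0.12},
]

def investigation_queue(rows, min_prob=0.5):
    """Meters at or above the (assumed) cutoff, highest probability first."""
    flagged = [r for r in rows if r["anomaly_prob"] >= min_prob]
    return sorted(flagged, key=lambda r: r["anomaly_prob"], reverse=True)

queue = investigation_queue(predictions)

# Daily shortfall against the expected consumption for each queued meter.
shortfall_kwh = {r["meter_id"]: r["expected_kwh"] - r["daily_kwh"] for r in queue}
```

Only MTR102 clears the cutoff here, with a shortfall of 16 kWh per day against its expected consumption.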

Understand why

Every prediction includes feature attributions — no black boxes

Meter MTR102 -- Residential customer CUST02 in ZONE-A

Predicted: 91% anomaly probability (12 kWh vs 28 kWh expected)

Top contributing features

Feature	Value	Attribution
Consumption 57% below neighborhood average	12 vs 28 kWh	32%
Power factor degradation	0.72 (normal: 0.90+)	25%
Gradual decline over 60 days	-45% trend	19%
Weather conditions should increase consumption	55°F (heating)	14%
Similar home size/occupancy uses 28 kWh	2,400 sqft / 3 people	10%

Feature attributions are computed automatically for every prediction. No separate tooling required.

PQL Documentation

Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.

Python SDK

Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.

Explainability Docs

Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.

Frequently asked questions

Common questions about consumption anomaly detection

How much revenue do utilities lose to energy theft?

Non-technical losses (theft, meter tampering, billing errors) average 1-3% of revenue in developed markets and 10-25% in emerging markets. For a US utility with $4B in annual revenue, that is $40-120M in annual losses. Current detection methods recover 30-40% of these losses. Graph-based models can increase recovery to 60-75%, adding $20-30M in revenue without serving a single additional customer.
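The arithmetic behind these figures can be worked through directly. The sketch below applies the quoted loss rates and recovery shares to a $4B utility. Which loss rate the $20-30M figure assumes is not stated, so the 1% case is shown only as one possible reading. The function names are hypothetical.

```python
REVENUE_MUSD = 4000  # $4B annual revenue, expressed in millions of dollars

def annual_loss(revenue_musd, loss_pct):
    """Non-technical losses in $M for a given loss rate (percent of revenue)."""
    return revenue_musd * loss_pct / 100

def recovered(loss_musd, recovery_pct):
    """Amount recovered in $M for a given recovery share (percent of losses)."""
    return loss_musd * recovery_pct / 100

# Developed-market loss range of 1-3% of revenue.
loss_low, loss_high = annual_loss(REVENUE_MUSD, 1), annual_loss(REVENUE_MUSD, 3)

# Recovery at the 1% loss rate ($40M), using the quoted recovery shares.
current_methods = (recovered(loss_low, 30), recovered(loss_low, 40))
graph_based = (recovered(loss_low, 60), recovered(loss_low, 75))
```

At a 1% loss rate this gives $12-16M recovered with current methods and $24-30M with the higher recovery share, which sits inside the $20-30M range quoted above.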

What types of energy theft can AI detect?

Graph-based models detect: meter bypassing (consumption drops while neighborhood stays constant), meter tampering (power factor degradation, erratic readings), billing fraud (incorrect tariff classification), and meter degradation (gradual accuracy loss). The hardest to catch is gradual theft, where consumption is slowly reduced over months. This is where neighborhood-based baselines provide the most value, because the per-meter baseline adjusts to the lower level while the neighborhood baseline does not.

What is the false positive rate for AI-based theft detection?

Threshold-based systems produce 30-50% false positives (field crews investigate and find no theft). Graph-based models reduce this to 10-15% by conditioning anomaly scores on neighborhood context, weather, and customer type. This means fewer wasted field investigations and better crew productivity. Because each field investigation costs $200-500, reducing false positives has direct ROI beyond the recovered revenue.
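The cost side of false positives can be estimated from the figures above. The annual volume of 1,000 field visits is a hypothetical number chosen to make the arithmetic easy to scale, and `wasted_visit_cost` is an illustrative helper.

```python
def wasted_visit_cost(visits, false_positive_pct, cost_per_visit):
    """Dollars spent on investigations that find no theft."""
    return visits * false_positive_pct * cost_per_visit // 100

VISITS = 1000  # hypothetical number of field investigations per year

# Low end pairs the lowest false positive rate with the cheapest visit,
# high end pairs the highest rate with the most expensive visit.
threshold_range = (wasted_visit_cost(VISITS, 30, 200), wasted_visit_cost(VISITS, 50, 500))
graph_range = (wasted_visit_cost(VISITS, 10, 200), wasted_visit_cost(VISITS, 15, 500))
```

Under these assumptions, wasted investigation spend falls from $60,000-250,000 to $20,000-75,000 per 1,000 visits.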

Can consumption anomaly detection work with legacy (non-smart) meters?

Partially. Legacy meters provide monthly reads rather than interval data, which limits detection to large, sustained anomalies. Graph-based models still add value by comparing monthly consumption across neighborhoods, but the detection granularity is lower. AMI (smart meter) data with 15-minute intervals enables detection of time-shifted theft, power factor anomalies, and gradual changes that monthly reads miss. AMI coverage is the single biggest driver of detection capability.

How do utilities act on theft detection alerts?

The typical workflow: the model flags high-probability anomalies, a field investigation team visits the top-scored meters, and investigators either confirm theft or identify another cause (meter malfunction, billing error, legitimate change). Confirmed theft leads to back-billing (recovering lost revenue, often for the previous 12-24 months), meter replacement, and in some cases referral to law enforcement. The key metric is revenue recovered per field investigation, which graph-based models improve 3-5x over threshold-based approaches.

Bottom line: A utility with $4B in annual revenue recovers $20-30M by detecting non-technical losses that threshold-based systems miss. Kumo's consumption graph identifies meters deviating from graph-learned neighborhood baselines, catching gradual theft and meter degradation.

Topics covered

consumption anomaly detection AI, energy theft detection, meter anomaly ML, non-technical losses utility, smart meter analytics, KumoRFM anomaly, utility fraud detection, abnormal consumption model

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
