Solution Background and Business Value

Predicting customer lifetime value (LTV) is essential for businesses to optimize marketing strategies, enhance customer retention, and maximize revenue. By accurately forecasting how much a customer is likely to spend in the future, companies can:

  • Allocate marketing resources efficiently.

  • Identify and retain high-value customers.

  • Personalize promotions based on expected spending behavior.

LTV models can be combined with churn prediction and coupon affinity models to boost retention efforts and maximize profitability.

Data Requirements and Kumo Graph Schema

We start with a core set of tables and extend our model by incorporating more customer behavior signals over time.

Core Tables

  1. Customers

    • customer_id (Primary Key)

    • name, email, phone

    • registration_date

    • address

  2. Orders

    • order_id (Primary Key)

    • customer_id (Foreign Key to Customers)

    • product_id (Foreign Key to Products)

    • order_date, quantity, price

Additional Tables (Optional Enhancements)

  1. Products

    • product_id (Primary Key)

    • product_name, category, price, cost

  2. Order Events

    • order_id (Foreign Key to Orders)

    • event_type (payment, delivery status, etc.)

    • event_date, amount

  3. Customer Interactions

    • interaction_id (Primary Key)

    • customer_id (Foreign Key to Customers)

    • interaction_date, interaction_type, interaction_details

  4. Returns

    • return_id (Primary Key)

    • order_id (Foreign Key to Orders)

    • product_id (Foreign Key to Products)

    • return_date, return_reason, refund_amount

  5. Customer Loyalty

    • loyalty_id (Primary Key)

    • customer_id (Foreign Key to Customers)

    • loyalty_points, membership_level, points_earned, points_redeemed

  6. Marketing Campaigns

    • campaign_id (Primary Key)

    • customer_id (Foreign Key to Customers)

    • campaign_type, campaign_date, campaign_response
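
Before wiring these tables into a graph, it can help to confirm that every foreign key resolves to a primary key in its parent table. The sketch below does this with plain Python on a few in-memory rows. The sample rows and the helper `find_orphans` are hypothetical and are not part of the Kumo SDK.

```python
# Illustrative referential-integrity check on toy data (not Kumo SDK code).

def find_orphans(child_rows, fkey, parent_rows, pkey):
    """Return child rows whose foreign key has no matching parent primary key."""
    parent_keys = {row[pkey] for row in parent_rows}
    return [row for row in child_rows if row[fkey] not in parent_keys]

# Hypothetical sample rows following the core schema.
customers = [
    {"customer_id": 1, "registration_date": "2023-01-10"},
    {"customer_id": 2, "registration_date": "2023-03-02"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "product_id": 10,
     "order_date": "2024-01-05", "quantity": 1, "price": 20.0},
    {"order_id": 101, "customer_id": 3, "product_id": 11,
     "order_date": "2024-01-09", "quantity": 2, "price": 15.0},
]

# Order 101 points at customer 3, which does not exist in customers.
orphans = find_orphans(orders, "customer_id", customers, "customer_id")
orphan_order_ids = [row["order_id"] for row in orphans]
print(orphan_order_ids)
```

The same helper can be reused for the other foreign keys, for example Returns.order_id against Orders.order_id.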

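Entity Relationship Diagram

The tables connect through the foreign keys listed above: Orders, Customer Interactions, Customer Loyalty, and Marketing Campaigns reference Customers; Orders and Returns reference Products; Order Events and Returns reference Orders.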

Predictive Queries

LTV can be defined in multiple ways, depending on business needs. Common approaches include:

  • Predicting total spending per customer within a given time frame.

  • Forecasting purchase frequency and average order value.

  • Integrating customer engagement signals from interactions and campaigns.
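
The second approach decomposes spending into how often a customer buys and how much they spend per order. As a rough illustration only, the sketch below computes both quantities from a hypothetical order history and multiplies them into a naive spend projection. This is a hand-rolled baseline for intuition, not how Kumo produces its predictions; the sample data and the function name `naive_spend_projection` are assumptions.

```python
from datetime import date

def naive_spend_projection(orders, history_start, history_end, horizon_days):
    """Project spend as (orders per day) * (average order value) * horizon."""
    history_days = (history_end - history_start).days
    if history_days <= 0:
        raise ValueError("history_end must be after history_start")
    # Keep only orders placed inside the history window [start, end).
    in_window = [o for o in orders if history_start <= o["order_date"] < history_end]
    if not in_window:
        return 0.0
    orders_per_day = len(in_window) / history_days
    avg_order_value = sum(o["price"] for o in in_window) / len(in_window)
    return orders_per_day * avg_order_value * horizon_days

# Hypothetical history for one customer: three orders over 180 days.
orders = [
    {"order_date": date(2024, 1, 15), "price": 40.0},
    {"order_date": date(2024, 3, 1), "price": 60.0},
    {"order_date": date(2024, 5, 20), "price": 50.0},
]
projection = naive_spend_projection(orders, date(2024, 1, 1), date(2024, 6, 29), 180)
print(round(projection, 2))
```

A baseline like this ignores trends, seasonality, and signals from related tables, which is the gap a learned model is meant to close.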

Here are some example predictive queries:

  1. Predict each customer's total spending over the next 180 days (about 6 months):

    PREDICT SUM(Orders.price, 0, 180, days)
    FOR EACH Customers.customer_id
    
  2. Predict the number of orders over the next 180 days for active customers (those with at least one order in the last 30 days):

    PREDICT COUNT(Orders.order_id, 0, 180, days)
    FOR EACH Customers.customer_id
    WHERE COUNT(Orders.*, -30, 0, days) > 0
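
To make the two queries concrete, the sketch below computes the quantities they describe for a single customer at one anchor date, using plain Python on hypothetical data. It assumes each window excludes its start offset and includes its end offset, and that Orders.price holds the full amount of the order rather than a unit price; check the Kumo documentation for the exact window-boundary semantics. The helper names are illustrative and not part of the Kumo SDK.

```python
from datetime import date, timedelta

def in_window(order_date, anchor, start_days, end_days):
    """True if anchor + start_days < order_date <= anchor + end_days (assumed semantics)."""
    return anchor + timedelta(days=start_days) < order_date <= anchor + timedelta(days=end_days)

def sum_price(orders, anchor, start_days, end_days):
    """Mirror of SUM(Orders.price, start, end, days) for one customer."""
    return sum(o["price"] for o in orders
               if in_window(o["order_date"], anchor, start_days, end_days))

def count_orders(orders, anchor, start_days, end_days):
    """Mirror of COUNT(Orders.*, start, end, days) for one customer."""
    return sum(1 for o in orders
               if in_window(o["order_date"], anchor, start_days, end_days))

anchor = date(2024, 1, 1)
# Hypothetical order history for a single customer.
orders = [
    {"order_date": date(2023, 12, 20), "price": 30.0},  # within the last 30 days
    {"order_date": date(2024, 2, 10), "price": 45.0},   # within the next 180 days
    {"order_date": date(2024, 6, 1), "price": 25.0},    # within the next 180 days
    {"order_date": date(2024, 9, 1), "price": 99.0},    # beyond the 180-day window
]

is_active = count_orders(orders, anchor, -30, 0) > 0   # the WHERE filter in query 2
target_spend = sum_price(orders, anchor, 0, 180)       # the target of query 1
target_count = count_orders(orders, anchor, 0, 180)    # the target of query 2
print(is_active, target_spend, target_count)
```

Negative offsets look backward from the anchor date and are used for filters, while non-negative offsets look forward and define the prediction target.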
    

Building the Model in the Kumo SDK

1. Initialize the Kumo SDK

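import os
import kumoai as kumo

# KUMO_API_KEY is an assumed environment variable name; replace <customer_id> with your deployment.
API_KEY = os.environ["KUMO_API_KEY"]
kumo.init(url="https://<customer_id>.kumoai.cloud/api", api_key=API_KEY)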

2. Select tables

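# Point the connector at the S3 location that holds the source tables.
connector = kumo.S3Connector("s3://your-dataset-location/")

# Customers: one row per customer, keyed by customer_id.
customers = kumo.Table.from_source_table(
    source_table=connector.table('customers'),
    primary_key='customer_id',
).infer_metadata()

# Orders: time-stamped purchase events; order_date is the time column.
orders = kumo.Table.from_source_table(
    source_table=connector.table('orders'),
    time_column='order_date',
).infer_metadata()

# Products: one row per product, keyed by product_id.
products = kumo.Table.from_source_table(
    source_table=connector.table('products'),
    primary_key='product_id',
).infer_metadata()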

3. Create graph schema

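# Register the tables and link them through the foreign keys on orders.
graph = kumo.Graph(
    tables={
        'customers': customers,
        'orders': orders,
        'products': products,
    },
    edges=[
        dict(src_table='orders', fkey='customer_id', dst_table='customers'),
        dict(src_table='orders', fkey='product_id', dst_table='products'),
    ],
)
# Check that the tables and edges form a valid graph before training.
graph.validate(verbose=True)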

4. Train the model

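# Prediction target: the sum of orders.price per customer over the next 180 days.
# Table names in the query match the keys used when building the graph.
pquery = kumo.PredictiveQuery(
    graph=graph,
    query="PREDICT SUM(orders.price, 0, 180, days) FOR EACH customers.customer_id"
)
pquery.validate(verbose=True)

# Start from the model plan that Kumo suggests for this query.
model_plan = pquery.suggest_model_plan()
trainer = kumo.Trainer(model_plan)
# Training-table generation is launched without blocking; fit() waits until training completes.
training_job = trainer.fit(
    graph=graph,
    train_table=pquery.generate_training_table(non_blocking=True),
    non_blocking=False,
)
print(f"Training metrics: {training_job.metrics()}")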

Deployment Strategy

Automating LTV Predictions for Business Growth

  1. Predict LTV and churn probabilities for all active customers.

  2. Store the predictions in the data warehouse.

  3. Use the scores to prioritize marketing efforts (e.g., personalized discounts for high-value customers at risk of churning).

  4. Automate these steps using orchestration tools like Airflow or Dagster.
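
Step 3 can be sketched as a simple ranking: multiply each customer's predicted LTV by their churn probability to estimate the revenue at risk, then target the top of the list. The scores, the revenue-at-risk heuristic, and the `top_n` cutoff below are assumptions for illustration; in practice the scores would be read from the predictions stored in the data warehouse.

```python
def rank_by_revenue_at_risk(scores, top_n):
    """Return the customer_ids with the highest predicted_ltv * churn_probability."""
    ranked = sorted(
        scores,
        key=lambda s: s["predicted_ltv"] * s["churn_probability"],
        reverse=True,
    )
    return [s["customer_id"] for s in ranked[:top_n]]

# Hypothetical model outputs for four customers.
scores = [
    {"customer_id": "c1", "predicted_ltv": 500.0, "churn_probability": 0.10},
    {"customer_id": "c2", "predicted_ltv": 300.0, "churn_probability": 0.60},
    {"customer_id": "c3", "predicted_ltv": 900.0, "churn_probability": 0.25},
    {"customer_id": "c4", "predicted_ltv": 50.0, "churn_probability": 0.90},
]

# c3 has the most revenue at risk, followed by c2.
targets = rank_by_revenue_at_risk(scores, top_n=2)
print(targets)
```

Ranking by the product rather than by either score alone keeps the campaign from spending on low-value customers who are likely to churn, or on high-value customers who are unlikely to leave.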

To further refine the LTV model, consider:

  • Combining LTV with churn models for a more holistic view of customer retention.

  • Using marketing response data to identify customers most likely to engage with promotions.

  • Incorporating external data sources (e.g., economic trends, industry benchmarks) to enhance predictive accuracy.