Once you’ve set up the Graph of your Tables, you can define a machine learning problem as a Kumo PredictiveQuery on your Graph. Predictive queries are written using the predictive query language (PQL), a concise SQL-like syntax that allows you to define a model for a new business problem. For information on the construction of a query string, please visit the Kumo documentation.

Writing a Query

In this example, we predict customer lifetime value, modeled as a regression problem: the maximum quantity of transactions for each customer over the next 30 days, assuming that the unit prices of the customer's transactions over the next 7 days sum to more than 15:
import kumoai as kumo

# `graph` is the Graph built from your Tables in the previous step.
pquery = kumo.PredictiveQuery(
    graph=graph,
    query=(
        "PREDICT MAX(transaction.Quantity, 0, 30)\n"
        "FOR EACH customer.CustomerID\n"
        "ASSUMING SUM(transaction.UnitPrice, 0, 7, days) > 15"
    ),
)

# Validate the predictive query syntax:
pquery.validate(verbose=True)
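
To make the query above more concrete, here is a minimal pure-Python sketch of the value it asks for, computed by hand on a made-up transaction log. It does not use the Kumo SDK. It assumes that both windows in the query string count days forward from an anchor time and that a window excludes its start and includes its end; the sample data and the helper names `window` and `toy_target` are hypothetical.

```python
from datetime import datetime, timedelta

# Made-up transaction log: (customer_id, timestamp, quantity, unit_price).
TRANSACTIONS = [
    ("c1", datetime(2024, 1, 2), 3, 10.0),
    ("c1", datetime(2024, 1, 5), 8, 9.0),
    ("c1", datetime(2024, 1, 20), 5, 4.0),
    ("c2", datetime(2024, 1, 3), 2, 6.0),
    ("c2", datetime(2024, 1, 25), 9, 30.0),
]


def window(rows, customer, anchor, start_days, end_days):
    """Rows of `customer` with anchor + start < timestamp <= anchor + end.

    The boundary handling is an assumption made for this sketch.
    """
    lo = anchor + timedelta(days=start_days)
    hi = anchor + timedelta(days=end_days)
    return [r for r in rows if r[0] == customer and lo < r[1] <= hi]


def toy_target(rows, customer, anchor):
    """Hand-computed target for one customer at one anchor time."""
    # ASSUMING SUM(transaction.UnitPrice, 0, 7, days) > 15
    assumed_sum = sum(r[3] for r in window(rows, customer, anchor, 0, 7))
    if not assumed_sum > 15:
        return None  # the assumption does not hold, so no example here
    # PREDICT MAX(transaction.Quantity, 0, 30)
    quantities = [r[2] for r in window(rows, customer, anchor, 0, 30)]
    return max(quantities) if quantities else None


anchor = datetime(2024, 1, 1)
targets = {c: toy_target(TRANSACTIONS, c, anchor) for c in ("c1", "c2")}
print(targets)  # {'c1': 8, 'c2': None}
```

Customer `c1` has unit prices summing to 19 in the 7-day window, so its target is the largest quantity in the 30-day window. Customer `c2` only reaches 6 and is left out.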

Validating a Query

The SDK provides quick ways to confirm your query matches expectations before generating data.
  • validate() checks for syntax errors and guides you toward a correct formulation.
  • get_task_type() returns the task type of the query (e.g. binary classification, regression) so you can confirm the ML problem matches your intent.
# Confirm the task type:
print(pquery.get_task_type())
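
If you want the task-type check to fail loudly instead of relying on reading printed output, a small guard can help. The sketch below is one possible approach, not part of the SDK: `ensure_task_type` and `StubQuery` are hypothetical names, the stub only mimics `get_task_type()` so the example runs without a Kumo connection, and the comparison assumes that the returned value's string form contains the task name.

```python
class StubQuery:
    """Stand-in for a PredictiveQuery; it only mimics get_task_type()."""

    def __init__(self, task_type):
        self._task_type = task_type

    def get_task_type(self):
        return self._task_type


def ensure_task_type(pquery, expected):
    """Raise ValueError unless the query's task type mentions `expected`."""
    # Compare on the string form, case-insensitively, because the exact
    # return type of get_task_type() is not assumed here.
    actual = str(pquery.get_task_type())
    if expected.lower() not in actual.lower():
        raise ValueError(f"expected a {expected!r} task, got {actual!r}")
    return actual


print(ensure_task_type(StubQuery("regression"), "regression"))
```

With a real query you would pass `pquery` in place of the stub, before generating any tables.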

Generating a Training Table

Once your query is validated, generate a training table to use for model fitting:
# Optionally get a suggested plan (can be customized):
training_table_plan = pquery.suggest_training_table_plan()

# Generate the training table. non_blocking=False waits for the result;
# non_blocking=True schedules generation in the background instead:
training_table = pquery.generate_training_table(training_table_plan, non_blocking=False)

# Inspect the generated training data:
print(training_table.data_df().head())
If you don’t need a custom plan, omit the plan argument and Kumo will infer a default plan for you.
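
The difference between blocking and background generation can be illustrated with a generic analogy built on Python's standard library. This is only a sketch of the calling pattern, not how the SDK is implemented: `build_table` and `generate` are hypothetical stand-ins, and the handle that the SDK returns for `non_blocking=True` may expose a different interface than a `Future`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def build_table(n_rows):
    """Pretend to do slow work, then return a small table."""
    time.sleep(0.05)
    return [{"row": i} for i in range(n_rows)]


def generate(n_rows, executor, non_blocking=False):
    """Return the table itself, or a handle to it when non_blocking=True."""
    future = executor.submit(build_table, n_rows)
    return future if non_blocking else future.result()


with ThreadPoolExecutor(max_workers=1) as pool:
    # Blocking call: returns only once the table is ready.
    table = generate(3, pool)

    # Non-blocking call: returns a handle immediately, so other work can
    # happen before the result is collected.
    handle = generate(3, pool, non_blocking=True)
    background_table = handle.result()

print(table == background_table)  # True
```

The blocking form is simpler for interactive use, while the background form suits long-running jobs that you want to start and check on later.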

Generating a Prediction Table

A prediction table is generated in the same way:
# Optionally get a suggested plan:
prediction_table_plan = pquery.suggest_prediction_table_plan()

# Generate the prediction table:
prediction_table = pquery.generate_prediction_table(prediction_table_plan, non_blocking=False)

# Inspect the generated prediction data:
print(prediction_table.data_df().head())