Model Training - Kumo.ai

This page covers the final step of the Kumo workflow: training a model and generating predictions. The primary interface is Trainer, whose design will be familiar if you have worked with other machine learning libraries (e.g. scikit-learn). A Trainer has two important methods:

fit(), which takes a Graph and TrainingTable (or a TrainingTableJob, if the training table was generated in a non-blocking manner), and trains a model on this graph and training table.
predict(), which takes a Graph and PredictionTable (or a PredictionTableJob, if the prediction table was generated in a non-blocking manner), a job ID corresponding to a trained model, and other parameters detailing where to output the predictions. It generates predictions for each entity in the prediction table, and writes the outputs to the specified output connector.

Training a model is fully customizable, with a detailed suite of model plan options. For a guide on tuning your model for optimal performance, see here.

You can view all your launched jobs in the Kumo UI, at the URL https://<customer_id>.kumoai.cloud/jobs. Jobs are keyed by their unique job ID, and contain all specified job tags as well.
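As a small illustration, the jobs page address can be built from a customer ID. This is only a sketch: the helper name `jobs_url` and the customer ID `acme` are hypothetical, and only the URL pattern comes from the text above.

```python
def jobs_url(customer_id: str) -> str:
    """Return the Kumo UI jobs page URL for the given customer ID."""
    if not customer_id:
        raise ValueError("customer_id must be non-empty")
    # URL pattern taken from the documentation text above.
    return f"https://{customer_id}.kumoai.cloud/jobs"


print(jobs_url("acme"))  # "acme" is a hypothetical customer ID
```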

How do I create a Trainer? What’s a model plan?

Creating a Trainer object requires a model plan, which defines the search space to be used when exploring model configurations for model training. You can suggest a model plan for your predictive query with suggest_model_plan(), which will produce an object of type `ModelPlan`:

pquery = kumoai.PredictiveQuery(graph=..., query="...")
model_plan = pquery.suggest_model_plan()

print(model_plan)

The model plan can be edited with full granularity; see here for documentation, and the ModelPlan object for the exposed customizable attributes. Once you have customized your model plan to your liking, you can create a Trainer by simply passing the model plan in:

trainer = kumoai.Trainer(model_plan)

That’s all!

How do I train a model?

Training a model amounts to calling fit(), which accepts the following arguments:

A Graph, which defines the data that the model will be trained on. Note that if you have already called snapshot(), that snapshot of the data will be used when training your model.
A TrainingTable or TrainingTableJob, generated by generate_training_table(). This defines the training examples that will be used by the model; if a TrainingTableJob is passed, its execution will be sequenced before training by the Kumo platform.
non_blocking, which can be set to True if you would like to schedule training and return immediately, or False if you would like to wait for training to complete.
custom_tags, which define a custom mapping of key/value tags that you can use to label your training job.

Training will return a TrainingJobResult if non_blocking=False and training completes successfully, or a TrainingJob if non_blocking=True. Each training job is associated with a unique Job ID, starting with trainingjob-. An example invocation of fit() is as follows:

graph = kumoai.Graph(...)
pquery = kumoai.PredictiveQuery(graph=graph, query="...")

# Generate the training table, but do not wait for its completion; just
# schedule it using `non_blocking=True`:
training_table_plan = pquery.suggest_training_table_plan()
training_table = pquery.generate_training_table(
    training_table_plan, non_blocking=True)

# Create a trainer with a suggested model plan:
model_plan = pquery.suggest_model_plan()
trainer = kumoai.Trainer(model_plan)

# Schedule a training job (`non_blocking=True`) on the defined graph
# and training table future:
training_job_future = trainer.fit(
    graph=graph,
    train_table=training_table,
    non_blocking=True,
    custom_tags={'author': 'trial'},  # any custom key/value pairs
)

# Print the training job ID:
print(f"Training job ID: {training_job_future.id}")

# Attach to the training job to watch its status and see logs (you can
# detach anytime without canceling the job):
training_job_future.attach()

How do I view the metrics and artifacts of a trained model?

Recall that a trained model is represented by a TrainingJobResult object; if you have a TrainingJob, you need to await its completion by calling result() before proceeding. A TrainingJobResult exposes numerous methods to help analyze the performance of a trained model, including metrics() and holdout_df(). A full set of visualizations, performance graphs, and explainability results can be accessed at the URL specified by tracking_url.
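The accessors named above can be gathered in one place with a small helper. This is a minimal sketch, not part of the Kumo SDK: the function name `summarize_training` is hypothetical, it assumes `tracking_url` is readable as an attribute or property, and it makes no assumption about the types that metrics() and holdout_df() return.

```python
from typing import Any, Dict


def summarize_training(result: Any) -> Dict[str, Any]:
    """Collect the metrics, holdout data, and tracking URL of a trained model.

    `result` is expected to be a TrainingJobResult. If you only have a
    TrainingJob, resolve it first with `.result()`, which blocks until
    training completes.
    """
    return {
        "metrics": result.metrics(),
        "holdout": result.holdout_df(),
        # Assumption: tracking_url is an attribute/property, not a method.
        "tracking_url": result.tracking_url,
    }


# Intended usage with the earlier training example (requires a live Kumo job):
# summary = summarize_training(training_job_future.result())
# print(summary["tracking_url"])
```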

How do I generate predictions?

Predicting on a trained model amounts to calling predict(), which accepts the following arguments:

A Graph, which defines the data that the model will use to make predictions. Note that if you have already called snapshot(), that snapshot of the data will be used when generating predictions.
A PredictionTable or PredictionTableJob, generated by generate_prediction_table() or supplied via a custom path. This defines the prediction examples that will be used by the model; if a PredictionTableJob is passed, its execution will be sequenced before prediction by the Kumo platform.
training_job_id, which defines the job ID of the training job whose model will be used for making predictions.
non_blocking, which can be set to True if you would like to schedule prediction and return immediately, or False if you would like to wait for prediction to complete.
custom_tags, which define a custom mapping of key/value tags that you can use to label your prediction job.
Additional arguments, documented in predict(), that specify where predictions should be written.

Prediction will return a BatchPredictionJobResult if non_blocking=False and prediction completes successfully, or a BatchPredictionJob if non_blocking=True. Each batch prediction job is associated with a unique Job ID, starting with bp-job-. An example invocation of predict() is as follows:

# Assume we have a completed training job id:
completed_job_id = "<completed_training_job_id>"

# The graph to generate predictions on:
graph = kumoai.Graph(...)

# Output connector:
output_connector = ...  # any Kumo Connector

# Load the trainer and predictive query from a completed training job:
trainer = kumoai.Trainer.load(completed_job_id)
pquery = kumoai.PredictiveQuery.load_from_training_job(completed_job_id)

# Generate the prediction table, but do not wait for its completion; just
# schedule it using `non_blocking=True`:
prediction_table_plan = pquery.suggest_prediction_table_plan()
prediction_table = pquery.generate_prediction_table(
    prediction_table_plan, non_blocking=True)

# Schedule a prediction job (`non_blocking=True`) on the defined
# graph and prediction table future:

# For v1.4 and above:
from kumoai.artifact_export.config import OutputConfig
# For v1.3 and below (backward compatible):
# from kumoai.trainer.config import OutputConfig

prediction_job_future = trainer.predict(
    graph=graph,
    prediction_table=prediction_table,
    training_job_id=completed_job_id,
    non_blocking=True,
    custom_tags={'author': 'trial'},  # any custom key/value pairs
    output_config=OutputConfig(
        output_types={'predictions', 'embeddings'},
        output_connector=output_connector,
        output_table_name='kumo_predictions',
    ),
)

# Print the prediction job ID:
print(f"Prediction job ID: {prediction_job_future.id}")

# Attach to the prediction job to watch its status and see logs (you can
# detach anytime without canceling the job):
prediction_job_future.attach()
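Because training job IDs start with trainingjob- and batch prediction job IDs start with bp-job-, a quick prefix check can catch a prediction job ID passed by mistake as training_job_id. The helper below is a hypothetical sketch and not part of the Kumo SDK; the example IDs are made up.

```python
def job_kind(job_id: str) -> str:
    """Classify a Kumo job ID by its documented prefix."""
    if job_id.startswith("trainingjob-"):
        return "training"
    if job_id.startswith("bp-job-"):
        return "batch_prediction"
    return "unknown"


def require_training_job_id(job_id: str) -> str:
    """Return job_id unchanged, or raise if it is not a training job ID."""
    if job_kind(job_id) != "training":
        raise ValueError(f"expected a training job ID, got {job_id!r}")
    return job_id


print(job_kind("trainingjob-abc123"))  # made-up ID
print(job_kind("bp-job-xyz789"))       # made-up ID
```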

How do I poll a training or prediction job’s status?

Any job scheduled with non_blocking=True is represented as a Future object, which has various methods to poll the scheduled job for its status or completion. Common patterns include:

Querying future.status() for the status of the scheduled job in a loop
Calling future.attach() to attach to the future and print logs periodically; when the future is complete, this method will return the resolved output (e.g. TrainingJob becomes TrainingJobResult)
Calling future.result() to block until the future is complete and return the resolved output.
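The first pattern, polling status() in a loop, might look like the sketch below. The helper `wait_until_done` is hypothetical. This page does not state what status() returns, so the caller supplies the `is_terminal` check; the polling interval and timeout defaults are arbitrary choices.

```python
import time
from typing import Any, Callable


def wait_until_done(
    future: Any,
    is_terminal: Callable[[Any], bool],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
) -> Any:
    """Poll future.status() until is_terminal(status) is true.

    Returns the final status. Raises TimeoutError if the job has not
    reached a terminal status once timeout_seconds have elapsed.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = future.status()
        if is_terminal(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not finished; last status: {status!r}")
        time.sleep(poll_seconds)
```

If you only need the final output and no custom handling between polls, future.result() or future.attach() is the simpler choice.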
