Tip

This page helps you set up the fully managed cloud version of Kumo. To start using the Kumo SDK, follow the instructions here.

In this guide, we walk you through an example of training a model to predict churn using the H&M Kaggle dataset. The data is pre-loaded in an Amazon S3 bucket. You’ll learn how to connect data sources, choose tables, create a graph schema, and train a model.

Step 1: Connect data

Kumo integrates with a variety of data sources to ingest data for training models and running them on a regular basis. For this example, we will use the H&M dataset pre-loaded in an S3 bucket.

  • Select Connectors from the side navigation menu (bottom left of your screen).

  • Choose the Amazon S3 option. Enter the S3 path: s3://kumo-public-datasets/quickstart/ and add the other details to create a new connector.

Step 2: Choose tables

We will now select and configure tables for model training.

  • Navigate to Tables from the side navigation menu and select Connect Table.

  • Select the customers table. Kumo will process the data and automatically infer the data type and semantic type of each column. You can change these from the dropdown as needed.

  • Set the column properties: assign a primary key and, for temporal data, a create date and end date.

Repeat these steps for the transactions and articles tables.
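The choices made in this step can be written down as a small configuration, which may help when checking them before building the graph. The sketch below is illustrative only: `TableConfig` and `validate` are hypothetical helpers, not part of Kumo, and the column names are assumed from the public H&M Kaggle dataset (the pre-loaded copy may differ).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TableConfig:
    """Hypothetical record of the per-table choices made in the UI."""
    name: str
    columns: Tuple[str, ...]
    primary_key: Optional[str] = None   # unique row identifier, if the table has one
    time_column: Optional[str] = None   # timestamp column for temporal data


def validate(config: TableConfig) -> List[str]:
    """Return a list of problems; an empty list means the config is consistent."""
    problems = []
    if config.primary_key is not None and config.primary_key not in config.columns:
        problems.append(f"{config.name}: primary key '{config.primary_key}' is not a column")
    if config.time_column is not None and config.time_column not in config.columns:
        problems.append(f"{config.name}: time column '{config.time_column}' is not a column")
    return problems


# Assumed column names (subset) from the public H&M Kaggle dataset.
TABLES = [
    TableConfig("customers", ("customer_id", "age", "postal_code"),
                primary_key="customer_id"),
    TableConfig("articles", ("article_id", "product_type_name", "colour_group_name"),
                primary_key="article_id"),
    TableConfig("transactions",
                ("t_dat", "customer_id", "article_id", "price", "sales_channel_id"),
                time_column="t_dat"),
]

for table in TABLES:
    print(table.name, validate(table) or "ok")
```

Here customers and articles each carry a primary key, while transactions is treated as an event table with a timestamp and no primary key of its own. That split is an assumption about how this dataset is usually modeled.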

Step 3: Create graph schema

Now that your tables are selected, it’s time to link them to create a graph schema.

  • Navigate to Graphs and select the tables we just created.

  • You can preview the graph on the right as the tables are selected.

  • Kumo will link the tables based on primary and foreign key names. You can update these links as needed in the next step.
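To make name-based linking concrete, here is a minimal sketch of one way such matching could work: a column in one table is treated as a foreign key when its name equals another table's primary key. This is not Kumo's actual algorithm; `infer_links` is a hypothetical helper, and the column names are assumed from the public H&M Kaggle dataset.

```python
def infer_links(tables):
    """Propose links by matching column names against other tables' primary keys.

    `tables` maps table name -> {"primary_key": str or None, "columns": [str, ...]}.
    Returns a sorted list of (source_table, foreign_key_column, target_table).
    """
    # Map each primary key name to the table that owns it.
    pk_owner = {
        spec["primary_key"]: name
        for name, spec in tables.items()
        if spec["primary_key"] is not None
    }
    links = []
    for name, spec in tables.items():
        for column in spec["columns"]:
            target = pk_owner.get(column)
            # Skip a table's own primary key; link only to other tables.
            if target is not None and target != name:
                links.append((name, column, target))
    return sorted(links)


# Assumed schema (subset of columns) for the three quick-start tables.
schema = {
    "customers": {"primary_key": "customer_id", "columns": ["customer_id", "age"]},
    "articles": {"primary_key": "article_id", "columns": ["article_id", "product_type_name"]},
    "transactions": {
        "primary_key": None,
        "columns": ["t_dat", "customer_id", "article_id", "price"],
    },
}

for source, column, target in infer_links(schema):
    print(f"{source}.{column} -> {target}")
```

Under these assumptions the transactions table links to both customers and articles, which is the star-shaped graph you would expect to see in the preview. Matching on names alone fails when keys are named differently across tables, which is why the links remain editable.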

We are now ready to train our model.

Step 4: Train model

  • From the side menu, click New > Training.

  • Select the graph you created in the last step.

  • Enter the Predictive Query. In this query, we predict the likelihood that a customer makes no transactions (i.e., churns) over the next 90 days, among customers who made at least one transaction in the last 60 days.

  • Hit Start Training.
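The churn definition behind the predictive query can be spelled out in plain Python. The sketch below is not Kumo's query language or implementation; `churn_label` is a hypothetical helper, the exact window boundaries (exclusive start, inclusive end) are assumptions, and the active-customer filter is read as at least one transaction in the last 60 days.

```python
from datetime import date, timedelta


def churn_label(purchase_dates, anchor, lookback_days=60, horizon_days=90):
    """Label one customer relative to an anchor date.

    Returns None if the customer is outside the target population (no
    transaction in the lookback window), True if they churn (no transaction
    in the horizon), and False otherwise.
    """
    lookback_start = anchor - timedelta(days=lookback_days)
    horizon_end = anchor + timedelta(days=horizon_days)

    # Population filter: at least one transaction in (anchor - 60d, anchor].
    was_active = any(lookback_start < d <= anchor for d in purchase_dates)
    if not was_active:
        return None

    # Target: no transaction in (anchor, anchor + 90d] means churn.
    buys_again = any(anchor < d <= horizon_end for d in purchase_dates)
    return not buys_again


anchor = date(2020, 6, 1)
history = {
    "returning": [date(2020, 5, 10), date(2020, 7, 15)],
    "churned": [date(2020, 5, 20)],
    "inactive": [date(2020, 1, 5)],
}
for customer, dates in history.items():
    print(customer, churn_label(dates, anchor))
```

A trained model estimates the probability of the True outcome for each customer in the population. Customers labeled None are excluded because they were not active in the lookback window.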

Training can take a few minutes to complete. Once complete, you can:

  • Review the Evaluation metrics.

  • Get Explanations for the overall model or individual predictions.

  • Run the model on new data to generate Batch Predictions or embeddings.

Step 5: Connect your own data

The next step is to train models with your own data in Kumo. You can select from a variety of other data connectors. To use Kumo in your own data warehouse, you can check alternative deployment options here.