Quick Start
Tip
This page helps you set up the fully managed cloud version of Kumo. To start using the Kumo SDK, follow the instructions here.
This guide walks you through training a model that predicts customer churn on the H&M Kaggle dataset. The data is pre-loaded in an Amazon S3 bucket. You'll learn how to connect data sources, choose tables, create a graph schema, and train a model.
Step 1: Connect data
Kumo integrates with a variety of data sources to ingest data for training models and running them regularly. For this example, we will use the H&M dataset pre-loaded in an S3 bucket.
- Select Connectors from the side navigation menu (bottom left of your screen).
- Choose the Amazon S3 option. Enter the S3 path: s3://kumo-public-datasets/quickstart/ and add the other details to create a new connector.
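If a connector form asks for the bucket and the folder separately, the S3 path above can be split with the Python standard library. This is only a convenience sketch; the helper name split_s3_path is hypothetical and is not part of Kumo.

```python
from urllib.parse import urlparse


def split_s3_path(path: str) -> tuple[str, str]:
    """Split an s3:// URI into (bucket, key prefix)."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {path!r}")
    # The bucket is the URI host; the prefix is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_path("s3://kumo-public-datasets/quickstart/")
print(bucket)  # kumo-public-datasets
print(prefix)  # quickstart/
```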
Step 2: Choose tables
We will now select and configure tables for model training.
- Navigate to Tables from the side navigation menu and select Connect Table.
- Select the customers table. Kumo will process the data and automatically infer the data type and semantic type for each column. You can update these from the dropdown as needed.
- Select the properties for the columns: assign a primary key and, for temporal data, a create/end date.
Repeat the steps for the transactions and articles tables.
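A primary key should identify each row uniquely, so it can help to check a candidate column before assigning it. The sketch below is plain Python over made-up sample rows. The column names (customer_id, article_id, t_dat) are assumed from the public Kaggle dataset and may differ from what you see in Kumo, and primary_key_problems is a hypothetical helper.

```python
from collections import Counter


def primary_key_problems(rows, key):
    """Return (missing value count, sorted duplicated values) for a candidate key column."""
    values = [row.get(key) for row in rows]
    missing = sum(v is None for v in values)
    counts = Counter(v for v in values if v is not None)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return missing, duplicates


# Hypothetical sample rows, for illustration only.
customers = [
    {"customer_id": "c1", "age": 31},
    {"customer_id": "c2", "age": 45},
]
transactions = [
    {"customer_id": "c1", "article_id": "a1", "t_dat": "2020-09-01"},
    {"customer_id": "c1", "article_id": "a2", "t_dat": "2020-09-03"},
]

print(primary_key_problems(customers, "customer_id"))     # (0, [])
print(primary_key_problems(transactions, "customer_id"))  # (0, ['c1'])
```

In this sample, customer_id is a valid primary key for customers, but it repeats in transactions, where it acts as a foreign key instead.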
Step 3: Create graph schema
Now that your tables are selected, it’s time to link them to create a graph schema.
- Navigate to Graphs and select the tables we just created.
- You can preview the graph on the right as the tables are selected.
- Kumo will link the tables based on primary and foreign key names. You can update these links as needed in the next step.
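To illustrate what linking by key name means, here is a minimal sketch that connects a table to another table whenever it has a column named the same as that table's primary key. It uses exact name matching only and is not Kumo's actual inference logic, which this page does not describe. The table specs and column names are assumptions based on the public Kaggle dataset.

```python
def infer_links(tables):
    """Return sorted (source_table, column, target_table) links found by exact name match.

    tables maps a table name to {"primary_key": str or None, "columns": list of str}.
    """
    links = []
    for target, spec in tables.items():
        pk = spec.get("primary_key")
        if pk is None:
            continue
        for source, other in tables.items():
            # A column in another table with the same name as this primary key
            # is treated as a foreign key pointing at this table.
            if source != target and pk in other["columns"]:
                links.append((source, pk, target))
    return sorted(links)


# Hypothetical table specs, for illustration only.
tables = {
    "customers": {"primary_key": "customer_id", "columns": ["customer_id", "age"]},
    "articles": {"primary_key": "article_id", "columns": ["article_id", "product_code"]},
    "transactions": {
        "primary_key": None,
        "columns": ["t_dat", "customer_id", "article_id", "price"],
    },
}

for link in infer_links(tables):
    print(link)
# ('transactions', 'article_id', 'articles')
# ('transactions', 'customer_id', 'customers')
```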
We are now ready to train our model.
Step 4: Train model
- From the side menu, click New > Training.
- Select the graph you created in the last step.
- Enter the Predictive Query. In this query, we are attempting to predict the likelihood of no transaction (i.e. churn) over the next 90 days among customers who have made at least one transaction in the last 60 days.
- Hit Start Training.
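The churn definition used in the query can be made concrete with a small sketch. Given a reference date, it keeps customers with at least one transaction in the previous 60 days and labels each as churned if they have no transaction in the following 90 days. This is plain Python for illustration, not Kumo's query language. The function name churn_labels, the exact window boundaries, and the sample data are assumptions.

```python
from datetime import date, timedelta


def churn_labels(transactions, anchor, lookback_days=60, horizon_days=90):
    """Map each recently active customer to True (churned) or False.

    transactions is an iterable of (customer_id, transaction_date) pairs.
    """
    lookback_start = anchor - timedelta(days=lookback_days)
    horizon_end = anchor + timedelta(days=horizon_days)
    active, future = set(), set()
    for customer_id, day in transactions:
        if lookback_start < day <= anchor:
            active.add(customer_id)   # transacted in the last 60 days
        elif anchor < day <= horizon_end:
            future.add(customer_id)   # transacts again within the next 90 days
    return {c: c not in future for c in sorted(active)}


anchor = date(2020, 6, 1)
sample = [
    ("c1", date(2020, 5, 20)),  # recently active
    ("c1", date(2020, 7, 15)),  # comes back within 90 days
    ("c2", date(2020, 5, 1)),   # recently active, never comes back
    ("c3", date(2020, 1, 10)),  # outside the 60-day window, so excluded
    ("c3", date(2020, 6, 10)),
]
print(churn_labels(sample, anchor))  # {'c1': False, 'c2': True}
```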
Training can take a few minutes to complete. Once complete, you can:
- Review the Evaluation metrics.
- Get Explanations for the overall model or individual predictions.
- Run the model for Batch Predictions or embeddings with new data.
Step 5: Connect your own data
The next step is to train models with your own data in Kumo. You can select from a variety of other data connectors. To use Kumo in your own data warehouse, you can check alternative deployment options here.