
Overview

The SDK Code feature, available for tables, graphs, training jobs, and prediction jobs, lets users reproduce a setup from the UI in the SDK. It generates a fully executable Python file of Kumo SDK code; users only need to supply their Kumo authentication details and connector credentials to run it. The main motivations behind this feature are to:
  1. Make existing objects (tables, graphs) and jobs (training and prediction jobs) easy to replicate from the UI into the SDK.
  2. Make the Kumo SDK more intuitive and accessible to use.
This feature is available on Kumo versions v2.16+ and Kumo SDK versions v2.10+.

What does the generated Python file look like?

For brevity, we show an example of what the Python file looks like for a table. The generated files for graphs, training jobs, and prediction jobs follow the same pattern.
import kumoai as kumo
from kumoai.graph import Column
from kumoapi.typing import Stype, Dtype

# Please update your target URL and kumo API key.
API_URL = ""
API_KEY = ""
kumo.init(url=API_URL, api_key=API_KEY)

quickstart_config = {'connector_root_dir': 's3://kumo-public-datasets/quickstart', 'data_source_type': 'S3', 'connector_id': 'quickstart'}

quickstart_connector = kumo.S3Connector(
    root_dir=quickstart_config["connector_root_dir"],
    _connector_id = quickstart_config["connector_id"]
)

transactions_config = {
    'connector_id': 'quickstart',
    'table_name': 'transactions',
    'file_type': 'PARQUET',
    'table_name_alias': 'transactions1',
    'cols': [
        Column(name='t_dat', stype=Stype.timestamp, dtype=Dtype.date, timestamp_format=None),
        Column(name='customer_id', stype=Stype.ID, dtype=Dtype.string, timestamp_format=None),
        Column(name='article_id', stype=Stype.ID, dtype=Dtype.int64, timestamp_format=None),
        Column(name='price', stype=Stype.numerical, dtype=Dtype.float64, timestamp_format=None),
        Column(name='sales_channel_id', stype=Stype.ID, dtype=Dtype.int64, timestamp_format=None)
    ],
    'pkey': None,
    'time_col': 't_dat',
    'end_time_col': None
}

transactions = kumo.Table(
    source_table=quickstart_connector.table(
        transactions_config["table_name"]
    ),
    columns=transactions_config["cols"],
    primary_key=transactions_config['pkey'],
    time_column=transactions_config['time_col'],
    end_time_column=transactions_config["end_time_col"]
)
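
The generated file leaves API_URL and API_KEY as empty strings. One possible way to fill them in without hardcoding secrets is to read them from environment variables. The sketch below is an illustration, not part of the generated output: the variable names KUMO_API_URL and KUMO_API_KEY and the helper load_kumo_credentials are assumptions chosen for this example.

```python
import os


def load_kumo_credentials(env=os.environ):
    """Return (url, api_key) read from env, failing early if either is missing.

    KUMO_API_URL and KUMO_API_KEY are hypothetical variable names picked for
    this sketch; use whatever names fit your environment.
    """
    url = env.get("KUMO_API_URL", "")
    api_key = env.get("KUMO_API_KEY", "")
    missing = [
        name
        for name, value in (("KUMO_API_URL", url), ("KUMO_API_KEY", api_key))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return url, api_key


# In the generated file, these lines would replace the two empty strings:
# API_URL, API_KEY = load_kumo_credentials()
# kumo.init(url=API_URL, api_key=API_KEY)
```

Failing before kumo.init is called makes a missing credential show up as a clear error message rather than as an authentication failure later on.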

How to use the SDK Code feature in Kumo’s UI?

For any table, graph, training job, or prediction job that you would like to re-create in the SDK, navigate to its page in your Kumo web app and select the SDK Code button from the drop-down menu. A popup containing the fully generated code should appear on the screen. Users have two options: copy the generated code and paste it into their working environment, or download it as a ready-to-use Python file.

What is included in the generated SDK file?

The generated file should contain all the imports and configs that are necessary to create an object (table or graph) or a job (training or prediction job). Every file starts with a standard header that contains all the necessary package imports and the Kumo SDK initialization.
  • Table files: connector(s) and table configs and Kumo SDK code.
  • Graph files: connector(s), table(s), and graph configs and Kumo SDK code.
  • Training job files: connector(s), table(s), graph configs, training table job, and training job configs and Kumo SDK code. This includes the complete (custom) model plan configuration.
  • Prediction job files: connector(s), table(s), graph configs, training table, training, prediction table, and prediction job configs and Kumo SDK code. Using a different connector for the prediction export is also supported.

Will my configuration be preserved in the generated SDK file?

Yes, all the custom configurations for any type of object or job should be preserved. For example, if the user chooses to change num_experiments in their model plan, the changes introduced will be preserved in the generated file. These object configs are represented as Python dictionaries.
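
Because these configs are plain Python dictionaries, they can be inspected or adjusted before the SDK objects are built from them. The following is a minimal sketch under stated assumptions: the dictionary is a trimmed stand-in for the table config shown earlier (the Column entries are omitted so the snippet runs without the Kumo SDK), and with_overrides and the name transactions_2024 are hypothetical, introduced only for illustration.

```python
import copy

# Trimmed stand-in for a generated table config (Column entries omitted).
transactions_config = {
    "connector_id": "quickstart",
    "table_name": "transactions",
    "file_type": "PARQUET",
    "pkey": None,
    "time_col": "t_dat",
    "end_time_col": None,
}


def with_overrides(config, **overrides):
    """Return a copy of config with selected keys replaced.

    Unknown keys are rejected so that a typo does not silently add a
    setting that nothing downstream reads.
    """
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError("Unknown config keys: " + ", ".join(unknown))
    updated = copy.deepcopy(config)  # leave the generated config untouched
    updated.update(overrides)
    return updated


# Hypothetical variant that points at a different source table.
variant = with_overrides(transactions_config, table_name="transactions_2024")
```

Working on a copy keeps the generated config available as a reference for what the UI setup originally contained.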