Once you’ve connected your source tables and applied any necessary transformations, you can construct a Graph consisting of Table objects. A Kumo Graph represents a connected set of Tables. Each Table fully specifies the metadata of its SourceTable that is relevant for modeling: the selected source columns, each column's data type and semantic type, and relational constraint information.

Creating Tables from Source

A Table can be constructed from a SourceTable in multiple ways. The simplest approach is to call from_source_table():
# NOTE: if `columns` is not specified, all source columns are included.
customer = kumo.Table.from_source_table(
    source_table=customer_src,
    primary_key='CustomerID',
).infer_metadata()

transaction = kumo.Table.from_source_table(
    source_table=transaction_src,
    time_column='InvoiceDate',
).infer_metadata()

Inspecting Table Metadata

To verify the metadata that was inferred, print the metadata property:
>>> print(customer.metadata)

+----+-----------+---------+---------+------------------+------------------+----------------------+
|    | name      | dtype   | stype   | is_primary_key   | is_time_column   | is_end_time_column   |
|----+-----------+---------+---------+------------------+------------------+----------------------|
|  0 | StockCode | string  | ID      | True             | False            | False                |
+----+-----------+---------+---------+------------------+------------------+----------------------+
If an inferred column property is not what you want, you can edit it by accessing the column by name on the table.

Building Tables from Scratch

You can also specify the table from the ground up, optionally inferring metadata for any columns that are not fully specified:
stock = kumo.Table(
    source_table=stock_src,
    columns=dict(name='StockCode', stype='ID'),  # will infer dtype='string'
    primary_key='StockCode',
).infer_metadata()

# Validate the table's correctness:
stock.validate(verbose=True)

Modifying Table Metadata

No matter how you create your table, Table exposes methods to inspect and adjust metadata:
# Set and access a data type for a column ("StockCode") in the stock table:
stock.column("StockCode").dtype = "string"
print(stock["StockCode"].dtype)
Note that column() returns a Column object, which holds the metadata for a single column of the table. You can also change the primary key and add or remove columns:
# Set the primary key:
table.primary_key = 'new_primary_key'

# Unset (remove) the primary key:
table.primary_key = None

# Check if a table has a primary key:
print(f"Table has primary key? {table.has_primary_key()}")

# Add a new column:
table.add_column(name="col", dtype="int")

# Edit the column's semantic type:
table.column("col").stype = "categorical"

# Remove a column:
table.remove_column("col")

Table Identity

Tables do not have names in the Kumo SDK; a table is fully specified by its configuration in code. Two notebooks that use the same table configuration refer to the same table object in the Kumo backend. Editing a table creates a new object in the backend, independent of other tables.
We encourage users to fully specify their tables in production code to avoid unexpected re-inference of metadata.
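
The identity rule above can be illustrated with a minimal sketch. It does not show how the Kumo backend actually identifies tables. It only demonstrates the general idea of an identifier derived purely from configuration. The `table_fingerprint` helper and the layout of the configuration dictionary are hypothetical and are not part of the Kumo SDK.

```python
import hashlib
import json


def table_fingerprint(config: dict) -> str:
    """Return a stable identifier derived only from a table configuration.

    Hypothetical helper for illustration; not part of the Kumo SDK.
    """
    # Serialize with sorted keys so that dict ordering does not affect the hash.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical configuration layout, assumed for this sketch only.
stock_config = {
    "source_table": "stock_src",
    "primary_key": "StockCode",
    "columns": [{"name": "StockCode", "dtype": "string", "stype": "ID"}],
}

# The same configuration written out in a second notebook
# produces the same identifier.
same_config = json.loads(json.dumps(stock_config))
assert table_fingerprint(stock_config) == table_fingerprint(same_config)

# Editing the configuration produces a different identifier,
# which corresponds to a new, independent object.
edited_config = {**stock_config, "primary_key": None}
assert table_fingerprint(edited_config) != table_fingerprint(stock_config)
```

Under this model, any metadata left to inference becomes part of the configuration only after inference runs, so a change in the inferred values would silently point to a different object. Fully specifying tables in production code avoids this.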