Graph is a fundamental concept in the SDK. It links multiple Table objects (each created from a SourceTable) into a relational schema that represents the relationships between tables for a specific business problem. Graphs are used as input to predictive queries and training jobs.
Column
The metadata for a single column in a Table is represented by a Column object. Columns can be fetched from a table with Table.column() and modified by adjusting their properties.
Related: Dtype, Stype.
Column
The name of this column.
The semantic type. Can be specified as a string; see Stype for valid values.
The data type. Can be specified as a string; see Dtype for valid values.
For timestamp columns, the format string used to parse the value. Inferred by Kumo if not specified.
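As a rough illustration of the fields described above, the sketch below models column metadata with a plain Python dataclass. ColumnSketch and its field names are hypothetical stand-ins rather than the SDK's Column class, the type strings are example values only, and the strptime-style format string is an assumption: the text above does not say which format syntax Kumo expects.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ColumnSketch:
    """Hypothetical stand-in for column metadata; not the SDK's Column."""
    name: str
    stype: Optional[str] = None             # semantic type, given as a string
    dtype: Optional[str] = None             # data type, given as a string
    timestamp_format: Optional[str] = None  # assumed strptime-style pattern

    def parse_timestamp(self, raw: str) -> datetime:
        # Parsing is only possible once a format has been set explicitly.
        if self.timestamp_format is None:
            raise ValueError(f"no timestamp format set for column {self.name!r}")
        return datetime.strptime(raw, self.timestamp_format)


# Example values only; see Stype and Dtype for the values Kumo accepts.
col = ColumnSketch(name="signup_date", stype="timestamp", dtype="date")

# Metadata is adjusted by assigning to properties after creation.
col.timestamp_format = "%Y-%m-%d"
parsed = col.parse_timestamp("2024-03-15")
```

The point of the sketch is only that a column carries a name, two kinds of type information, and an optional parsing hint, all of which can be changed after the object exists.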
Table
A Table represents the full metadata for a table in a Kumo Graph. Unlike a SourceTable (which is just a reference to data behind a connector), a Table specifies selected columns, their data and semantic types, and relational constraint information (primary key, time column, end time column).
Table
The source table this Kumo table is created from.
The columns to include. Defaults to all columns from the source table. Each column must have its dtype and stype specified.
The primary key column, if present. Must exist in columns.
The time column, if present. Must exist in columns.
The end time column, if present. Must exist in columns.
from_source_table() staticmethod
Convenience constructor that creates a Table from a SourceTable.
The source table to create from.
Column names to include. All columns are included if not specified.
The primary key column name.
The time column name.
The end time column name.
Table
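The constraint that the primary key, time column, and end time column must each exist in the selected columns can be sketched as follows. TableSketch and from_source_columns are hypothetical stand-ins written in plain Python, not the SDK's Table or from_source_table(), and the argument names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableSketch:
    """Hypothetical stand-in for table metadata; not the SDK's Table."""
    columns: List[str]
    primary_key: Optional[str] = None
    time_column: Optional[str] = None
    end_time_column: Optional[str] = None

    def __post_init__(self) -> None:
        # Each special column, when given, must be one of the selected columns.
        for role in ("primary_key", "time_column", "end_time_column"):
            name = getattr(self, role)
            if name is not None and name not in self.columns:
                raise ValueError(f"{role} {name!r} is not in columns")

    @staticmethod
    def from_source_columns(source_columns, column_names=None, **roles):
        # Include every source column unless a subset is requested.
        if column_names is None:
            selected = list(source_columns)
        else:
            selected = list(column_names)
        return TableSketch(columns=selected, **roles)


source_columns = ["user_id", "signup_ts", "country"]
users = TableSketch.from_source_columns(
    source_columns, primary_key="user_id", time_column="signup_ts"
)

# Selecting only "country" drops the column the primary key refers to.
try:
    TableSketch.from_source_columns(
        source_columns, column_names=["country"], primary_key="user_id"
    )
    rejected = False
except ValueError:
    rejected = True
```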
columns property
Returns List[Column] — All columns in this table.
primary_key property
Returns Optional[Column] — The primary key column, or None.
time_column property
Returns Optional[Column] — The time column, or None.
end_time_column property
Returns Optional[Column] — The end time column, or None.
column()
Returns the named column.
The column name.
Column
has_column()
The column name.
bool — True if the column exists in this table.
add_column()
Adds a Column to this table.
remove_column()
The column name to remove.
Table
infer_metadata()
Infers any missing dtype and stype values from the source table.
Whether to print progress output.
Table
validate()
Validates the table configuration for use with Kumo.
Whether to print validation output.
Table
get_stats()
Fetches column statistics from a snapshot of this table.
The snapshot wait level.
pd.DataFrame
save()
Saves the table to Kumo and returns its ID.
Optional name to save the table under.
str
load() classmethod
Loads a previously saved table.
The table ID or named template.
Table
print_definition()
Prints the full table definition with placeholder names.
Graph
A Graph represents a full relational schema over a set of Table objects, including the primary key / foreign key relationships between them. Once a graph is created, you are ready to write a PredictiveQuery and train a model.
Graph
Tables in the graph, keyed by unique table name.
Foreign key relationships between tables. Each edge specifies (src_table, fkey, dst_table).
id property
Returns str — A unique identifier derived from the graph’s schema. Two graphs with any difference in their tables or columns are guaranteed to have distinct IDs.
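One common way to derive an identifier from a schema is to hash a canonical serialization of it. The sketch below shows that general technique only; how Kumo actually computes the graph ID is not stated here, and schema_id and the dictionary layout are hypothetical. A cryptographic hash also makes collisions extremely unlikely rather than strictly impossible.

```python
import hashlib
import json


def schema_id(tables: dict, edges: list) -> str:
    """Hash a canonical JSON encoding of the schema (illustrative only)."""
    canonical = json.dumps(
        {"tables": tables, "edges": sorted(edges)},
        sort_keys=True,             # key order must not affect the result
        separators=(",", ":"),      # fixed whitespace for a stable encoding
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up schema: table name -> {column name -> semantic type string}.
tables = {
    "users": {"user_id": "ID", "country": "categorical"},
    "orders": {"order_id": "ID", "user_id": "ID", "amount": "numerical"},
}
edges = [["orders", "user_id", "users"]]

base_id = schema_id(tables, edges)

# Changing a single column's type yields a different identifier.
changed = {**tables, "users": {"user_id": "ID", "country": "text"}}
changed_id = schema_id(changed, edges)
```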
snapshot_id property
Returns Optional[GraphSnapshotID] — The snapshot ID, if available.
tables property
Returns Dict[str, Table]
edges property
Returns List[Edge]
table()
The table name.
Table
has_table()
The table name.
bool
add_table()
The name to register the table under.
The table to add.
remove_table()
The name of the table to remove.
link()
Adds a foreign key edge to the graph.
The edge to add.
infer_metadata()
Infers missing metadata in all tables in the graph.
Whether to print progress output.
Graph
infer_links()
Automatically detects foreign key relationships between tables.
Whether to print progress output.
Graph
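To give a feel for what automatic link detection can involve, the sketch below proposes an edge wherever a column in one table shares its name with another table's primary key. This is a deliberately naive heuristic chosen for illustration; it is not a description of how infer_links() works, and EdgeSketch and infer_links_by_name are hypothetical names.

```python
from typing import Dict, List, NamedTuple, Optional


class EdgeSketch(NamedTuple):
    """Hypothetical stand-in for an edge; not the SDK's Edge."""
    src_table: str
    fkey: str
    dst_table: str


def infer_links_by_name(
    columns: Dict[str, List[str]],
    primary_keys: Dict[str, Optional[str]],
) -> List[EdgeSketch]:
    """Propose an edge wherever a column name equals another table's primary key."""
    found = []
    for dst, pkey in primary_keys.items():
        if pkey is None:
            continue  # a table without a primary key cannot be a destination
        for src, cols in columns.items():
            if src != dst and pkey in cols:
                found.append(EdgeSketch(src, pkey, dst))
    return sorted(found)


columns = {
    "users": ["user_id", "country"],
    "products": ["product_id", "price"],
    "orders": ["order_id", "user_id", "product_id", "amount"],
}
primary_keys = {"users": "user_id", "products": "product_id", "orders": "order_id"}

links = infer_links_by_name(columns, primary_keys)
```

A name-matching rule like this misses keys that are named differently across tables and can propose spurious links, which is one reason to review inferred edges before relying on them.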
validate()
Validates the graph structure before use with a predictive query.
Whether to print validation output.
Graph
get_table_stats()
Fetches statistics for all tables in the graph.
The snapshot wait level.
Dict[str, pd.DataFrame]
get_edge_stats()
Returns GraphHealthStats — Health statistics for each edge in the graph.
visualize()
Exports the graph structure as a Graphviz diagram.
Output path for the diagram.
Whether to include column names in the diagram.
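For readers unfamiliar with Graphviz, the sketch below builds a DOT description of a small schema in plain Python, with an option to list column names in each node. It illustrates the kind of diagram source Graphviz consumes; the exact output of visualize() is not specified here, and to_dot and show_columns are hypothetical names.

```python
from typing import Dict, List, Tuple


def to_dot(
    tables: Dict[str, List[str]],
    edges: List[Tuple[str, str, str]],
    show_columns: bool = False,
) -> str:
    """Render tables as box nodes and foreign keys as labeled arrows."""
    lines = ["digraph schema {", "  node [shape=box];"]
    for name in sorted(tables):
        label = name
        if show_columns:
            # A literal backslash-n is a line break inside a DOT label.
            label += "\\n" + "\\n".join(tables[name])
        lines.append(f'  "{name}" [label="{label}"];')
    for src, fkey, dst in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{fkey}"];')
    lines.append("}")
    return "\n".join(lines)


tables = {"users": ["user_id", "country"], "orders": ["order_id", "user_id"]}
edges = [("orders", "user_id", "users")]

dot = to_dot(tables, edges)
dot_with_columns = to_dot(tables, edges, show_columns=True)
```

The resulting string can be written to a file and rendered with the Graphviz dot command-line tool.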
save()
Saves the graph to Kumo.
Optional name for the saved graph.
Whether to skip validation before saving.
str
load() classmethod
Loads a previously saved graph.
The graph ID or named template.
Graph
print_definition()
Prints the full graph definition with placeholder names.
Edge
Represents a foreign key relationship between two tables. Edges are always bidirectional within Kumo.
The source table name. This table must have a foreign key column fkey that links to the destination table’s primary key.
The name of the foreign key column in the source table.
The destination table name. Must have a primary key that the source table’s foreign key references.
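The requirements on an edge's three fields can be checked mechanically. The sketch below is a plain-Python illustration under the assumption that a schema is given as simple dictionaries; EdgeSketch and edge_problems are hypothetical and do not reproduce the SDK's own validation.

```python
from typing import Dict, List, NamedTuple, Optional


class EdgeSketch(NamedTuple):
    """Hypothetical stand-in for an edge; not the SDK's Edge."""
    src_table: str
    fkey: str
    dst_table: str


def edge_problems(
    edge: EdgeSketch,
    columns: Dict[str, List[str]],
    primary_keys: Dict[str, Optional[str]],
) -> List[str]:
    """Return human-readable problems with one edge (empty list if none)."""
    problems = []
    # The source table must exist and contain the foreign key column.
    if edge.src_table not in columns:
        problems.append(f"unknown source table {edge.src_table!r}")
    elif edge.fkey not in columns[edge.src_table]:
        problems.append(f"{edge.src_table!r} has no column {edge.fkey!r}")
    # The destination table must exist and have a primary key to reference.
    if edge.dst_table not in columns:
        problems.append(f"unknown destination table {edge.dst_table!r}")
    elif primary_keys.get(edge.dst_table) is None:
        problems.append(f"{edge.dst_table!r} has no primary key")
    return problems


columns = {"orders": ["order_id", "user_id"], "users": ["user_id"], "events": ["ts"]}
primary_keys = {"orders": "order_id", "users": "user_id", "events": None}

ok = edge_problems(EdgeSketch("orders", "user_id", "users"), columns, primary_keys)
bad = edge_problems(EdgeSketch("orders", "event_id", "events"), columns, primary_keys)
```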
GraphHealthStats
Contains edge-level health statistics computed as part of a Graph snapshot. Index with an Edge object to retrieve per-edge statistics.
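Indexing by an edge object works naturally when edges compare by value, so that an equal edge built elsewhere retrieves the same entry. The sketch below shows that pattern with hypothetical stand-ins (EdgeSketch, HealthStatsSketch); the statistic name and number are made up for illustration and say nothing about what GraphHealthStats actually reports.

```python
from typing import Dict, NamedTuple


class EdgeSketch(NamedTuple):
    """Hypothetical stand-in for an edge; hashable and compared by value."""
    src_table: str
    fkey: str
    dst_table: str


class HealthStatsSketch:
    """Hypothetical container keyed by edge; not the SDK's GraphHealthStats."""

    def __init__(self, per_edge: Dict[EdgeSketch, Dict[str, float]]) -> None:
        self._per_edge = dict(per_edge)

    def __getitem__(self, edge: EdgeSketch) -> Dict[str, float]:
        # Lookup relies on the edge's value-based equality and hash.
        return self._per_edge[edge]


edge = EdgeSketch("orders", "user_id", "users")
stats = HealthStatsSketch({edge: {"example_stat": 0.5}})  # made-up value

# A separately constructed but equal edge retrieves the same statistics.
same_edge = EdgeSketch(src_table="orders", fkey="user_id", dst_table="users")
value = stats[same_edge]["example_stat"]
```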