KumoRFM can connect directly to DuckDB databases, automatically inferring table metadata and relationships from the schema.

Installation

The DuckDB backend requires the ADBC DuckDB driver:
pip install "kumoai[duckdb]"

Quick Start

Connect to a file-based DuckDB database:
import kumoai.rfm as rfm

graph = rfm.Graph.from_duckdb(connection="my_database.duckdb")
This will:
  1. Connect to the DuckDB database
  2. Discover all non-temporary, non-internal tables automatically
  3. Infer column metadata (data types, semantic types, primary keys, time columns)
  4. Detect foreign key relationships
  5. Print a summary of the inferred metadata and links

Specifying Tables

Control which tables to include and customize their configuration:
graph = rfm.Graph.from_duckdb(
    connection="my_database.duckdb",
    tables=[
        "USERS",
        {"name": "ORDERS", "source_name": "ORDERS_SNAPSHOT"},
        {"name": "ITEMS", "primary_key": "ITEM_ID"},
    ],
)
Table configuration options:
| Key | Description | Required |
| --- | --- | --- |
| name | The table name used in PQL queries | Yes |
| source_name | The actual table name in the database (if different from name) | No |
| primary_key | Override the auto-detected primary key | No |
| time_column | The name of the time column for this table | No |
| end_time_column | The name of the end time column for this table | No |
| columns | Selected source columns or column specs, including expression columns | No |
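Because the `tables` argument mixes plain strings and dicts, a misspelled option key is easy to miss. The following is a minimal sketch of a helper that builds entries and rejects keys not listed in the table above. The name `table_spec` is hypothetical and not part of `kumoai`; it only assembles the values that are then passed to `from_duckdb`.

```python
from typing import Any, Dict, Union

# Option keys taken from the table above; "name" is handled separately.
ALLOWED_OPTIONS = {
    "source_name",
    "primary_key",
    "time_column",
    "end_time_column",
    "columns",
}


def table_spec(name: str, **options: Any) -> Union[str, Dict[str, Any]]:
    """Hypothetical helper: build one entry for the `tables` argument."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"Unknown table option(s): {sorted(unknown)}")
    # Drop options left as None so only explicit overrides are passed on.
    overrides = {k: v for k, v in options.items() if v is not None}
    # A bare string is enough when nothing is overridden.
    return {"name": name, **overrides} if overrides else name


tables = [
    table_spec("USERS"),
    table_spec("ORDERS", source_name="ORDERS_SNAPSHOT"),
    table_spec("ITEMS", primary_key="ITEM_ID"),
]
# The list can then be passed on unchanged:
# graph = rfm.Graph.from_duckdb(connection="my_database.duckdb", tables=tables)
```

The resulting list is identical to the hand-written one in the example above, so the helper adds validation without changing what the library receives.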

Connection Options

In-memory database:
from kumoai.rfm.backend.duckdb import connect

conn = connect()  # in-memory DuckDB database
graph = rfm.Graph.from_duckdb(connection=conn)
From a file path:
graph = rfm.Graph.from_duckdb(connection="path/to/database.duckdb")
From an existing ADBC connection:
from kumoai.rfm.backend.duckdb import connect

conn = connect("path/to/database.duckdb")
graph = rfm.Graph.from_duckdb(connection=conn)
From a connection config dict:
graph = rfm.Graph.from_duckdb(
    connection={"uri": "path/to/database.duckdb"},
)
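DuckDB normally creates a new, empty database when asked to open a file that does not exist, so a typo in the path could yield a graph with no tables instead of an error. Assuming that behavior also applies when a path is passed to `from_duckdb`, a small guard like the one below can catch the mistake early. The helper `existing_db_path` is hypothetical and uses only the standard library.

```python
from pathlib import Path


def existing_db_path(path: str) -> str:
    """Hypothetical guard: return the path only if the file already exists."""
    db_file = Path(path).expanduser()
    if not db_file.is_file():
        raise FileNotFoundError(f"No DuckDB file at {db_file}")
    return str(db_file)


# Usage (raises before any connection is attempted if the file is missing):
# graph = rfm.Graph.from_duckdb(
#     connection=existing_db_path("path/to/database.duckdb"),
# )
```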

Controlling Metadata Inference

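Disable automatic inference when constructing the graph, then run it explicitly later:
graph = rfm.Graph.from_duckdb(
    connection="my_database.duckdb",
    infer_metadata=False,  # skip automatic inference at construction
    verbose=False,  # suppress the printed summary
)

# Run inference manually once the graph is set up
graph.infer_metadata()
graph.infer_links()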

Manual Edge Specification

Override automatic link detection by providing edges explicitly:
graph = rfm.Graph.from_duckdb(
    connection="my_database.duckdb",
    edges=[
        ("ORDERS", "USER_ID", "USERS"),
        ("ORDERS", "ITEM_ID", "ITEMS"),
    ],
)
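When edges are written by hand, a misspelled table name is the most likely mistake. The sketch below checks an edge list against the table list before either is passed to `from_duckdb`. It assumes each tuple reads as (source table, foreign key column, destination table), which is how the example above appears to be laid out. `check_edges` is a hypothetical helper, not a `kumoai` function, and it does not verify that the key column exists.

```python
from typing import Dict, Iterable, List, Tuple, Union

Edge = Tuple[str, str, str]
TableEntry = Union[str, Dict[str, object]]


def check_edges(tables: Iterable[TableEntry], edges: Iterable[Edge]) -> List[str]:
    """Hypothetical pre-flight check; returns a list of problems found."""
    # Table entries are either bare names or dicts carrying a "name" key.
    names = {t if isinstance(t, str) else t["name"] for t in tables}
    problems = []
    for edge in edges:
        if len(edge) != 3:
            problems.append(f"{edge!r}: expected 3 items")
            continue
        src, _fkey, dst = edge
        for table in (src, dst):
            if table not in names:
                problems.append(f"{edge!r}: unknown table {table!r}")
    return problems


tables = ["USERS", {"name": "ORDERS"}, {"name": "ITEMS", "primary_key": "ITEM_ID"}]
edges = [("ORDERS", "USER_ID", "USERS"), ("ORDERS", "ITEM_ID", "ITEMS")]
problems = check_edges(tables, edges)  # empty list when every table is declared
```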