Documentation Index

Fetch the complete documentation index at: https://kumo.ai/docs/llms.txt

Use this file to discover all available pages before exploring further.
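An llms.txt index is a Markdown file listing documentation pages as links. As a minimal sketch of how the index could be consumed programmatically, the snippet below extracts (title, URL) pairs from llms.txt-style content; the sample text is illustrative only, not the actual contents of https://kumo.ai/docs/llms.txt.

```python
import re

# Illustrative sample in the llms.txt Markdown-link style; the real index
# lives at https://kumo.ai/docs/llms.txt and will differ.
SAMPLE_INDEX = """\
# Kumo Documentation

## Docs
- [Deployment Options](https://kumo.ai/docs/deployment): How Kumo can be deployed
- [Virtual Private Cloud](https://kumo.ai/docs/vpc): Running Kumo in your VPC
"""

# Matches Markdown links of the form [Title](https://...)
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)]+)\)")

def parse_index(text: str) -> list[tuple[str, str]]:
    """Return (title, url) pairs for every Markdown link in the index."""
    return LINK_RE.findall(text)

pages = parse_index(SAMPLE_INDEX)
```

In practice you would fetch the live file over HTTPS first and feed its text to `parse_index` to enumerate the available pages.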

Kumo can be deployed in three ways:

- **Kumo SaaS** provides an Apache Spark-based data platform, an expanded choice of supported data warehouses, earlier feature access, quicker bug fixes, and easier access to Kumo's enterprise support tier.
- **Data Warehouse Native** lets you bring your own data platform (Snowflake or Databricks) and access Kumo's highly accurate predictions without raw data being stored or materialized outside of your boundary.
- **Virtual Private Cloud** runs Kumo as a self-contained Kubernetes deployment inside your VPC/VNet, using your compute, storage, network, identity provider, and data source permissions. See Virtual Private Cloud for details.

For proof-of-concept evaluations, Kumo also offers a Simplified VPC Deployment: a self-contained machine image that can be installed quickly in AWS, Azure, or Google Cloud, with no external network egress and Parquet-only data access. Regardless of deployment option, your data is always stored in your environment, and you can choose the option that best aligns with your organization's data governance policies and preferences. The following table provides an overview of the deployment options:
| | SaaS | Virtual Private Cloud | Databricks | Snowflake Native App |
|---|---|---|---|---|
| Customer data warehouse | Amazon Redshift, AWS S3, Google Cloud BigQuery, Databricks, Snowflake | Snowflake, Databricks, BigQuery, S3, other lake/warehouse sources | Databricks | Snowflake |
| Data storage | Your own | Your own object storage and KMS | Your own | Snowflake objects (e.g., tables, views) in your own account |
| Data cache | Kumo-owned and managed | Customer storage in your VPC/VNet (bucket/blob) | Customer's Databricks Unity Catalog Volume storage | Snowflake stage in your own account |
| Data platform | Kumo-managed Apache Spark | Customer-owned clusters (EKS/AKS/GKE) connecting to your warehouses/lakes | Databricks Spark in your own Databricks account | Snowpark DataFrame in your own Snowflake account via Snowpark Container Services |
| ML compute | Kumo-owned and managed compute | Customer-managed autoscaling GPU and high-memory nodes inside your VPC/VNet | Kumo-owned and managed compute | Kumo services in Snowpark containers |
| PII and sensitive data handling (GDPR compliance) | Your own retention policy | Kumo services and artifacts run in your VPC/VNet; source access is governed by customer-owned permissions | No data on disk in a Kumo-owned environment; data leaving your environment is transformed and encoded | No data on disk in a Kumo-owned environment; no data leaves your environment |
| Access to private preview features | Yes | Yes | Yes | No |
| Access to Enterprise support tier | Yes | Yes (via your VDI/bastion; no persistent Kumo access required) | Yes | Kumo requires temporary elevated access and logs |