Overview

Kumo’s Native App uses Snowflake resources depending on the size of the data and the complexity of the models being trained. These resources are primarily consumed in two ways:

  1. Running containers in Snowpark Container Services (SPCS).

  2. Executing Snowpark queries for data processing using Snowflake warehouses.

The total resource usage in Snowflake is determined by the following factors (a rough cost model combining them is sketched after this list):

  • The number of hours the app is running.

  • The selected hardware and warehouse size configuration.

  • The number of training or prediction jobs executed.

  • The size of the input data for each job.
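At a high level these factors combine additively: SPCS credits scale with the hours the app runs, while warehouse credits scale with the number of jobs and the size of their input data. The Python sketch below is illustrative only; estimate_monthly_credits is a hypothetical helper, and the per-hour and per-job rates are assumptions taken from the estimate tables later on this page.

```python
# Illustrative sketch of the overall cost model; not an official Kumo calculator.
# The rates passed in are assumptions taken from the estimate tables below.

def estimate_monthly_credits(
    app_hours_per_month: float,        # hours the app (SPCS container) is running
    spcs_credits_per_hour: float,      # e.g. ~2.68 for GPU_NV_M, ~14.12 for GPU_NV_L
    jobs_per_month: int,               # training + prediction jobs executed
    warehouse_credits_per_job: float,  # depends on input data size and warehouse size
) -> float:
    """Rough total Snowflake credits consumed by the app in one month."""
    spcs_credits = app_hours_per_month * spcs_credits_per_hour
    warehouse_credits = jobs_per_month * warehouse_credits_per_job
    return spcs_credits + warehouse_credits
```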

SPCS Resource Usage

Kumo runs two key components on Snowpark Container Services:

  • Kumo Control Plane (UI and API management).

  • Kumo AI Engine (model training and inference).

How Resource Consumption is Measured: As of 11-Feb-2025, Snowflake charges based on:

  • Hardware configuration (CPU, GPU, and memory allocated to the compute instance used).

  • Duration for which the compute instance runs.

Compute Resources:

  • Both the Control Plane and AI Engine run on a single GPU compute instance in SPCS.

  • As of 11-Feb-2025, the available instance types are GPU_NV_M and GPU_NV_L, selected at app launch.

Managing Resources Efficiently:

  • The compute resource is used continuously while the app is running.

  • The app can be suspended to save resources without losing data.

  • Instructions on suspending the app are available here.

Estimated Credit Usage: The table below provides an estimate of Snowflake credits consumed per hour based on the selected container size. The most up-to-date numbers can be found in the official Snowflake Credit Consumption Table.

SPCS Container    Number of Rows    Snowflake Credits Per Hour
GPU_NV_M          < 250M            2.68
GPU_NV_L          < 1B              14.12

Warehouse Resource Usage

To process data for training and predictions, Kumo executes Snowpark queries using Snowflake virtual warehouses. This includes:

  • Reading input tables from Snowflake.

  • Preparing data for training and prediction.

  • Generating artifacts that are transferred to the GPU.

Factors Affecting Credit Usage:

  • The shape and structure of the input data.

  • The complexity of the queries being executed.

  • The configuration of the job (e.g., a job that processes an expensive Snowflake view will use extra compute resources).

  • Whether the Graph Snapshot feature is used, which can reduce resource usage across multiple jobs.

The following table provides an estimate of the credits consumed per job based on data size; a small lookup sketch follows the table.

Data Size     Number of Rows    Warehouse Size    Snowflake Credits Per Job
1-9GB         1-24M             S                 1-3
10-90GB       25-240M           M                 3-9
100-250GB     250M-1B           L                 10-20
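For budgeting scripts, the mapping in the table above can be expressed as a simple lookup. The sketch below is illustrative; warehouse_estimate is a hypothetical helper, and the thresholds and credit ranges are the estimates from the table, not guarantees of actual consumption.

```python
# Lookup based on the estimate table above; thresholds and credit ranges are
# rough estimates, not guarantees of actual consumption.

def warehouse_estimate(total_rows: int) -> tuple[str, tuple[int, int]]:
    """Return (warehouse size, (min, max) estimated credits per job)."""
    if total_rows < 25_000_000:        # roughly 1-9 GB of input data
        return "S", (1, 3)
    if total_rows < 250_000_000:       # roughly 10-90 GB
        return "M", (3, 9)
    if total_rows <= 1_000_000_000:    # roughly 100-250 GB
        return "L", (10, 20)
    raise ValueError("row count is above the range covered by the estimate table")
```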

Other Resource Usage

Storage:

  • Kumo stores various types of data within Snowflake, including:

    • Temporary artifacts from training.

    • Metadata for tables, graphs, and jobs.

    • Model binaries.

  • A typical deployment with around 100 jobs is expected to consume less than 1TB of data.

  • This would be billed at your account’s standard rate for storage.

Example Calculation

Scenario: Weekly Email Recommendations

  • Data Size: 4 tables (users, products, purchases, browsing history) with a total of 100M rows (50GB).

  • Training Frequency: Once per month.

  • Prediction Frequency: Once per week.

Estimated Monthly Resource Usage:

  • SPCS Container: GPU_NV_M

  • App Run Time: ~12 hours per week

  • SPCS Credits per Month: 12 hours/week * 4 weeks * 2.68 credits/hour = 128.64 credits

  • Snowpark Credits per Job: 6

  • Jobs per Month: 5 (4 weekly prediction jobs + 1 monthly training job)

  • Snowpark Credits per Month: 5 * 6 = 30 credits

  • Total Monthly Credit Usage: 128.64 + 30 = 158.64 credits (the arithmetic is reproduced in the sketch below)

  • Storage Used: ~500GB
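The monthly total can be reproduced in a few lines of Python; the values below are the assumptions used in this example, not measured figures.

```python
# Reproducing the example arithmetic (estimates only).
spcs_credits = 12 * 4 * 2.68       # 12 h/week * 4 weeks * 2.68 credits/h = 128.64
warehouse_credits = 5 * 6          # 5 jobs/month * 6 credits/job = 30
total_credits = spcs_credits + warehouse_credits
print(round(total_credits, 2))     # 158.64
```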

Upcoming Feature: Kumo Autoscaling on SPCS

In Q1 2025, Kumo will introduce dynamic autoscaling for GPUs in SPCS. This will:

  • Automatically scale GPU resources up or down based on demand.

  • Eliminate the need for manual app launch and suspension.

  • Optimize infrastructure usage, particularly for large workloads.

We expect that this change will simplify management and reduce infrastructure resource requirements.

Single Host Architecture

Autoscaling Architecture