Snowflake
Kumo can connect directly to your Snowflake data warehouse. This connection method keeps all data under your control and preserves the integrity and security of your environment, and it is the recommended approach for running Kumo in production.
- To set up a new Snowflake direct connection, click Connectors in the left-hand column, then click the Configure Connector button on the “Connectors” page.
- On the “New Connector” window, provide a name for your new Snowflake connector and click the Snowflake button. The configuration settings for connecting to your Snowflake data warehouse will appear below.
- Provide the following details in the Snowflake Warehouse section to connect your Snowflake data warehouse:
- Account Identifier - Uniquely identifies your Snowflake account. Provide it as ORGNAME-ACCOUNT_NAME. The ORGNAME and ACCOUNT_NAME for your Snowflake account can be retrieved using the instructions here.
- Database - The Snowflake database where the input relational data exists. The authenticating user must have a DEFAULT_ROLE with USAGE privileges on this database.
- Warehouse - The warehouse that will be used to read and process data in Snowflake. The authenticating user must have a DEFAULT_ROLE with USAGE privileges on this warehouse.
- Schema Name - The schema under the database from which the input tables are loaded. Make sure that the authenticating user has a role with USAGE and SELECT privileges on the schema. If predictions will be written back using this connector, the user’s DEFAULT_ROLE should also have the CREATE TABLE privilege on this schema.
- User - The username that you want to use to connect to Snowflake.
- Password - The password of the connecting user.
Click on the Done button to save your new Snowflake connector.
Using key-pair authentication
For Snowflake accounts with MFA or SSO enabled, Kumo requires the use of public key authentication, either with a regular user account that has a key-pair setup in addition to a username and password, or with a “service” account which does not have a username and password login (see CREATE USER for creating service accounts).
To use Kumo with key-pair authentication, first generate a private key (encrypted or unencrypted) and then a corresponding public key. See the Snowflake documentation on how to do this with OpenSSL. Next, assign the public key to an existing Snowflake user with the following command (excluding the PEM header and footer delimiters from the key):
ALTER USER example_user SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...';
Note that this must be done by the owner of the user or by a user with the SECURITYADMIN role or higher. Alternatively, create a new user with the RSA_PUBLIC_KEY field set (see CREATE USER).
When a new user or service user is created with the key-pair authentication configured, this user’s role must also be assigned the necessary privileges to access the data (see below for the minimum required privileges to connect data in Kumo).
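For example, a service user can be created with the public key and its role assigned in one pass. The following is a minimal sketch only; the user name kumo_service_user, the role user_role, and the truncated key value are illustrative:

```sql
-- Illustrative sketch: create a service user that authenticates only via key pair
CREATE USER kumo_service_user
  RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...'  -- your public key, without the PEM delimiters
  DEFAULT_ROLE = user_role             -- the role that will hold the privileges listed below
  TYPE = SERVICE;                      -- service users have no password login

-- The role must also be granted to the user
GRANT ROLE user_role TO USER kumo_service_user;
```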
Now, instead of “User” and “Password” credentials, key-pair authentication can be used to configure a Snowflake connector in Kumo. On the New Connector window, toggle on “Key-Pair Authentication” and enter the “User” and “Private key”. If the “Private key” is encrypted, enter the “Key Passphrase”; otherwise, the passphrase can be left blank. The private key should be entered with line breaks preserved.
Click on the Done button to save your new Snowflake connector.
Minimum privileges required for the connecting user
When connecting to Snowflake, the user’s DEFAULT_ROLE
is used by Kumo. To check the default role of the user, run DESCRIBE USER <username>
in Snowflake.
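If the user’s default role is not the role that will hold the privileges described below, it can be changed before creating the connector. A minimal sketch, with illustrative user and role names:

```sql
-- Check the current default role, then change it if needed (names are illustrative)
DESCRIBE USER kumo_user;
ALTER USER kumo_user SET DEFAULT_ROLE = user_role;
```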
The following is the minimum set of privileges required to create a Snowflake connector that reads data into Kumo and writes predictions back. The default role of the user (user_role in the commands below) used to create the Snowflake connector must be granted these privileges to successfully connect your Snowflake data to Kumo.
The commands below assume you are connecting a database called customer_db and a schema customer_schema in that database, using a warehouse customer_warehouse.
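A minimal sketch of these grants, based on the privileges listed above and the illustrative names customer_db, customer_schema, customer_warehouse, and user_role (replace them with your own):

```sql
-- Read access: USAGE on the warehouse, database, and schema, plus SELECT on the input tables
GRANT USAGE ON WAREHOUSE customer_warehouse TO ROLE user_role;
GRANT USAGE ON DATABASE customer_db TO ROLE user_role;
GRANT USAGE ON SCHEMA customer_db.customer_schema TO ROLE user_role;
GRANT SELECT ON ALL TABLES IN SCHEMA customer_db.customer_schema TO ROLE user_role;

-- Write-back: only required if predictions are written back through this connector
GRANT CREATE TABLE ON SCHEMA customer_db.customer_schema TO ROLE user_role;
```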
VPN-Protected Snowflake Instances
If your Snowflake instance is behind a VPN, you will need to add the Kumo Cloud Network gateway to your allowlist. More information can be found here.
Snowflake Data Warehouse Sizing
Kumo recommends the following warehouse size guidelines, based on the total data size of your largest table:
| Largest Table Size | Warehouse Size (SaaS) |
| --- | --- |
| Up to 10 GB | Small |
| Up to 50 GB | Large |
| Up to 100 GB | Large |
| Up to 1 TB | 4x-Large |
Connecting via Snowflake Secure Data Sharing
Before initiating the sharing process with Kumo, please ensure the following:
1. Provider Sharing is Enabled
- Log into your Snowflake account.
- Navigate to Account > Policies.
- Ensure Provider Sharing is set to Enabled. If not, activate it.
For more about Secure Data Sharing, please see Snowflake’s official documentation.
2. Identify Your Snowflake Account’s Region
Data sharing is most straightforward when provider and consumer accounts are in the same Snowflake region. If they are in different regions, the data provider needs to use Snowflake’s “Data Replication” to replicate the data to the region where the consumer account resides before sharing. Kumo does not recommend Data Replication. Kumo has accounts in most of the global regions.
Access the Snowflake web interface; your account’s region is displayed in the top right corner, next to the account name (e.g., us-west-2). Remember to share with Kumo’s account in the corresponding region, because a secure share is only allowed between accounts in the same cloud and region.
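The region can also be checked from a SQL worksheet:

```sql
-- Returns the region of the current account, e.g. AWS_US_WEST_2
SELECT CURRENT_REGION();
```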
3. Identify Your Snowflake Account Edition
The “Secure Data Sharing” feature is available in all editions of Snowflake, including Standard, Enterprise, Business Critical, and Virtual Private Snowflake (VPS). Thus, an account using the Standard edition can share data with an account using the Enterprise edition and vice versa. An account on the BUSINESS CRITICAL edition is restricted from sharing data with an account on a lower edition.
4. Role Permissions
The role used to create the share must have the necessary permissions on the objects being shared.
You will not incur any additional storage costs for the shared data. When querying your shared data, Kumo incurs the computational costs.
Creating a Snowflake Secure Share for Kumo
You can use Snowflake Secure Shares to share data with Kumo. This allows you to share the following database objects (see https://docs.snowflake.com/en/user-guide/data-sharing-intro for more details):
- Tables
- External tables
- Secure views
- Secure materialized views
1. Establishing the Share
Create a share: Use the CREATE SHARE
command to create an empty share. For example:
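A minimal sketch; the share name kumo_share is illustrative:

```sql
-- Create an empty share; objects are granted to it in the next step
CREATE SHARE kumo_share;
```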
Use the GRANT <privilege> … TO SHARE
command to add a database to the share and then selectively grant access to specific database objects (schemas, tables and secure views) to the share. For example:
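A minimal sketch, reusing the illustrative share name above along with hypothetical database, schema, and table names:

```sql
-- Grant the share read access to the database, the schema, and each object to be shared
GRANT USAGE ON DATABASE customer_db TO SHARE kumo_share;
GRANT USAGE ON SCHEMA customer_db.customer_schema TO SHARE kumo_share;
GRANT SELECT ON TABLE customer_db.customer_schema.transactions TO SHARE kumo_share;  -- repeat per table or secure view
```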
Notes:
- Only users with the CREATE SHARE privilege can create a secure share. Only the ACCOUNTADMIN role has this privilege by default, so it must be granted to the role creating the secure share for Kumo. See the Snowflake documentation for more details.
- Only secure views (not regular views) can be shared using Snowflake Secure Shares.
Do not use SELECT * when creating Snowflake views, as this can break if your source tables change. Whenever possible, you should connect your raw tables to Kumo and avoid Snowflake views.
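If a view is needed, a secure view that lists its columns explicitly avoids both issues. A sketch with hypothetical object and column names:

```sql
-- Hypothetical example: a shareable secure view with explicit columns (no SELECT *)
CREATE OR REPLACE SECURE VIEW customer_db.customer_schema.orders_v AS
SELECT
    order_id,
    customer_id,
    order_ts,
    order_amount
FROM customer_db.customer_schema.orders;
```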
2. Sharing with Kumo
Add the Kumo account (select the east or west locator based on where your account is located) to the share. Use the ALTER SHARE command to grant one or more accounts access to the share. For example:
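A minimal sketch, using the illustrative share name kumo_share and Kumo’s US West (Oregon) Enterprise account from the table below; substitute the account that matches your region and edition:

```sql
-- Grant the Kumo Snowflake account access to the share (format: ORGNAME.ACCOUNT_NAME)
ALTER SHARE kumo_share ADD ACCOUNTS = LFWGWBP.KUMOUSWEST;
```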
Refer to the region your Snowflake account is in and use the appropriate Kumo account from the table below:
| Cloud | Region | Locator | Org Name | Account Name | Edition |
| --- | --- | --- | --- | --- | --- |
| AWS | US West (Oregon) | YRB86739 | LFWGWBP | ZXA66432 | Business Critical |
| AWS | US West (Oregon) | YRB86739 | LFWGWBP | KUMOUSWEST | Enterprise |
| AWS | US East (N. Virginia) | IUB99615 | LFWGWBP | KUMO_US_EAST1 | Enterprise |
| AWS | US East (Ohio) | RR45566 | LFWGWBP | KUMO_US_EAST_OHIO | Enterprise |
| AWS | US East (N. Virginia) | CZB55260 | LFWGWBP | KUMOUSEAST1BC | Business Critical |
Once the region matches, share the established share with the respective Kumo Snowflake account.
Security Considerations
The following are key security considerations for understanding Kumo’s access mechanism to Snowflake instances.
- Database Creation on Kumo’s End: Kumo will generate a new database from the share received. It will serve as the central space for EDA and PoV operations.
- Defining Access Roles: A unique ROLE will be created within Kumo’s database to guarantee secure data access.
- Allocating Exclusive Access: Access is provided only to Kumo’s designated Point of Contact, safeguarding your data.
Monitoring Access and Activities
- Queryable Audit Trails: Use the SHARE_USAGE, QUERY_HISTORY, and LOGIN_HISTORY views within Snowflake to review Kumo’s interactions (see the example queries after this list).
- Role-Based Access Control: Monitor the unique ROLE for Kumo to ensure compliant data access.
- Data Manipulation Monitoring: Use Snowflake’s query history to document any changes made by Kumo.
- Scheduled Audits: Regularly check logs, role permissions, and shares to guarantee data safety and accuracy.
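The queries below illustrate this kind of review using the SNOWFLAKE.ACCOUNT_USAGE views; the user name KUMO_SERVICE_USER and the time window are illustrative and should be adjusted to your setup:

```sql
-- Queries run by the Kumo user over the last 7 days
SELECT user_name, role_name, query_text, start_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE user_name = 'KUMO_SERVICE_USER'
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;

-- Logins by the Kumo user over the last 7 days
SELECT user_name, event_timestamp, client_ip, is_success
FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY
WHERE user_name = 'KUMO_SERVICE_USER'
  AND event_timestamp > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY event_timestamp DESC;
```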
Contact Kumo support if you would like to export your predictions back into your Snowflake instance.
Kumo Snowflake Native App
The Kumo Snowflake Native app connects to your Snowflake data warehouse using a Snowflake Connector. Creating a Snowflake connector in Kumo requires the following steps:
Step 1: Grant the privileges required
The following is the minimum set of privileges that should be granted to the Kumo Snowflake app to successfully create a Snowflake connector.
Note
- The following commands should be run by a user who has OWNERSHIP or WITH GRANT OPTION privileges on the objects (warehouse, database, schema, and tables) being granted access to.
- The objects used in the commands (warehouse, database, and schema) must be the same as those used to create the connector (in Step 2 below).
- See the warehouse sizing table below for the size of the warehouse to use.
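A minimal sketch of these grants, assuming the Kumo app was installed under the name KUMO_APP and using illustrative object names (replace every name with your own):

```sql
-- Allow the Kumo Snowflake app to use the warehouse, database, and schema
GRANT USAGE ON WAREHOUSE customer_warehouse TO APPLICATION KUMO_APP;
GRANT USAGE ON DATABASE customer_db TO APPLICATION KUMO_APP;
GRANT USAGE ON SCHEMA customer_db.customer_schema TO APPLICATION KUMO_APP;

-- Allow the app to read the input tables (repeat for each input table)
GRANT SELECT ON TABLE customer_db.customer_schema.transactions TO APPLICATION KUMO_APP;
```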
Step 2: Create the Snowflake connector
- To set up a new Snowflake connector, click on Connectors in the left-hand column, followed by the Configure Connector button on the “Connectors” page.
- On the “Snowflake Connector” window, provide a name for your Snowflake connector and add the following connection details. The necessary privileges (described above) must be granted to the Kumo application before this step.
- Account Identifier - Uniquely identifies your Snowflake account. Provide it as ORGNAME-ACCOUNT_NAME. The ORGNAME and ACCOUNT_NAME for your Snowflake account can be retrieved using the instructions here.
- Database - The Snowflake database where the input relational data exists (same as the one in Step 1).
- Warehouse - The warehouse that will be used to read and process data in Snowflake (same as the one in Step 1).
- Schema Name - The schema under the database from which the input tables are loaded (same as the one from Step 1, but it should not be prefixed with the database name).
Click on the Done button to save your new Snowflake connector.
Snowflake Data Warehouse Sizing
Kumo recommends the following warehouse sizes for the Snowflake Native app, based on the size of your largest table:
| Largest Table Size | Snowflake Warehouse Size |
| --- | --- |
| Up to 10 GB (10s of millions of rows) | Medium |
| 10 to 100 GB (100s of millions of rows) | Large |
| Greater than 100 GB | Currently not supported in Kumo’s Snowflake Native app. |