News/ Product /

Recommending Products and Personalizing Experiences with Kumo

May 19, 2024

Kumo team

ML-powered recommendations can be useful to your business in many ways. It can come in the form of a personalized homepage showing the user highly relevant content, or showing them a curated list of products they are most likely to be interested in. You can read more about different applications of ML-powered personalization in our blog. You can also read our deep dive on why graph learning is particularly effective at solving this type of problem.

This blog will show you how to go from raw data to predictions you can use to directly build personalized experiences or enable highly effective recommendations.

Building your Graph

Your enterprise data naturally is represented by a graph – the entities include users, transactions, clicks, views, and attribute data. This graph includes the patterns and interactions between the entities that represent users, purchases, actions, and more.

Kumo lets you quickly connect your data sources to build the graph in a matter of seconds. This graph is built in a way that’s optimized for GNN learning.

To recommend products to purchase, you can include tables that are related to purchases, subscriptions, anything applicable to user history and activity, etc. Online marketplaces and ecommerce platforms can include data on the products or items sold. The more data the better!

You only need to build your graph once – once you do that, there are a near infinite number of use cases you can tackle, and you can start asking many different questions about your data.

In this example, we’ve added tables including anonymized customer information, which can analyze the ecommerce platform’s customers and their transactions. Link your tables based on Primary / Foreign Key relationships and you’re done!

Once you connect your tables, Kumo will automatically build the graph optimized around the relationships and interactions that exist, so making recommendations becomes easy. Things like what users are similar based on demographics and past behaviors, what items are likely to be purchased together or sequentially, what price ranges users typically focus on, are inherently learned through the graph structure.

You can scale to dozens of tables, terabytes in size, with tens of billions of rows collectively. With a large graph created, Kumo will enable you to learn complex patterns and interactions from the different schemas – the more tables you connect, the greater the depth of the learning.

Now, you can reuse this same graph to build any number of predictions across a broad range of use cases. Kumo.ai will maintain and update the graph as new data comes in or as tables are updated.

From Business Problem to ML Pipeline in A Few Lines of Simple Code

With your graph built, you can now start generating predictions! Kumo offers a flexible and customizable language called the Predictive Query, which specifies the machine learning problem you’re trying to solve.

Kumo uses the query to kick off an end-to-end ML pipeline (no feature engineering or pipelines needed!) where the Kumo AutoML process runs under the hood, identifying the best architecture, model, and parameters for your task and graph. You can run a near-infinite number of predictions for different types of problems in a matter of hours, each one deploying a new, unique GNN built for the specific task.

The following will predict the purchases each user will make specifically over the next 7 days:

PREDICT LIST_DISTINCT(transactions_train.productID, 0, 7)
FOR EACH Users.ID

Once you execute this query, Kumo will generate training labels from past data across multiple timestamps to train and evaluate a GNN across time, automatically determine the best training strategy (in this case, Ranking), then perform GNN architecture and hyperparameter search, and deploy this end-to-end.

The result will be top recommendations for each of your users!

Now that you have your first personalization query, you can make subsequent predictions on the graph.

You can rerun the query, but only for the users who have had transactions on your platform over the last 90 days as follows:

PREDICT LIST_DISTINCT(transactions_train.productID, 0, 7)
FOR EACH Users.ID WHERE EXISTS(Trans.*, -90, 0)

Given the transactions table, you can also build queries to understand other aspects of your user preferences. For example, the following query will predict the price band preference for all users (note this is predicting their preference for the next 90 days, which is influenced by but not identical to the historical purchase prices of previous transactions in your training data)

PREDICT AVG(transactions_train.Amount, 0, 90)
FOR EACH Users.ID

From here, you can leverage the predictions directly, or take the learned embeddings and use them to improve performance of existing models!

You can reach out to us.