
09/20/2024

Build a Personalized Chatbot Powered by Kumo AI Embeddings and Predictions

by Naisha Agarwal

Follow along in my video and learn more about how Kumo AI predictions can power chatbots.

Motivation

Imagine you are a shopper entering a store, looking for items to purchase. A personal assistant that can give you customized, real-time recommendations based on your past purchase history and constraints can dramatically improve your shopping experience.

The queries a user can ask an agent of this kind can be divided into two main categories:

  • ItemToUser
  • ItemToItem

An itemtouser query is the one we are all most familiar with: what items should I buy given my personal purchase history? These are tailored queries for a specific customer, suited to their preferences. An itemtoitem query is not specific to any one customer; these are more generic queries about which items go well together, which items complement each other, and so on. A chatbot that can answer both types of queries can significantly enhance the user experience.

Overview

There are several steps to building a chatbot like this:

  1. Obtaining Kumo AI predictions and embeddings
  2. Storing data in a database
  3. Understanding what type of context to use for different types of queries
  4. Using a RAG approach to pass appropriate context into an LLM and obtain results for specified queries
  5. User Experience/Product
  6. Setting up an Evaluation Framework

This post goes into depth on each of these steps.

1. Obtaining Kumo Predictions and Embeddings

Our chatbot can answer two types of queries: itemtoitem and itemtouser. Let us look at each of these cases separately.

ItemToUser

For an itemtouser use case, we want to obtain the top-k item predictions for every customer. Kumo makes this very easy to obtain.

We use the publicly available H&M dataset. The original release contains three datasets:

  • Articles (information about the store inventory)
  • Customers (information about the store’s users)
  • Transactions (list of the different purchases made at H&M)

These three datasets serve as the tables we use to construct our Kumo graph. We link article_id and customer_id from the articles and customers tables, respectively, to their foreign keys in the transactions table, as shown below.

With this graph, we can run the following PQuery:

PREDICT LIST_DISTINCT(transactions.article_id, 0, 30, days)
RANK TOP 25
FOR EACH customers.customer_id

This PQuery obtains the top 25 items for every customer ID over the next 30 days. Once we train the Kumo model using this PQuery, we can run a batch prediction workflow and obtain two main pieces of data:

  • Predictions: The top 25 predictions for every customer id in our dataset
  • Embeddings: Embeddings for every customer ID, as well as an embedding for every item in the database

ItemToItem

For the itemtoitem use case, our setup looks a bit different. We have two articles tables that are duplicates of each other. In this setup, the nodes of our constructed graph are the items, and an edge between two nodes represents a co-occurrence relationship. Both articles tables then link to a transactions table, which we can think of as a co-occurrence table. The graph can be seen below:

We can run the following PQuery:

PREDICT LIST_DISTINCT(transactions.article_id2, 0, 30, days)
RANK TOP 10
FOR EACH articles.article_id

Once we train the Kumo model using this PQuery, we can run a batch prediction workflow and obtain the following pieces of data:

  • Predictions: The top 10 predictions for every item in the dataset
  • Embeddings: Embeddings for both the left-hand-side articles table and the right-hand-side articles table

After running both of these batch prediction workflows, we now have the predictions and embeddings for both the itemtouser and itemtoitem use cases.

2. Storing Data in a Database

Now that we have the Kumo predictions and embeddings, our next step is to store the data in a database in a format that is easily searchable. I will be using Elasticsearch to store the data.

Elasticsearch is structured into numerous indices, each with its own index mapping. For this application, there are a total of 8 indices in our database: item information, customer information, embedding indices (x4), and prediction indices (x2). Let us investigate each of these indices further.

Item Information Index (#1)

This is an index for storing basic information about every item, taken directly from the H&M dataset. Here, instead of storing each attribute as it is stored in the original dataset, I combined the attributes into one field that is easily searchable. This way, users can also query by item name or any other item attribute instead of having to specify the item ID in the query.

For instance, consider the following item from the original dataset:

{"article_id": 108775015, "prod_name": "Strap top", "product_type_no": 253, "product_type_name": "Vest top", "product_group_name": "Garment Upper body", "graphical_appearance_no": 1010016, "graphical_appearance_name": "Solid", "colour_group_code": 9, "colour_group_name": "Black", "perceived_colour_value_id": 4, "perceived_colour_value_name": "Dark", "perceived_colour_master_id": 5, "perceived_colour_master_name": "Black", "department_no": 1676, "department_name": "Jersey Basic", "index_code": "A", "index_name": "Ladieswear", "index_group_no": 1, "index_group_name": "Ladieswear", "section_no": 16, "section_name": "Womens Everyday Basics", "garment_group_no": 1002, "garment_group_name": "Jersey Basic"}

When storing this data into Elasticsearch, we condense this item into just two fields:

{"item_id": 108775015, "item_info": "Item 108775015 is called Strap top. It is a Vest top (product type number 253) under the Garment Upper body product group. This item has a graphical appearance of Solid (appearance number 1010016) and belongs to the Black colour group (colour group code 9). It is perceived as Dark (perceived colour value ID 4) and categorized under the Black master colour (master colour ID 5). The item is in the Jersey Basic department (department number 1676) of Ladieswear (index code A). Specifically, it is in the Womens Everyday Basics section (section number 16). The garment group is Jersey Basic (garment group number 1002)."}
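For illustration, here is a minimal Python sketch of this flattening step plus the bulk upload into Elasticsearch. The index name item_information, the local Elasticsearch URL, and the sentence template are assumptions, not the exact code behind this post:

import pandas as pd
from elasticsearch import Elasticsearch, helpers

def format_item(row: pd.Series) -> dict:
    # Flatten a raw H&M article row into a single searchable text field.
    # The sentence template is illustrative; extend it to cover every
    # attribute you want to be searchable.
    info = (
        f"Item {row['article_id']} is called {row['prod_name']}. "
        f"It is a {row['product_type_name']} (product type number {row['product_type_no']}) "
        f"under the {row['product_group_name']} product group. "
        f"This item has a graphical appearance of {row['graphical_appearance_name']} "
        f"and belongs to the {row['colour_group_name']} colour group."
    )
    return {"item_id": row["article_id"], "item_info": info}

articles = pd.read_csv("articles.csv")  # H&M articles table
docs = [format_item(row) for _, row in articles.iterrows()]

# Bulk-index the flattened documents into the item information index.
es = Elasticsearch("http://localhost:9200")
helpers.bulk(es, ({"_index": "item_information", "_source": d} for d in docs))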

Customer Information Index (Index #2)

This is an index for storing basic information about every customer, taken directly from the H&M dataset. Similar to the item information index, instead of storing every attribute as its own field, I combined these attributes into a single customer_info field.

For instance, given the following customer data:

{"customer_id": "00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657", "FN": null, "active": null, "club_member_status": "ACTIVE", "fashion_news_frequency": "NONE", "age": 49, "postal_code": "52043ee2162cf5aa7ee79974281641c6f11a68d276429a91f8ca0d4b6efa8100"}

we store this in Elasticsearch with two columns:

{"customer_id": "00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657", "customer_info": "Customer with ID 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is an ACTIVE member. They are 49.0 years old. Their fashion news frequency is set to 'NONE'. Their postal code is '52043ee2162cf5aa7ee79974281641c6f11a68d276429a91f8ca0d4b6efa8100'. Their active status is nan and FN status is nan."}

Storing the data in this format makes it easily searchable in Elasticsearch, and allows users to use other customer attributes when searching.

Embedding Indices (Indices 3-6)

To store the embeddings for itemtouser, we will define two separate indices: one for customer embeddings and another for the item embeddings.

Below is the structure of these two indices:

Both indices share the same structure: an ID (customer or item), the embedding, and a text field describing either the customer or the item.

A similar structure is used for the itemtoitem embedding indices; both of those indices follow the same structure as the cust_item_embeddings index above.
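As a sketch, one of these embedding indices could be created with the Python Elasticsearch client as follows. The index name, field names, and embedding dimension are assumptions; set dims to whatever embedding size your Kumo model produces:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical mapping for the customer embedding index. The dense_vector
# dimension must match the size of the Kumo embeddings you exported.
es.indices.create(
    index="cust_embeddings",
    mappings={
        "properties": {
            "customer_id": {"type": "keyword"},
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
            },
            "customer_info": {"type": "text"},
        }
    },
)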

Prediction Indices (Indices 7-8)

For both itemtouser and itemtoitem, we store two indices: predictions for every customer, and predictions for every item.

Item Predictions

Customer Predictions

Similar to the item and customer information indices, the predictions for every item/customer are formatted into one single text field so that all predictions can be searched easily for any given item or customer.

{"item_id": 108775015, "formatted_predictions": "The 1st prediction for item_id 108775015 is item_id 108775015 with score 317.7856140136719. Item 108775015 is called 'Strap top'… The 2nd prediction for item_id 108775015 is item_id 815456006 with score 238.77734375. Item 815456006 is called 'Madison Slim Stretch Chino'… The 10th prediction for item_id 108775015 is…"}

{"customer_id": "00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657", "formatted_predictions": "The 1st prediction for customer_id 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is item_id 108775015 with score 317.7856140136719. Item 108775015 is called 'Strap top'… The 2nd prediction for customer_id 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is item_id 815456006 with score 238.77734375. Item 815456006 is called 'Madison Slim Stretch Chino'… The 10th prediction for customer_id 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is…"}

3. Context Retrieval

Once the data is structured in Elasticsearch, we have to figure out an appropriate way to retrieve context given a user query, and then display a final response. The process can be summarized as follows:

Classification Response (OpenAI)

The first step in the workflow is to classify the user query, identifying it as either an itemtoitem or an itemtouser prompt. This determines which Elasticsearch indices to search for context relevant to the user query.

This classification response is obtained by simply prompting GPT-4o with the prompt "Please classify the following query as either an itemtoitem task or an itemtouser task. Please only respond with either itemtoitem or itemtouser, with no additional commentary."
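A minimal sketch of this classification call with the OpenAI Python SDK (error handling omitted):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_query(user_query: str) -> str:
    # Returns "itemtoitem" or "itemtouser" for a given user query.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Please classify the following query as either an itemtoitem "
                "task or an itemtouser task. Please only respond with either "
                "itemtoitem or itemtouser, with no additional commentary."
            )},
            {"role": "user", "content": user_query},
        ],
    )
    return response.choices[0].message.content.strip().lower()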

Context Retrieval (Elasticsearch)

Now that we have classified the query as either itemtoitem or itemtouser, we have reached the most important part: retrieving the appropriate context. We have two main pieces of information stored in Elasticsearch: predictions for both customers and items, and embeddings.

Let's start with retrieving predictions! Using the classification response from GPT, we direct the query to the appropriate indices:

  • ItemtoItem: item_information, item_predictions
  • ItemtoUser: customer_information, customer_predictions

From the indices specified for the given task, we combine all relevant context.

Given an index and a user query, how is the appropriate context retrieved?

  1. Search for the query text across all fields in the given index (either the formatted_predictions field or the info field, depending on the index)
  2. For every "hit" found in the index, append the relevant information (either the formatted_predictions field or the item_info field) to our context
  3. Return the context!

See below for an example:
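A minimal sketch of this search-and-append loop with the Python Elasticsearch client (the index and field names follow the assumptions above):

def retrieve_prediction_context(es, user_query: str, index: str, text_field: str, k: int = 5) -> str:
    # Full-text search the index and concatenate the matching text fields.
    results = es.search(
        index=index,
        query={"match": {text_field: user_query}},
        size=k,
    )
    context = [hit["_source"][text_field] for hit in results["hits"]["hits"]]
    return "\n".join(context)

# e.g., for an itemtouser query:
# context = retrieve_prediction_context(es, query, "customer_predictions", "formatted_predictions")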

Context Retrieval (Embeddings)

Now, let us consider a query like "I just bought the Mariette Blazer. Recommend me a white item to buy" or "I am customer x. Recommend me a sweater I can buy based on my purchase history."

These queries can be powered by Kumo predictions. We search for the predictions for the specific item/customer, and then search for the specific attribute within those predictions (e.g., white items in the predictions for the Mariette Blazer, or sweaters in customer x's predictions).

But what if there isn't a white item in the predictions for the Mariette Blazer, or there isn't a sweater in customer x's predictions? This is where the embeddings come in. Given a specific attribute/filter (e.g., white, sweater), instead of searching for the attribute within the pre-computed predictions, we can use the embeddings to do a filtered search.

The workflow for retrieving embeddings can be summarized into four major steps:

  1. Input: user query, classification response from GPT, user filter
    1. E.g., query: "I am customer 05373b97aa6d6e1e2be3ee243b0c6d060287db1de0118a27f0da4543236b3a02, I am looking for a white sweater"; classification: "ItemToUser"; filter: "White Sweater"
  2. Retrieve the embedding for the item/customer specified in the query from the left-hand-side embedding index (the customer embeddings for an itemtouser query, or the left-hand-side item embeddings for an itemtoitem query)
  3. Using the retrieved embedding, do a KNN search against the embeddings in the right-hand-side index that match the given filter (see the sketch after this list)
    1. The filter is applied over the item description in the right-hand-side index.
    2. E.g., retrieve the customer_embedding from the LHS index, then do a KNN search over items in the RHS index whose description matches "white sweater"
  4. Append information for every top-k result to the context (taken from the item information index). Each of these items will have the attribute specified by the user (e.g., a white sweater).
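A minimal sketch of steps 2 and 3 with Elasticsearch's KNN search. The index and field names are the same assumptions as before, and it additionally assumes each document was indexed with its customer/item ID as the Elasticsearch document _id:

def filtered_knn_search(es, entity_id: str, lhs_index: str, rhs_index: str, item_filter: str, k: int = 10):
    # Step 2: fetch the query-side embedding from the LHS index.
    doc = es.get(index=lhs_index, id=entity_id)
    query_vector = doc["_source"]["embedding"]

    # Step 3: KNN search over the RHS index, restricted to documents
    # whose item description matches the user's filter.
    results = es.search(
        index=rhs_index,
        knn={
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 100,
            "filter": {"match": {"item_info": item_filter}},
        },
    )
    return [hit["_source"] for hit in results["hits"]["hits"]]

# e.g., filtered_knn_search(es, customer_id, "cust_embeddings", "item_embeddings", "white sweater")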

4. Retrieval Augmented Generation (RAG)

So far, we have:

  • Classified the user query as either itemtoitem or itemtouser
  • Used this classification to retrieve context, via predictions, or via embeddings when a specific attribute/filter is specified

Now that we have the relevant context, how can we use this to obtain a coherent response to display to the user? This is where RAG comes in.

The RAG approach can be broken down into three simple steps:

  1. Input: user query, relevant context
  2. Construct the final prompt: {"question": user_query, "context": context}
  3. Call the LLM with the final prompt, and that's your final response (see the sketch below)!
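Concretely, here is a minimal sketch of this generation step, reusing the hypothetical OpenAI client from the classification sketch; the prompt wording is an assumption:

def answer_query(client, user_query: str, context: str) -> str:
    # Build the final {"question", "context"} prompt and call the LLM.
    final_prompt = (
        "Answer the question using only the provided context.\n"
        f"question: {user_query}\n"
        f"context: {context}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": final_prompt}],
    )
    return response.choices[0].message.content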

5. User Experience

Now that we have our final response, we want to display it to the user, and have an easy way for the user to interact with our agent. Streamlit provides a simple, elegant UI for users to interact with and get responses to their queries.
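As a rough sketch, the whole pipeline can be wired into a Streamlit chat loop like this. Here classify_query and answer_query are the hypothetical helpers sketched earlier, and retrieve_context stands in for the retrieval logic from section 3:

import streamlit as st

st.title("H&M Shopping Assistant")

# Keep the chat history across Streamlit reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay prior turns.
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.write(message["content"])

if user_query := st.chat_input("Ask for a recommendation..."):
    st.session_state.messages.append({"role": "user", "content": user_query})
    with st.chat_message("user"):
        st.write(user_query)

    task = classify_query(user_query)                   # classification sketch above
    context = retrieve_context(user_query, task)        # hypothetical wrapper over the retrieval steps
    answer = answer_query(client, user_query, context)  # RAG sketch above

    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.write(answer)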

Below is a diagram of the key features of the UI:

6. Evaluation

To test the accuracy of the responses, I created a test dataset of 50 queries: 25 itemtoitem and 25 itemtouser, varying in length and style.

Overall, the agent had 90% accuracy (24/25 for itemtouser and 21/25 for itemtoitem, 45/50 in total)!

Breaking down the results:

  • The classification response from GPT is 98% accurate (1 of 50 queries misclassified)
  • For the itemtoitem queries answered incorrectly, the items listed still match the best Kumo recommendations with a Jaccard index of 0.67 (8 of 10 items match)

Below is an example of how the evaluation was set up:
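A minimal sketch of such an evaluation harness follows; the test-set format, the get_recommended_items extraction step, and the matching logic are all assumptions:

# Hypothetical test cases: each pairs a query with its expected
# classification and the expected top Kumo recommendations.
test_cases = [
    {"query": "What goes well with the Strap top?",
     "expected_class": "itemtoitem",
     "expected_items": {108775015, 815456006}},
    # ... remaining test cases
]

def jaccard(a: set, b: set) -> float:
    # Jaccard index: |intersection| / |union|.
    return len(a & b) / len(a | b) if a | b else 0.0

correct_class = 0
scores = []
for case in test_cases:
    predicted_class = classify_query(case["query"])     # classification sketch above
    correct_class += predicted_class == case["expected_class"]
    recommended = get_recommended_items(case["query"])  # hypothetical: parse item IDs from the agent's answer
    scores.append(jaccard(set(recommended), case["expected_items"]))

print(f"Classification accuracy: {correct_class / len(test_cases):.0%}")
print(f"Mean Jaccard index: {sum(scores) / len(scores):.2f}")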

This chatbot is just one of the many cool things that can be built using Kumo AI!

Next step: request a free trial to give it a shot