
Enhancing E-commerce Search with Sentence Transformers

Product search is a critical feature for any e-commerce platform, as it directly impacts sales and revenue. The effectiveness of the search function depends on how well it surfaces diverse and relevant results for customer queries, which plays a key role in meeting customer needs and driving conversions.

Most e-commerce websites still rely on old-fashioned search driven by traditional engines such as Solr or Elasticsearch. Imagine instead a smart, AI-based search portal capable of generating far more diverse results, rather than relying only on product descriptions and other indexed information.

Here is a Proof of Concept (PoC) for an enhanced search portal that acts like a virtual shopkeeper, delivering more diverse and relevant search results by accurately understanding the intent behind each query.

One of the best-suited options for this is semantic search. There are numerous AI-based models for such tasks, and Sentence Transformers are one of them. Sentence Transformers (also known as Sentence-BERT) are built on existing transformer models such as BERT (Bidirectional Encoder Representations from Transformers). These models process text using self-attention mechanisms to understand the relationships between words in a sentence, regardless of their position. If you want to know more about how BERT works, feature vectors, the self-attention mechanism, etc., please refer to other posts on this blog and the page Understanding Transformers (BERT & GPT).

Data Setup

First, I create a sample dataset that includes the product name, description, and the category each product belongs to. Clean and accurate data is the first prerequisite that search relevancy depends on.
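Since the original snippet is not reproduced here, below is a minimal sketch of what such a dataset could look like, using a handful of purely hypothetical products in a pandas DataFrame; the names, descriptions, and categories are illustrative placeholders only.

```python
import pandas as pd

# Hypothetical sample catalogue: product name, description and category
products = pd.DataFrame([
    {"name": "Oats & Honey Granola",
     "description": "Crunchy whole-grain granola with honey and roasted almonds",
     "category": "Breakfast"},
    {"name": "Blueberry Pancake Mix",
     "description": "Easy pancake mix made with real blueberries",
     "category": "Breakfast"},
    {"name": "Insulated Water Bottle",
     "description": "Stainless steel bottle that keeps drinks cold for 24 hours",
     "category": "Kitchen & Dining"},
    {"name": "Trail Running Shoes",
     "description": "Lightweight shoes with cushioned soles for daily runs",
     "category": "Footwear"},
])

# Combine the fields into a single text string per product for embedding
products["text"] = (products["name"] + ". " + products["description"]
                    + " Category: " + products["category"])
```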


Create Embeddings

Import the required library for Sentence Transformers. The code snippet below generates vector embeddings for the above dataset. Note that in NLP, the input text is first tokenized and converted into feature vectors (a numeric representation of all tokens) for further processing, rather than being used directly. Here I have used the pre-built sentence transformer (all-MiniLM-L6-v2), but you can always fine-tune it based on your business needs and use a custom version of it.
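Here is a minimal sketch of the embedding step, assuming the illustrative products DataFrame from the previous section and the pre-built all-MiniLM-L6-v2 checkpoint.

```python
from sentence_transformers import SentenceTransformer

# Load the pre-built model; a fine-tuned checkpoint can be dropped in here instead
model = SentenceTransformer("all-MiniLM-L6-v2")

# Encode each product's combined text into a dense vector
# (all-MiniLM-L6-v2 produces 384-dimensional embeddings)
embeddings = model.encode(products["text"].tolist(), convert_to_numpy=True)
print(embeddings.shape)  # (number_of_products, 384)
```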



Running Search

Once the embeddings have been created, you can pass them to a similarity search algorithm to get semantic search results. Here, I have used FAISS (Facebook AI Similarity Search) for that purpose. You will need to install it using pip if it is not already available.
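Below is a minimal sketch of the FAISS step, continuing from the embeddings above. The vectors are L2-normalised so that an inner-product index returns cosine-similarity scores; install the library with `pip install faiss-cpu` if needed.

```python
import faiss

# Normalise the vectors so that inner product equals cosine similarity
faiss.normalize_L2(embeddings)

# Build a flat inner-product index and add the product vectors
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

# Encode the query the same way and search for the top matches
query = "I want some good breakfast options"
query_vec = model.encode([query], convert_to_numpy=True)
faiss.normalize_L2(query_vec)

scores, ids = index.search(query_vec, k=3)
for score, idx in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {products.iloc[idx]['name']}")
```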



I used the query string "I want some good breakfast options" as the search string and got the results below. These results are retrieved based on the semantic similarity of the entered query; the data does not need to be indexed with exact keywords to be pulled up. The Sentence Transformer model has been trained on a wide variety of text corpora in such a way that it can smartly identify contextually relevant data.

If you notice, I have also displayed the calculated scores (based on cosine similarity). The higher the score, the more similar the result. We can decide on a score threshold, treating results above it as directly relevant and lower-scoring results as additional suggestions.
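As a rough illustration, a hypothetical cut-off (the 0.5 value below is arbitrary and not from the original post) could split the FAISS results into direct matches and softer suggestions:

```python
THRESHOLD = 0.5  # arbitrary illustrative cut-off; tune it for your own data

direct_matches = [(s, i) for s, i in zip(scores[0], ids[0]) if s >= THRESHOLD]
suggestions = [(s, i) for s, i in zip(scores[0], ids[0]) if s < THRESHOLD]
```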


This is a basic application of Sentence Transformer models. These pre-trained models can be further fine-tuned to align more closely with specific business needs and domains. To explore how fine-tuning works in detail, please refer to Finetuning of Transformers in Natural Language Processing.




