Mar 12, 2025
In e-commerce, search is a crucial part of the user experience. When customers type queries like “blue Nike Air Force 1” into a search bar, they expect instant, accurate results. However, many of the large language models (LLMs) used to generate such queries are either too expensive or not optimized for real-world search behavior.
That’s why we’re proud to unveil the latest work from our ML team (Mentor: Milutin Studen, Engineers: Petar Surla, Andjela Radojevic): an open-source machine learning model that generates user queries for e-commerce platforms. Built on the T5 model, it produces more natural, concise, and effective search queries, helping users find exactly what they’re looking for.
E-commerce platforms rely heavily on user queries to connect customers with products. However, many existing query-generation models produce overly literal or unnatural queries, such as “What shoe sizes does Nike Air Force 1 have?” instead of more intuitive queries like “blue Nike Air Force 1.” Models and search systems built on such queries tend to be of poor quality, which ultimately leads to a frustrating user experience. In short, these generated queries do little to improve search.
To tackle this problem, we developed an open-source model designed specifically to generate realistic, user-aligned queries that reflect actual search behavior. Our model helps build better datasets, improve query suggestions, and enhance the overall performance of search systems, offering a cost-effective and high-quality alternative to expensive solutions.
For fine-tuning, we started with a pre-trained T5 model specifically designed for query generation.
Through extensive experimentation, we developed several iterations of the model, each optimized for different input configurations.
Here are the key models we created:
After testing, T5-GenQ-TDC-v1 emerged as the top performer, consistently generating user queries that align with real-world search behavior.
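As a rough illustration of how a fine-tuned T5 checkpoint like this could be used, the sketch below builds a single input string from product fields and samples a few candidate queries with the Hugging Face `transformers` library. The exact input format each model variant expects is not given here, so the `title:` / `description:` / `category:` labels, the helper names, and the `model_name` argument are all assumptions made for illustration; check the published model card for the real format and identifier.

```python
def build_model_input(title, description="", category=""):
    """Join the non-empty product fields into one input string.

    The "name: value" layout is an assumed format, not the documented one.
    """
    fields = [("title", title), ("description", description), ("category", category)]
    return " ".join(
        f"{name}: {value.strip()}" for name, value in fields if value and value.strip()
    )


def generate_queries(text, model_name, num_queries=3):
    """Sample candidate search queries from a seq2seq (T5-style) checkpoint.

    `model_name` is whatever identifier or local path the checkpoint is
    published under; it is deliberately left as a parameter here.
    """
    # Imported lazily so the input helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    outputs = model.generate(
        **inputs,
        max_new_tokens=32,       # search queries are short
        do_sample=True,          # sampling gives varied queries per product
        top_p=0.95,
        num_return_sequences=num_queries,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
```

Calling `build_model_input("Nike Air Force 1", category="Shoes")` and passing the result to `generate_queries` along with the checkpoint name would return a list of sampled query strings.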
The project was divided into four main phases:
This model generates user queries that are significantly better than those produced by the base model. This can strengthen search functionality on e-commerce platforms, making it easier for users to find the products they’re looking for.
To assess our fine-tuned query generation model, we ran an additional experiment on a dataset of real user queries that was not part of the fine-tuning data. The goal was to verify how well the model performs on real user queries for e-commerce products. The fine-tuned model outperforms the base model, which indicates that its generated queries are more similar to real user queries, making it a better fit for e-commerce applications.
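To make “more similar to real user queries” concrete, here is a minimal sketch of one way such a comparison could be scored. It uses a simple unigram-overlap F1 between each generated query and its real counterpart. The metric actually used in the experiment is not stated here, so this is an illustrative stand-in, and the function names are hypothetical.

```python
from collections import Counter


def unigram_f1(generated, reference):
    """Token-overlap F1 between a generated query and a real user query."""
    gen = generated.lower().split()
    ref = reference.lower().split()
    if not gen or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both queries.
    overlap = sum((Counter(gen) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_f1(generated_queries, real_queries):
    """Average unigram F1 over paired generated and real queries."""
    scores = [unigram_f1(g, r) for g, r in zip(generated_queries, real_queries)]
    return sum(scores) / len(scores) if scores else 0.0
```

With the real query “blue Nike Air Force 1” as the reference, the question-style query “What shoe sizes does Nike Air Force 1 have?” shares only four of its nine tokens and scores about 0.57, while a query that matches the reference word for word scores 1.0. Running `mean_f1` once on the base model's outputs and once on the fine-tuned model's outputs gives two comparable numbers.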
After fine-tuning the model, the next steps involve evaluating its performance in different ways, experimenting with different configurations, and adapting it for new tasks. Deploying the model into real-world applications and continuously improving it with fresh data are also key. These ongoing iterations will ensure the model remains effective and continues to improve over time.
If you’re as excited about this project as we are, you can explore the details yourself! Check out the following resources: