How to optimize the costs of using AI at your online marketplace
SmartCat | Mar 11, 2024
This age will be remembered as the age when AI spoke, sang, and painted. Yes, even though AI has been present for years in various forms (predictions, recommendations, analytics, your Google Maps…) – it’s only come into focus through generative AI technologies.
Large Language Models, such as GPT, can effectively mimic human language. That’s not their only use, however.
The algorithms behind these models can summarize and analyze swathes of data in seconds. They can analyze product reviews, large chunks of text, and messages to detect potential fraud attempts. They provide near-human-like support, and generative models can also produce images and paintings. They enable advanced search that doesn’t rely on exact matches, but on intents and similarities (more about that later in this article).
However, cutting edge always comes at a hefty cost and every ounce of optimization matters. In this article, you’ll discover how much it costs to run an LLM and why simple token pricing isn’t enough.
The reality of costs of using LLMs in online marketplaces
Major players are already implementing Large Language Models (LLMs) and generative AI. Chances are, you too are experimenting with AI-driven support, search, fraud protection and other functionalities.
However, the bigger the business, the more it can afford experimentation and increased budgets. Even if some implementation mechanisms aren’t the most effective solution, big players can afford to run them for longer.
For any other player that’s not a big corporation like Amazon or Alibaba, it’s imperative to consider every optimization parameter, especially as the volume of data continues to grow.
When your users type in requests (prompts), it takes computing resources to generate responses. According to the official OpenAI website, 1M tokens for their best model (GPT-4) cost $10 for input and $30 for output.
To make things a bit clearer, 1000 tokens is around 750 words.
In other words, to input 750 words and get 750 words in return will cost you around $0.04. That may not sound like much if your users are the only ones who communicate with the AI. And if they’re doing it once per month…
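To make the math concrete, here’s a minimal sketch of that calculation in Python, using the prices quoted above and the rough 1,000-tokens-per-750-words rule of thumb (actual token counts vary by tokenizer and language):

```python
# Back-of-the-envelope LLM cost estimate, using the GPT-4 prices quoted
# in this article ($10 / 1M input tokens, $30 / 1M output tokens).
PRICE_INPUT_PER_1M = 10.0     # USD per 1M input tokens
PRICE_OUTPUT_PER_1M = 30.0    # USD per 1M output tokens
TOKENS_PER_WORD = 1000 / 750  # ~1.33 tokens per word (rule of thumb)

def request_cost(input_words: int, output_words: int) -> float:
    """Estimated USD cost of a single prompt/response exchange."""
    input_tokens = input_words * TOKENS_PER_WORD
    output_tokens = output_words * TOKENS_PER_WORD
    return (input_tokens * PRICE_INPUT_PER_1M +
            output_tokens * PRICE_OUTPUT_PER_1M) / 1_000_000

print(f"${request_cost(750, 750):.2f}")             # ~$0.04 per exchange
print(f"${request_cost(750, 750) * 100_000:,.0f}")  # ~$4,000 for 100k exchanges
```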
Yet, GenAI on marketplaces usually entails frequent communication in the backend, especially if chatbots aren’t the only tool you’re implementing.
Fine-tuning means taking a pre-trained model and training it further on your own data sets to adapt it to your needs. This costs around $24 per hour. It might take you just 15 minutes… or 72 hours to create your fine-tuned model, which is 72 x $24 = $1,728.
The benefit of fine-tuned models is that you get cheaper and faster responses. Simple prompting often requires a few shots before you get the response you wanted, while tuned models are already trained to give more precise responses.
You’ll need to consider the number of people on your website per month, their behavior, and your available infrastructure.
Implementation costs
There are multiple standard ways to implement LLMs on your marketplace. While some may have opted for training and developing models on their own data (consider that OpenAI spent $3.2 million to train ChatGPT) – in reality, the vast majority integrate an out-of-the-box, open-source LLM solution and use their own data to fine-tune it to their needs.
It requires fewer development resources, and it’s much faster. Yet, even in those cases, there are many issues that cause hidden costs.
Scalability issues
As online marketplaces grow, so does the volume of data. The bigger the volume, the more processing power you need.
Database and resource control
Not all databases are optimized for AI operations. Traditional relational databases might not be the best fit for storing and retrieving the heavy, high-dimensional data that LLMs require.
These databases can lead to slower query times and imprecise recall, affecting the quality of response.
Speed issues
Improper infrastructure and database choice means the response won’t be fast or precise enough, which leads to customer dissatisfaction and directly affects revenue.
Vector databases provide huge cost optimizations for AI-driven marketplaces
We already have solutions on the market that can handle many of these issues. One of them is the so-called vector database.
For this AI tech to work cost-effectively, the data needs to be stored efficiently. Data used today is high-dimensional and multimodal (text, video, image, audio), which makes it very hard to process and retrieve from traditional, row-based databases.
Illustration: how VDBs store information (left) vs. traditional databases
In other words, think of it as taking the essential information about the data and transforming it into a packaged, compressed form.
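As an illustration, here is what creating such a “compressed package” (an embedding) might look like in Python. We use the open-source sentence-transformers library purely as an example – any embedding model would do, and the model name here is just one common choice:

```python
# A minimal sketch of turning a product description into an embedding
# vector - the "compressed package" that a vector database stores.
from sentence_transformers import SentenceTransformer

# Example model choice; any text-embedding model works the same way.
model = SentenceTransformer("all-MiniLM-L6-v2")

description = "Lightweight waterproof summer jacket, navy blue, unisex"
vector = model.encode(description)  # a fixed-length numpy array

print(vector.shape)  # (384,) - 384 numbers instead of free-form text
```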
Traditional databases store data in rows and columns and you have to use an exact match search to go through it. With the rising volume of data, the entire system slows down because it processes so many properties.
With the slower processing come greater costs.
In vector databases, the search is so much faster as it allows for searching for “packages of data” and similarities between packages. Instead of going through rows and processing all of it, the machine takes a finite set of parameters, finds the similar package, and retrieves that. The so-called similarity search is the backbone of success of these types of databases.
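To make the idea tangible, here is a toy, brute-force version of that nearest-neighbor search in plain NumPy – a real vector database does the same thing, but with indexes that keep it fast at millions of items:

```python
# Toy similarity search: compare a query embedding against a catalog of
# item embeddings and return the nearest neighbors (no real VDB needed).
import numpy as np

def cosine_similarity(query: np.ndarray, items: np.ndarray) -> np.ndarray:
    """Cosine similarity between the query and each row of `items`."""
    return (items @ query) / (np.linalg.norm(items, axis=1) * np.linalg.norm(query))

# Pretend embeddings (real ones have hundreds of dimensions; 3 keeps
# the example readable).
catalog = np.array([
    [0.9, 0.1, 0.0],  # "small puppy"
    [0.8, 0.2, 0.1],  # "puppy chew toy"
    [0.1, 0.9, 0.3],  # "family car"
])
query = np.array([0.85, 0.15, 0.05])  # embedding of "small puppies"

scores = cosine_similarity(query, catalog)
nearest = np.argsort(scores)[::-1]  # indices, most similar first
print(nearest[:2])  # -> the two puppy items; the car never comes close
```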
For example, if a user types a query searching for “small puppies”, a traditional database would return exact keyword matches – everything that contains the phrase “small puppies”. If the DB is huge, it would take a while, especially if the database has many breeds of puppies, plus toys and equipment for them…
A vector database would understand the context of the query and the intent behind it, and present the user with all the small, happy puppies. If the user is on the page for a specific breed, it would recognize that as well.
Illustration: information in the VDB stored close together according to similarity
Look at the illustration above. Instead of running through thousands of rows (cars, dogs, and puppies alike), the algorithm in the VDB would find and retrieve the nearest neighbor with a similar set of information.
Using vector databases in synergy with LLMs, where you send only the right packages to the LLM from a sea of data, might save you a fortune in token expenses, but more on that later.
The almighty similarity search in online retail as a revenue engine
The key feature vector databases really excel at is similarity search. As we have seen in the example above, similarity search is a search algorithm used to find data points that are most similar to a query. Instead of searching for exact matches, it aims to find items that are closest or most similar to the query based on certain metrics or criteria.
In online marketplaces, where users are constantly seeking products or services that match their preferences, the ability to efficiently and accurately find similar items makes a difference between buying and bouncing.
When dealing with data that has many dimensions – visual, audio, or text – it provides a tremendously pleasant experience for shoppers.
Now imagine a user browsing an online clothing store and coming across a shirt they like, but it’s not quite the right color or pattern. With optimized visual search, they can upload a picture of their desired shirt, and the system will find items that visually match the uploaded image by many parameters. This enhances the shopping experience by allowing users to find products based on visual cues, not just textual descriptions.
Or let’s consider a user searching for a “lightweight summer jacket.” Traditional search might return jackets that are either lightweight or meant for summer. However, similarity search, combined with a semantic understanding of AI algorithms, understands the context and intent behind the query. It will prioritize jackets that are both lightweight and suitable for summer, ensuring that the results are more aligned with the user’s intent.
In a scenario where a user wants to find a product using both text and image, multimodal search comes to the rescue. They might upload a picture of a dress and add the text “in blue color.” The system will consider both the image and the text to return blue dresses that match the style of the uploaded image – and much faster because these characteristics are already made to be easily searchable within the vector database.
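As a rough sketch of how such a multimodal query could be built, assuming a CLIP-style model that embeds images and text into the same vector space (the model name, file name, and weighting below are illustrative assumptions, not a prescribed recipe):

```python
# Hedged sketch: combine an image and a text refinement into one query
# vector for multimodal similarity search.
import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer

# CLIP-style model that can encode both images and text (example choice).
model = SentenceTransformer("clip-ViT-B-32")

image_vec = model.encode(Image.open("dress_photo.jpg"))  # hypothetical file
text_vec = model.encode("in blue color")

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

# One simple fusion strategy: a weighted average of the normalized vectors.
# The 0.7 / 0.3 split is an assumption to tune, not a standard.
query_vec = normalize(0.7 * normalize(image_vec) + 0.3 * normalize(text_vec))

# query_vec is then sent to the vector database's similarity search,
# exactly as in the toy NumPy example earlier.
```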
To summarize the benefits:
- By understanding the context and nuances of queries, similarity search ensures that the results are more accurate and relevant to the user’s intent.
- It allows users to search using various inputs, be it text, image, or a combination of both, offering a more flexible and intuitive search experience.
- Vector databases are optimized for high-dimensional data (data with multiple parameters), ensuring quick and efficient retrieval of similar items.
- By returning results that closely match user preferences, similarity search offers a more personalized shopping experience, leading to increased user satisfaction and higher sales. Users sift less through irrelevant results, leading to a smoother and more efficient shopping journey.
Similarity search, with its various forms and benefits, is transforming the way users interact with online marketplaces. By offering more precise, context-aware, and visually aligned results, it ensures that users find what they’re looking for with ease and efficiency.
Technical POV: Savings that come from the interplay between vector databases (VDBs) and Large Language Models (LLMs)
While LLMs provide the capability to understand and generate human-like text, VDBs ensure efficient storage and retrieval of enormous volumes of data. But how does this combination help in cost optimization?
One of the significant expenses associated with LLMs is token costs. A token is a unit of text, roughly a word or part of a word. The more text you send and receive, the more tokens you pay for.
By integrating with VDBs, businesses can store frequently used responses and their queries as “embeddings” (compact numerical packages of data). When a user makes a similar query, instead of regenerating the response with the LLM, the system can retrieve the stored answer from the VDB.
Even when the connected LLM does need to generate a fresh answer, you send it only the most relevant chunks retrieved from the VDB rather than the whole corpus, and by doing so you reduce the number of tokens used.
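A minimal sketch of that “semantic cache” pattern is below; the embedding function, the LLM call, and the similarity threshold are all placeholders to be swapped for your real components:

```python
# Semantic cache sketch: reuse a stored answer when a new query is
# close enough to one we have already paid the LLM to answer.
import numpy as np

SIMILARITY_THRESHOLD = 0.95  # assumption - tune on real traffic

cache = []  # list of (normalized embedding, answer); a VDB in production

def answer(query: str, embed, call_llm) -> str:
    """`embed` and `call_llm` are placeholders for your real components."""
    q = embed(query)
    q = q / np.linalg.norm(q)
    for vec, saved_answer in cache:
        if float(np.dot(q, vec)) >= SIMILARITY_THRESHOLD:
            return saved_answer      # cache hit: zero tokens spent
    fresh = call_llm(query)          # cache miss: pay for tokens once
    cache.append((q, fresh))
    return fresh
```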
Scale whenever you want
VDBs can dynamically scale based on needs. This means that as the volume of data grows, you don’t need to make significant infrastructure investments. Your LLM operations remain consistent and cost-effective, regardless of the data volume.
No more over-provisioning or under-used resources. Vector databases adjust in real-time to the demands of the system.
Reduce dependency on commercial solutions
By using VDBs and storing your embeddings, you can actually reduce your dependency on commercial LLM solutions. This can lead to significant cost savings if you have high query volumes, which is the case for online web shops with millions of users.
Direct business POV: How this approach to AI implementation drives further revenue
Every decision related to technology and user experience has a direct impact on your bottom line. Implementing advanced similarity search and vector databases isn’t just about optimizing your online marketplace in the backend. In fact, it’s more about the front end, where real people use it.
Direct impact on revenue is what offsets the costs.
In a saturated market, offering advanced search functionalities can set a platform apart from its competitors, attracting more users and vendors to the platform.
Optimize the use of AI in your online marketplace today
It’s never too late to cut down unnecessary expenses. Yet, we’re aware that this process takes time. Our advice is to do pilot testing and start with micro experiments. That way, you’ll see the benefits and avoid major problems.
Implement training and workshops for your team, because this type of technology requires thoughtful education. Implement feedback loop processes within the marketplace, so the customers can also tell you what you’re doing right, and what could be different.
You know there’s nothing better than a satisfied customer who leaves a positive review or writes unsolicited praise to customer support.
We also partner with the hottest vector database vendors on the market, Pinecone and DataStax. Through our workshops, you’ll get not only the infrastructure but also training and overall GenAI clarity.
On top of that, we’ve been implementing AI-based solutions for quite some time now and we know how to optimize your current systems.
Contact us today and grab this 360 deal. Stop the leaks in your budget or build a solid system from the ground up.