Case Study

The right thing to say to "Honey, what’s for dinner?"

The Challenge

Our Client, one of the global players in the online food ordering domain, wanted to change the way users search for restaurants. Currently, the website offers restaurant-based and zip-code-based searches to filter results. To stay competitive and improve user engagement, the search had to be extended with product-based searches. From the end user’s perspective, the burden of filtering restaurants first by cuisine and then by item is overwhelming. Wouldn’t it be easier for users to simply type in what they are craving and have the search return the relevant list of restaurants?

That is when PCG was mandated to create a proof of concept and demonstrate the benefits of using Elasticsearch to improve the search feature’s usability.

The company manages restaurants and products in a relational database that holds more than 12 million products from restaurants worldwide. With no uniform naming convention for products, there were more than 3 million redundant products that differed only in spelling. For example, “Pizza Margharita” and “Pizza Margarita” are semantically the same but are spelled differently and have two product ids belonging to two different restaurants.

The challenge was to group products that are semantically close enough for the search to return relevant results even when the input text is misspelled. Since the company operates across several countries, the search is not confined to one language. The database holds the global restaurant data, which means the strings we look for might contain language-specific letters and translations. The requirement does not end with product searches: the user should also be able to search by restaurant name if the search term matches a restaurant.
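To make the grouping problem concrete, here is a minimal Python sketch that normalizes product names and groups near-duplicates by string similarity. It uses the standard library's difflib as a stand-in and is not the analyzer-based approach used in the project; the function names and the 0.9 threshold are assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def similar(a: str, b: str, threshold: float = 0.9) -> bool:
    """True when the normalized names are at least `threshold` similar."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def group_products(names, threshold: float = 0.9):
    """Greedy grouping: each name joins the first group whose
    representative (first member) it resembles, else starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if similar(name, group[0], threshold):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Called with "Pizza Margharita", "Pizza Margarita", and "Pasta Carbonara", the two pizza spellings end up in one group and the pasta in another. A greedy pass like this compares every name against every group, so it would not scale to millions of products; it only shows why normalization plus a similarity measure is needed.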

On the ETL side, extracting relevant product data from the relational database, transforming it, and indexing it into Elasticsearch required an appropriate Logstash configuration and document mappings tuned for efficient search performance. As this could become the next big feature in our Client’s product pipeline, and with scalability in mind, we decided to introduce modern software tools and workflows to deliver the feature our Client envisioned.

The Solution

The architecture that we developed for the ETL process uses Logstash, an open source, server-side data processing pipeline that ingests data from various sources, transforms it, and sends it to the defined Elasticsearch index. Apart from collecting data from different systems, Logstash does something even more important: it normalizes different schemas into one single format that can be fed to the Elasticsearch index. We analyzed the relevant data in the database and defined the queries within the Logstash configuration, using the AWS Redshift JDBC driver to connect to the hosted database. Daily database changes were re-imported with relational queries that pull the latest data, and the operation was automated to keep the Elasticsearch index in sync with the primary database.
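The incremental re-import idea can be sketched in Python as follows. In the project this was handled by Logstash and its JDBC queries; the sketch below only illustrates the pattern of pulling rows changed since the last run and remembering a marker for the next one. The table name `products`, the column `updated_at`, and both function names are hypothetical, and any DB-API connection that accepts `?` placeholders (for example `sqlite3`) is assumed.

```python
import sqlite3


def fetch_changed_rows(conn, last_sync):
    """Return products modified after `last_sync`, oldest first.

    Timestamps are assumed to be ISO 8601 strings, which sort
    correctly when compared as text.
    """
    cursor = conn.execute(
        "SELECT id, name, restaurant_id, updated_at FROM products "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    )
    return cursor.fetchall()


def sync_once(conn, last_sync):
    """Run one incremental pull and return (rows, new_marker).

    The new marker is the newest `updated_at` seen, or the old
    marker when nothing changed, so the next run starts from there.
    """
    rows = fetch_changed_rows(conn, last_sync)
    new_marker = rows[-1][3] if rows else last_sync
    return rows, new_marker
```

Running `sync_once` on a schedule and persisting the returned marker between runs keeps the search index close to the primary database without re-reading every row.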

In Elasticsearch (ES), every document was identified by its product id and had all relevant details indexed for the search. Since the product data was not confined to one language, multi-language and phonetic analyzers were used to improve search relevancy. To support partial-match (wildcard-style) searches, an ngram tokenizer was configured for the index and applied in the document mapping. Changes or additions to product details can be applied to the ES index with an upsert operation: if a document with the given product id already exists it is updated, otherwise the new document is inserted.
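As a rough illustration of these two ideas, the Python sketch below defines an index body with an ngram tokenizer and a phonetic filter, and builds the newline-delimited payload for a bulk upsert keyed by product id. It is a sketch under stated assumptions, not the project's actual mapping: the analyzer names, field names, and gram sizes are invented for the example, and the `phonetic` filter type assumes the analysis-phonetic plugin is installed on the cluster.

```python
import json

# Hypothetical index definition; names and gram sizes are illustrative.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "product_ngram": {
                    "type": "ngram",
                    "min_gram": 3,
                    "max_gram": 4,
                    "token_chars": ["letter", "digit"],
                }
            },
            "filter": {
                # Assumes the analysis-phonetic plugin is available.
                "product_phonetic": {
                    "type": "phonetic",
                    "encoder": "double_metaphone",
                    "replace": False,
                }
            },
            "analyzer": {
                "ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "product_ngram",
                    "filter": ["lowercase", "asciifolding"],
                },
                "phonetic_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding", "product_phonetic"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    "ngram": {"type": "text", "analyzer": "ngram_analyzer"},
                    "phonetic": {"type": "text", "analyzer": "phonetic_analyzer"},
                },
            },
            "restaurant_id": {"type": "keyword"},
            "zip_code": {"type": "keyword"},
        }
    },
}


def bulk_upsert_lines(index, products):
    """Build a bulk API payload that updates each product by id,
    or inserts it when no document with that id exists yet."""
    lines = []
    for product in products:
        doc = {k: v for k, v in product.items() if k != "id"}
        action = {"update": {"_index": index, "_id": str(product["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps({"doc": doc, "doc_as_upsert": True}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Using the product id as the document `_id` is what makes the upsert idempotent: re-importing the same row updates the existing document rather than creating a duplicate.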

After structuring the query DSL to extract restaurants that offer the searched product within a given zip code, a frontend UI was developed in React JS to visualize the new search feature. The implementation can be extended so that users can create custom queries based on multiple product attributes and metrics, such as distance to a restaurant, delivery time, and rating.
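A query of this kind could look roughly like the Python sketch below, which builds a query DSL body as a plain dictionary. It is an assumption-laden illustration rather than the query used in the project: the field names (`name`, `name.ngram`, `name.phonetic`, `restaurant_name`, `zip_code`, `restaurant_id`), the boosts, and the function name are all hypothetical.

```python
def build_search_query(term, zip_code, size=20):
    """Build a search body that finds restaurants in `zip_code`
    offering a product (or carrying a name) that matches `term`."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Exact filter on the delivery area; does not affect scoring.
                "filter": [{"term": {"zip_code": zip_code}}],
                "must": [
                    {
                        "multi_match": {
                            "query": term,
                            "type": "most_fields",
                            "fields": [
                                "name^3",
                                "name.ngram",
                                "name.phonetic",
                                "restaurant_name^2",
                            ],
                            # Tolerate small typos in the input text.
                            "fuzziness": "AUTO",
                        }
                    }
                ],
            }
        },
        # Return one hit per restaurant instead of one per product.
        "collapse": {"field": "restaurant_id"},
    }
```

Keeping the zip code in a `filter` clause and the text match in `must` separates the hard constraint from relevance scoring, and collapsing on the restaurant id turns a list of matching products into a list of restaurants.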

Finally, Terraform was chosen to provision the entire stack on AWS (Amazon Web Services), and the pilot project was conducted.

Results and Benefits

Along with the proposed architecture, a new global search for products and restaurants was introduced with improved features, which has a positive effect on user engagement. The ETL process using Logstash mirrors the live database, so the Elasticsearch data always lags only slightly behind new changes to the database. The ES document structure representing products is completely denormalized to speed up querying. Even though the project was envisioned as a proof of concept, the entire architecture was designed with scalability in mind and is ready for production.
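To show what a denormalized product document means in practice, the following Python sketch copies the restaurant attributes needed at query time into each product document, so a search never has to join two data sets. The field names and the function name are hypothetical and only stand in for whatever the real schema contains.

```python
def denormalize(products, restaurants):
    """Merge each product row with its restaurant row into one
    flat document suitable for indexing."""
    restaurants_by_id = {r["id"]: r for r in restaurants}
    documents = []
    for product in products:
        restaurant = restaurants_by_id[product["restaurant_id"]]
        documents.append(
            {
                "product_id": product["id"],
                "name": product["name"],
                "restaurant_id": restaurant["id"],
                "restaurant_name": restaurant["name"],
                "zip_code": restaurant["zip_code"],
            }
        )
    return documents
```

The trade-off is redundancy: a restaurant's name and zip code are repeated in every one of its product documents, so a change to the restaurant means re-indexing those documents.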

Being part of the next big iteration of the product makes us proud, and we are pleased that we have realized and demonstrated the advantages of using AWS Elasticsearch Service as the primary search engine.

About PCG

Public Cloud Group (PCG) supports companies in their digital transformation through the use of public cloud solutions.

With a product portfolio designed to accompany organisations of all sizes on their cloud journey, and highly qualified staff that clients and partners like to work with, PCG is positioned as a reliable and trustworthy partner for the hyperscalers, with repeatedly validated competence and credibility.

We have the highest partnership status with the three relevant hyperscalers: Amazon Web Services (AWS), Google, and Microsoft. As experienced providers, we advise our customers independently on cloud implementation, application development, and managed services.

