

Powering AI With Vector Databases: A Benchmark - Part II

By Pedro Moreira Costa

This is the second part of a 2-part article. You can read about the benchmark here.

Results Analysis

Understanding how each platform behaves is paramount. Once we established how to capture results under the same conditions for both studied engines, we could compare them fairly. Here, we draw some insights from this preliminary study by directly comparing their indexing and querying performance.


The lists below summarise the hardships encountered while implementing and automating indexation for all considered scenarios in each engine's Python client.


  • Weaviate

  • Allows explicit declaration of index parameters while building the class schema;
  • Implicitly indexes data in batch;
  • Class naming restrictions we haven't seen in other tools (e.g., it does not allow numbers or special characters such as "_" or "-")

  • Milvus

  • Of the studied engines, it offers the most indexing algorithms and metric types
  • Allows defining index file size for better batch operations
  • Explicitly indexes data in the collection
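To make the contrast concrete, the payloads below sketch how index configuration is typically shaped in each client: Weaviate declares the index inside the class schema, while Milvus 1.x declares collection-level parameters (including index file size) and builds the index explicitly afterwards. Field names and values here are illustrative assumptions, not our exact benchmark configuration.

```python
# Hypothetical configuration payloads, shaped after the two Python clients.

# Weaviate: index parameters are declared while building the class schema,
# and batched objects are then indexed implicitly on import.
weaviate_class_schema = {
    "class": "Product",          # naming restriction: no numbers or special characters
    "vectorIndexType": "hnsw",
    "vectorIndexConfig": {
        "efConstruction": 128,
        "maxConnections": 64,
    },
}

# Milvus 1.x: the collection declares dimension, metric type and index file
# size (useful for batch operations); the index itself is built explicitly.
milvus_collection_params = {
    "collection_name": "products",
    "dimension": 512,
    "index_file_size": 1024,     # segment size in MB
    "metric_type": "L2",
}
milvus_hnsw_params = {"M": 16, "efConstruction": 128}
```

In both cases the actual client call (schema creation or `create_index`) would receive these dictionaries; the key difference is where in the workflow indexation happens.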

The average indexing analysis is straightforward, with both engines displaying similar trends for each scenario. Milvus does have the edge, completing indexation faster than Weaviate in every scenario. This is especially evident in the most demanding scenario, S9, where Milvus was over sixteen times faster. The average indexing time bar charts for both engines are displayed below.

    Milvus 1.1.1 Average Indexing Time for Scenarios S1 through S9.

    Weaviate Average Indexing Time for Scenarios S1 through S9.


    There were no major differences between the technologies. The Milvus Python client provides a search method that receives a list of vectors, allowing multi-vector querying. The Weaviate Python client allows searching for only one vector at a time.
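The distinction can be sketched engine-agnostically. Below, a plain brute-force nearest-neighbour scan stands in for the engine's search call, purely for illustration: a Milvus-style interface accepts a list of query vectors in one call, while a Weaviate-style path must loop over single-vector searches.

```python
import math

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_one(index, query_vector, top_k):
    """Weaviate-style: one query vector per call; returns ids of nearest vectors."""
    ranked = sorted(range(len(index)), key=lambda i: l2(index[i], query_vector))
    return ranked[:top_k]

def search_many(index, query_vectors, top_k):
    """Milvus-style: one call receives a list of query vectors."""
    return [search_one(index, q, top_k) for q in query_vectors]

index = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
queries = [[0.1, 0.0], [4.0, 4.0]]

# One round trip for the whole batch vs. one round trip per query.
batched = search_many(index, queries, top_k=2)
looped = [search_one(index, q, top_k=2) for q in queries]
```

With a real client, the batched form saves a network round trip per query vector, which matters once a scenario issues many queries.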

    As was the case in the indexing time analysis, both engines display similar behaviour during querying. Note that before explicitly loading the scenario collections to query, in Milvus, we observed a warm-up effect that significantly impacted the average querying time for this system. Even after rectifying this, the system is still prone to irregular querying times. Still, it has a clear advantage over Weaviate, with a shorter average querying time for all considered scenarios. The average querying bar charts for each technology are listed below.
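Handling such a warm-up effect in a benchmark harness can be sketched as follows: issue a few unmeasured queries first, then average only the subsequent runs. `run_query` here is a hypothetical stand-in for the actual client search call.

```python
import time

def run_query():
    # Stand-in for a real client search call; replace with the engine's query.
    time.sleep(0.001)

def average_query_time(n_warmup=5, n_measured=20):
    """Average query latency, discarding warm-up runs.

    The warm-up runs absorb cache and lazy-loading effects (such as the one
    observed in Milvus before collections are loaded) so they do not skew
    the measured average.
    """
    for _ in range(n_warmup):
        run_query()
    start = time.perf_counter()
    for _ in range(n_measured):
        run_query()
    return (time.perf_counter() - start) / n_measured

avg = average_query_time()
```

Repeating the measured loop several times and reporting the spread, not just the mean, also helps surface the irregular querying times mentioned above.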

    Milvus 1.1.1 Average Querying Time for Scenarios S1 through S9.

    Weaviate Average Querying Time for Scenarios S1 through S9.


We set out on this survey to compare popular open-source Vector Similarity Engine (VSE) solutions that facilitate embedding search through approximate nearest-neighbour retrieval over high-dimensional vectors. These new frameworks enable a more efficient approach to storing vectorial data. However, they are still relatively recent and subject to ongoing fundamental feature development, such as horizontal scaling, pagination, sharding and GPU support.

Here at FARFETCH, we are hoping to accommodate a product catalogue of 300k to 5 million products, as well as hidden relations between products, between queries and products, and other customer impression data. To fulfil our vision, we need the VSE to support a large dataset with several vectorial representations and to remain efficient under multiple concurrent accesses. Therefore, it is vital to ensure that we meet the following criteria in this research:

• The VSE provides high-quality (accurate) results;
  • Rendered possible by low-level implementations of performant index types.
• The VSE indexes the embeddings in a timely fashion;
  • Rendered possible by low-level implementations of performant index types.
• The VSE executes queries at high speed;
  • Rendered possible by low-level implementations of performant index types.
• The VSE allows horizontal scaling to accommodate load balancing, and data replicas to protect against hardware failure while increasing service capacity;
  • Sharding.
• The VSE provides a top K results collection, over which the integrating system can iterate by a window of size N.
  • Pagination.

    This experiment focused chiefly on two engines: Milvus and Weaviate. Quality analysis of the retrieved results was not included in the context of this study, as it would require an additional configuration exploring:

    • the used embedding model(s) to encode information;
    • the index type in use.


    Given that the experimental setup depended on a similar configuration across the studied engines, we fixed the indexing algorithm to HNSW. For the time being, please refer to this blog article to understand the impact of the indexing algorithm on search speed, quality and memory.

    From the analysis of indexation and querying times, Milvus consistently outperformed Weaviate, most notably in the indexing time for scenario S9, which closely resembles the FARFETCH product catalogue's dimensions.

    These technologies did impose technical limitations, such as a lack of support for multiple encodings (effectively warranting a change to adapt scenarios S4 to S9). However, in [1], multi-vector querying refers to a query search based on a list of vectors and to the indexation of entities with more than one representation. Additional information on the tools' roadmaps shows that these (and other) features are to be expected in the next stable release.


    [1] Milvus: A Purpose-Built Vector Data Management System



    This work was partially funded by the iFetch project, Ref. 45920, co-financed by ERDF, COMPETE 2020, NORTE 2020 and FCT under CMU Portugal.
