
Asynchronous Functions in LangChain Milvus Integration

This tutorial explores how to leverage asynchronous functions in langchain-milvus to build high-performance applications. By using async methods, you can significantly improve your application's throughput and responsiveness, especially when dealing with large-scale retrieval. Whether you're building a real-time recommendation system, implementing semantic search in your application, or creating a RAG (Retrieval-Augmented Generation) pipeline, async operations can help you handle concurrent requests more efficiently. The high-performance vector database Milvus combined with LangChain's powerful LLM abstractions can provide a robust foundation for building scalable AI applications.

Async API Overview

langchain-milvus provides comprehensive asynchronous operation support, significantly improving performance in large-scale concurrent scenarios. The async API mirrors the design of the sync API, so switching between the two is straightforward.

Core Async Functions

To use async operations in langchain-milvus, simply add an "a" prefix to the corresponding sync method name. This allows better resource utilization and higher throughput when handling concurrent retrieval requests.

| Operation Type | Sync Method | Async Method | Description |
| --- | --- | --- | --- |
| Add Texts | add_texts() | aadd_texts() | Add texts to the vector store |
| Add Documents | add_documents() | aadd_documents() | Add documents to the vector store |
| Add Embeddings | add_embeddings() | aadd_embeddings() | Add embedding vectors |
| Similarity Search | similarity_search() | asimilarity_search() | Semantic search by text |
| Vector Search | similarity_search_by_vector() | asimilarity_search_by_vector() | Semantic search by vector |
| Search with Score | similarity_search_with_score() | asimilarity_search_with_score() | Semantic search by text, returning similarity scores |
| Vector Search with Score | similarity_search_with_score_by_vector() | asimilarity_search_with_score_by_vector() | Semantic search by vector, returning similarity scores |
| Diversity Search | max_marginal_relevance_search() | amax_marginal_relevance_search() | MMR search (returns similar results while also optimizing for diversity) |
| Vector Diversity Search | max_marginal_relevance_search_by_vector() | amax_marginal_relevance_search_by_vector() | MMR search by vector |
| Delete Operation | delete() | adelete() | Delete documents |
| Upsert Operation | upsert() | aupsert() | Upsert (update if existing, otherwise insert) documents |
| Metadata Search | search_by_metadata() | asearch_by_metadata() | Query with metadata filtering |
| Get Primary Keys | get_pks() | aget_pks() | Get primary keys by expression |
| Create from Texts | from_texts() | afrom_texts() | Create a vector store from texts |

For more detailed information about these functions, please refer to the API Reference.

Performance Benefits

Async operations provide significant performance improvements when handling large volumes of concurrent requests, making them particularly well suited for:

  • Batch document processing
  • High-concurrency search scenarios
  • Production RAG applications
  • Large-scale data import/export

In this tutorial, we'll demonstrate these performance benefits through detailed comparisons of synchronous and asynchronous operations, showing you how to leverage async APIs for optimal performance in your applications.

Before you begin

Code snippets on this page require the following dependencies:

[1]
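The contents of this cell are not shown above; a typical install cell for this tutorial would look like the following (package names inferred from the libraries used later):

```shell
pip install --upgrade --quiet langchain-milvus langchain-openai nest-asyncio
```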

If you are using Google Colab, you may need to restart the runtime to enable the newly installed dependencies (click the "Runtime" menu at the top of the screen and select "Restart session" from the dropdown).

We will use OpenAI models. You should set your API key as the OPENAI_API_KEY environment variable:

[2]
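The key-setup cell is not shown; a minimal sketch might look like this (the placeholder value is not a real key and must be replaced before calling OpenAI models):

```python
import os

# Use the key from the environment if it is already set; otherwise fall back
# to a placeholder so the rest of the notebook can at least be loaded.
os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder")
```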

If you are using Jupyter Notebook, you need to run this line of code before running the asynchronous code:

[3]
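The cell itself is not shown; the standard fix (an assumption based on common Jupyter practice) is nest_asyncio, which patches the already-running Jupyter event loop so asyncio.run() can be called from a cell:

```python
# Jupyter already runs an asyncio event loop, so a plain asyncio.run(...)
# raises "RuntimeError: asyncio.run() cannot be called from a running event
# loop". nest_asyncio patches the running loop to permit nested runs.
try:
    import nest_asyncio
    nest_asyncio.apply()
    patched = True
except ImportError:
    # Outside Jupyter (or without nest-asyncio installed) this is unnecessary.
    patched = False
```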

Exploring Async APIs and Performance Comparison

Now let's dive deeper into the performance comparison between synchronous and asynchronous operations with langchain-milvus.

First, import the necessary libraries:

[4]
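The import cell is not shown; based on the operations used later, it plausibly contains something like the following (the third-party imports are commented out so this sketch runs standalone):

```python
import asyncio
import random
import time

# Third-party imports used by the benchmark cells, assuming the dependencies
# installed earlier are available:
# from langchain_core.documents import Document
# from langchain_milvus import Milvus
# from langchain_openai import OpenAIEmbeddings
```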

Setting up Test Functions

Let's create helper functions to generate test data:

[5]
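The helper cell is not shown; a minimal sketch of a document generator follows (the function name and fields are assumptions):

```python
import random

def generate_test_documents(num_docs: int) -> list:
    """Generate simple documents with text content and metadata.

    In the real notebook each entry would be a
    langchain_core.documents.Document; plain dicts keep this sketch
    dependency-free.
    """
    categories = ["science", "technology", "history", "art", "sports"]
    docs = []
    for i in range(num_docs):
        category = random.choice(categories)
        docs.append({
            "page_content": f"Test document {i} about {category}.",
            "metadata": {"id": i, "category": category},
        })
    return docs
```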

Initialize the Vector Store

Before we can run our performance tests, we need to set up a clean Milvus vector store. This function ensures we start with a fresh collection for each test, eliminating any interference from previous data:

[6]
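The setup cell is not shown; a sketch using langchain_milvus.Milvus with drop_old=True to recreate the collection on every run (the URI and collection name are placeholders):

```python
def init_vector_store(uri: str = "http://localhost:19530"):
    """Create a fresh Milvus vector store, dropping any previous collection."""
    # Imports are deferred so this sketch can be defined without a running
    # Milvus server or OpenAI credentials.
    from langchain_milvus import Milvus
    from langchain_openai import OpenAIEmbeddings

    return Milvus(
        embedding_function=OpenAIEmbeddings(),
        collection_name="langchain_perf_test",
        connection_args={"uri": uri},
        auto_id=True,
        drop_old=True,  # start each benchmark from an empty collection
    )
```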

Async vs Sync: Add Documents

Now let's compare the performance of synchronous vs asynchronous document addition. These functions will help us measure how much faster async operations can be when adding multiple documents to the vector store. The async version creates tasks for each document addition and runs them concurrently, while the sync version processes documents one by one:

[7]
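The benchmark cell is not shown; a sketch of the two measurement helpers follows, taking the vector store and documents as arguments. The helper names are assumptions; aadd_documents and add_documents are the actual langchain-milvus methods from the table above:

```python
import asyncio
import time

async def async_add(vector_store, docs):
    """Add documents concurrently: one aadd_documents task per document."""
    start = time.perf_counter()
    tasks = [vector_store.aadd_documents([doc]) for doc in docs]
    await asyncio.gather(*tasks)
    return time.perf_counter() - start

def sync_add(vector_store, docs):
    """Add documents one at a time with the blocking add_documents."""
    start = time.perf_counter()
    for doc in docs:
        vector_store.add_documents([doc])
    return time.perf_counter() - start
```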

Now let's execute our performance tests with different document counts to see the real-world performance differences. We'll test with varying loads to understand how async operations scale compared to their synchronous counterparts. The tests will measure execution time for both approaches and help demonstrate the performance benefits of asynchronous operations:

[8]
2025-06-05 10:44:12,274 [DEBUG][_create_connection]: Created new connection using: dd5f77bb78964c079da42c2446b03bf6 (async_milvus_client.py:599)
Async add for 10 documents took 1.74 seconds
2025-06-05 10:44:16,940 [DEBUG][_create_connection]: Created new connection using: 8b13404a78654cdd9b790371eb44e427 (async_milvus_client.py:599)
Async add for 100 documents took 2.77 seconds
Sync add for 10 documents took 5.36 seconds
Sync add for 100 documents took 65.60 seconds
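Reproducing the numbers above requires a running Milvus instance and an OpenAI key, but the shape of the result can be demonstrated self-contained: the stub below replaces each network round trip with a short sleep, so the async path overlaps the waits while the sync path serializes them. Everything here is a simulation, not langchain-milvus code:

```python
import asyncio
import time

LATENCY = 0.01  # stand-in for one insert round trip to the server

class FakeVectorStore:
    """Stub that mimics the latency profile of a remote vector store."""

    async def aadd_documents(self, docs):
        await asyncio.sleep(LATENCY)  # non-blocking: other tasks can run

    def add_documents(self, docs):
        time.sleep(LATENCY)  # blocking: nothing else runs meanwhile

async def compare(num_docs: int):
    store = FakeVectorStore()

    start = time.perf_counter()
    await asyncio.gather(*(store.aadd_documents([i]) for i in range(num_docs)))
    async_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for i in range(num_docs):
        store.add_documents([i])
    sync_seconds = time.perf_counter() - start
    return async_seconds, sync_seconds

async_seconds, sync_seconds = asyncio.run(compare(50))
print(f"Simulated async add: {async_seconds:.2f}s, sync add: {sync_seconds:.2f}s")
```

The sync loop pays the full latency for every document, while asyncio.gather overlaps all the waits, which is the same effect driving the measured gap above.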

Async vs Sync: Search

For the search performance comparison, we'll need to populate the vector store first. The following functions will help us measure search performance by creating multiple concurrent search queries and comparing the execution time between synchronous and asynchronous approaches:

[9]
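The search-benchmark cell is not shown; a sketch of the two helpers follows (names are assumptions; asimilarity_search and similarity_search are the actual langchain-milvus methods):

```python
import asyncio
import time

async def async_search(vector_store, queries, k: int = 3):
    """Run all queries concurrently with asimilarity_search."""
    start = time.perf_counter()
    tasks = [vector_store.asimilarity_search(q, k=k) for q in queries]
    results = await asyncio.gather(*tasks)
    return results, time.perf_counter() - start

def sync_search(vector_store, queries, k: int = 3):
    """Run queries one after another with the blocking similarity_search."""
    start = time.perf_counter()
    results = [vector_store.similarity_search(q, k=k) for q in queries]
    return results, time.perf_counter() - start
```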

Now let's run comprehensive search performance tests to see how async operations scale compared to synchronous ones. We'll test with different query volumes to demonstrate the performance benefits of asynchronous operations, especially as the number of concurrent operations increases:

[10]
2025-06-05 10:45:28,131 [DEBUG][_create_connection]: Created new connection using: 851824591c64415baac843e676e78cdd (async_milvus_client.py:599)
Async search for 10 queries took 2.31 seconds
Async search for 100 queries took 3.72 seconds
Sync search for 10 queries took 6.07 seconds
Sync search for 100 queries took 54.22 seconds

Async vs Sync: Delete

Delete operations are another critical aspect where async operations can provide significant performance improvements. Let's create functions to measure the performance difference between synchronous and asynchronous delete operations. These tests will help demonstrate how async operations can handle batch deletions more efficiently:

[11]
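The delete-benchmark cell is not shown; a sketch of the two helpers follows, deleting documents by batches of primary keys (names are assumptions; adelete and delete are the actual langchain-milvus methods):

```python
import asyncio
import time

async def async_delete(vector_store, pk_batches):
    """Issue one adelete call per batch of primary keys, all concurrently."""
    start = time.perf_counter()
    tasks = [vector_store.adelete(ids=batch) for batch in pk_batches]
    await asyncio.gather(*tasks)
    return time.perf_counter() - start

def sync_delete(vector_store, pk_batches):
    """Issue the same deletes one at a time with the blocking delete."""
    start = time.perf_counter()
    for batch in pk_batches:
        vector_store.delete(ids=batch)
    return time.perf_counter() - start
```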

Now let's execute the delete performance tests to quantify the performance difference. We'll start with a fresh vector store populated with test data, then perform delete operations using both synchronous and asynchronous approaches:

[12]
2025-06-05 10:46:57,211 [DEBUG][_create_connection]: Created new connection using: 504e9ce3be92411e87077971c82baca2 (async_milvus_client.py:599)
Async delete for 10 operations took 0.58 seconds
2025-06-05 10:47:12,309 [DEBUG][_create_connection]: Created new connection using: 22c1513b444e4c40936e2176d7a1a154 (async_milvus_client.py:599)
Async delete for 100 operations took 0.61 seconds
Sync delete for 10 operations took 2.82 seconds
Sync delete for 100 operations took 29.21 seconds

Conclusion

This tutorial demonstrated the significant performance advantages of using asynchronous operations with LangChain and Milvus. We compared the synchronous and asynchronous versions of add, search, and delete operations, showing how async operations can provide substantial speed improvements, especially for large batch operations.

Key takeaways:

  1. Async operations deliver the most benefit when performing many individual operations that can run in parallel
  2. As concurrency grows, the performance gap between sync and async operations widens
  3. Async operations make better use of machine resources by overlapping network waits instead of blocking on them

When building production RAG applications with LangChain and Milvus, consider using the async API when performance is a concern, especially for concurrent operations.