Demo: Universal Query for Hybrid Retrieval
In this hands-on demo, we'll build a research paper discovery system using real arXiv data and Qdrant's Universal Query API. You'll see how to:
- Fetch real research papers from arXiv
- Generate dense, sparse, and ColBERT embeddings
- Execute hybrid retrieval with intelligent filtering
- Combine multiple search strategies in a single query
Setup & Dependencies
Install required packages for working with Qdrant, embeddings, and arXiv.
Step 1: Create the Collection
Configure our collection with three vector types for multi-stage retrieval:
- Dense vectors (384-dim) for semantic understanding
- Sparse vectors for exact keyword matching
- ColBERT multivectors (128-dim) for fine-grained reranking
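A collection with the three vector types listed above might be configured as follows. This is a minimal sketch, assuming the `qdrant-client` Python package. The vector names `dense`, `sparse`, and `colbert` match the ones used elsewhere in this demo, but the cosine distance and the choice to disable HNSW for the ColBERT vectors (`m=0`) are assumptions rather than settings confirmed by the demo. The helper name `create_research_collection` is hypothetical.

```python
COLLECTION_NAME = "research-papers"
DENSE_DIM = 384    # dense vector size stated in the demo
COLBERT_DIM = 128  # per-token ColBERT vector size stated in the demo


def create_research_collection(client, collection_name=COLLECTION_NAME):
    """Create a collection with dense, sparse, and ColBERT multivector fields."""
    # Imported inside the function so the constants above can be reused
    # even where qdrant-client is not installed.
    from qdrant_client import models

    client.create_collection(
        collection_name=collection_name,
        vectors_config={
            "dense": models.VectorParams(
                size=DENSE_DIM,
                distance=models.Distance.COSINE,  # assumption
            ),
            "colbert": models.VectorParams(
                size=COLBERT_DIM,
                distance=models.Distance.COSINE,  # assumption
                # One vector per token, compared with MaxSim.
                multivector_config=models.MultiVectorConfig(
                    comparator=models.MultiVectorComparator.MAX_SIM,
                ),
                # Assumption: ColBERT is only used for reranking,
                # so no HNSW graph is built for it.
                hnsw_config=models.HnswConfigDiff(m=0),
            ),
        },
        sparse_vectors_config={"sparse": models.SparseVectorParams()},
    )
```

With the client installed, calling `create_research_collection(QdrantClient(":memory:"))` should run the sketch against an in-process instance, with no server required.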
✓ Collection 'research-papers' created
Create Payload Indexes
Create indexes for the fields we'll filter by. Qdrant applies filters during the HNSW search itself rather than as a post-processing step, which is why payload indexes on the filtered fields matter.
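Index creation could look roughly like this. It is a sketch that assumes the `qdrant-client` package. The field names `research_area` and `published` are hypothetical, inferred from the labels shown in the results output, since the demo does not list its payload schema.

```python
# Hypothetical payload fields mapped to Qdrant payload schema types.
INDEXED_FIELDS = {
    "research_area": "keyword",  # exact-match filtering on a category label
    "published": "datetime",     # range filtering on publication date
}


def create_payload_indexes(client, collection_name="research-papers"):
    """Create one payload index per field in INDEXED_FIELDS."""
    from qdrant_client import models

    for field_name, schema in INDEXED_FIELDS.items():
        client.create_payload_index(
            collection_name=collection_name,
            field_name=field_name,
            # PayloadSchemaType is a string enum, so it can be built from its value.
            field_schema=models.PayloadSchemaType(schema),
        )
```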
✓ Payload indexes created
Step 2: Initialize Embedding Models
Load FastEmbed models for generating all three embedding types.
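Loading the three models might look like the sketch below. The demo does not name its models, so the model IDs here are assumptions: they are FastEmbed-supported models whose output sizes match the collection (384-dim dense, 128-dim ColBERT). The helper name `load_embedding_models` is hypothetical.

```python
# Assumed model IDs; swap in whichever models the collection was sized for.
DENSE_MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"  # 384-dim
SPARSE_MODEL_ID = "Qdrant/bm25"                            # keyword-style sparse vectors
COLBERT_MODEL_ID = "colbert-ir/colbertv2.0"                # 128-dim per token


def load_embedding_models():
    """Return (dense, sparse, colbert) FastEmbed model instances."""
    from fastembed import (
        LateInteractionTextEmbedding,
        SparseTextEmbedding,
        TextEmbedding,
    )

    # Each constructor downloads the model on first use and caches it locally.
    return (
        TextEmbedding(model_name=DENSE_MODEL_ID),
        SparseTextEmbedding(model_name=SPARSE_MODEL_ID),
        LateInteractionTextEmbedding(model_name=COLBERT_MODEL_ID),
    )
```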
Loading embedding models...
✓ All embedding models loaded
Step 3: Fetch Papers from arXiv
Let's search arXiv for papers about transformers and multimodal learning.
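One way to run such a search is with the `arxiv` Python package, sketched below. The query string and the newest-first sort order are assumptions. The default of 50 results mirrors the number of papers processed in this demo. The helper name `fetch_papers` is hypothetical.

```python
def fetch_papers(query="transformers multimodal learning", max_results=50):
    """Return a list of arxiv.Result objects for the given query."""
    import arxiv

    search = arxiv.Search(
        query=query,                                # assumed query text
        max_results=max_results,
        sort_by=arxiv.SortCriterion.SubmittedDate,  # assumption: newest first
    )
    # Client.results() yields lazily and handles paging and rate limiting.
    return list(arxiv.Client().results(search))
```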
Fetching papers from arXiv...
Step 4: Process and Ingest Papers
For each paper, we'll:
- Extract metadata (title, authors, abstract, date)
- Generate dense, sparse, and ColBERT embeddings
- Upload to Qdrant
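The three steps above can be sketched as follows, assuming `arxiv.Result`-like paper objects and FastEmbed models passed in by the caller. The payload keys and the helper names `paper_to_payload` and `paper_to_point` are hypothetical. The results in this demo also show a research-area label, but how it is derived is not shown, so it is left out here.

```python
def paper_to_payload(paper):
    """Extract title, authors, abstract, date, and URL from an arxiv.Result-like object."""
    return {
        "title": " ".join(paper.title.split()),  # collapse line breaks in titles
        "authors": [author.name for author in paper.authors],
        "abstract": paper.summary.strip(),
        "published": paper.published.isoformat(),
        "url": paper.entry_id,
    }


def paper_to_point(point_id, paper, dense_model, sparse_model, colbert_model):
    """Embed the abstract three ways and wrap everything in a PointStruct."""
    from qdrant_client import models

    payload = paper_to_payload(paper)
    text = payload["abstract"]

    # FastEmbed's embed() returns a generator; take the single result.
    dense = next(iter(dense_model.embed([text])))
    sparse = next(iter(sparse_model.embed([text])))
    colbert = next(iter(colbert_model.embed([text])))

    return models.PointStruct(
        id=point_id,
        payload=payload,
        vector={
            "dense": dense.tolist(),
            "sparse": models.SparseVector(
                indices=sparse.indices.tolist(),
                values=sparse.values.tolist(),
            ),
            "colbert": colbert.tolist(),  # one 128-dim vector per token
        },
    )
```

The resulting points can then be uploaded in a batch with `client.upsert(collection_name="research-papers", points=points)`.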
Processing paper 1...
Processing paper 11...
Processing paper 21...
Processing paper 31...
Processing paper 41...
✓ Processed 50 papers
Since we use FastEmbed, we could also create the points with alternative syntax using so-called local inference. Here's how that would look:
point = models.PointStruct(
    id=i,
    payload={
        ...  # same as above
    },
    vector={
        "dense": models.Document(
            model=DENSE_MODEL_ID,
            text=abstract,
        ),
        "sparse": models.Document(
            model=SPARSE_MODEL_ID,
            text=abstract,
        ),
        "colbert": models.Document(
            model=COLBERT_MODEL_ID,
            text=abstract,
        ),
    },
)
The client would then run FastEmbed locally to generate the embeddings during upload. The same syntax also works with Cloud Inference if you prefer to offload embedding generation to Qdrant Cloud.
✓ Uploaded 50 research papers to Qdrant
Step 5: Execute the Universal Query
Now we'll search for papers using hybrid retrieval that combines:
- Parallel dense and sparse search
- Reciprocal Rank Fusion (RRF)
- ColBERT reranking
- Global filtering (applied at every stage)
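To make the fusion step concrete, here is a small pure-Python illustration of Reciprocal Rank Fusion. It uses the textbook formula, score(d) = Σ 1 / (k + rank), with the commonly cited k = 60. Qdrant's internal constant and rank offset may differ, so treat this as an explanation of the idea rather than a reproduction of Qdrant's scoring. The document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        # Ranks start at 1; a document missing from a list contributes nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense_hits = ["a", "b", "c"]   # hypothetical dense search ranking
sparse_hits = ["c", "a", "d"]  # hypothetical sparse search ranking
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # → ['a', 'c', 'b', 'd']
```

Documents that rank well in both lists ("a" and "c") rise above those found by only one retriever, without any need to normalize the raw dense and sparse scores against each other.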
Define Global Filter
This filter will automatically propagate to all prefetch stages.
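A filter of this kind might be built as sketched below, assuming the `qdrant-client` package. The payload keys `research_area` and `published`, the example values, and the helper name `build_global_filter` are all hypothetical, since the demo's actual filter conditions are not shown.

```python
from datetime import datetime, timezone


def build_global_filter(areas, published_after):
    """Match papers in any of the given areas, published on or after a date."""
    from qdrant_client import models

    return models.Filter(
        must=[
            models.FieldCondition(
                key="research_area",  # hypothetical payload key
                match=models.MatchAny(any=list(areas)),
            ),
            models.FieldCondition(
                key="published",      # hypothetical payload key
                range=models.DatetimeRange(gte=published_after),
            ),
        ]
    )


# Example arguments (illustrative values only).
EXAMPLE_AREAS = ["computer_vision", "machine_learning"]
EXAMPLE_CUTOFF = datetime(2025, 10, 1, tzinfo=timezone.utc)
```

Calling `build_global_filter(EXAMPLE_AREAS, EXAMPLE_CUTOFF)` would return a `models.Filter` that can be passed to the query as `query_filter`.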
Build and Execute Multi-Stage Query
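The multi-stage query could be expressed roughly as follows. This is a sketch that assumes the `qdrant-client` package and query embeddings computed by the caller. The candidate limit of 100 per stage and the helper name `hybrid_search` are assumptions. The result limit of 10 matches the number of results reported in this demo.

```python
def hybrid_search(
    client,
    dense_query,     # list[float]
    sparse_query,    # models.SparseVector
    colbert_query,   # list[list[float]], one vector per query token
    query_filter=None,
    collection_name="research-papers",
    limit=10,
    candidates=100,  # assumed size of each candidate pool
):
    """Dense + sparse prefetch, RRF fusion, then ColBERT reranking."""
    from qdrant_client import models

    return client.query_points(
        collection_name=collection_name,
        prefetch=models.Prefetch(
            # Stage 1: two candidate searches over the dense and sparse vectors.
            prefetch=[
                models.Prefetch(query=dense_query, using="dense", limit=candidates),
                models.Prefetch(query=sparse_query, using="sparse", limit=candidates),
            ],
            # Stage 2: merge both candidate lists with Reciprocal Rank Fusion.
            query=models.FusionQuery(fusion=models.Fusion.RRF),
            limit=candidates,
        ),
        # Stage 3: rerank the fused candidates with ColBERT multivectors.
        query=colbert_query,
        using="colbert",
        query_filter=query_filter,  # top-level filter for the whole query
        limit=limit,
        with_payload=True,
    )
```

The returned response exposes the ranked hits as `response.points`, each with a `score` and a `payload`.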
✓ Query executed successfully
Found 10 results
Results
Let's examine the top papers discovered by our hybrid retrieval system.
====================================================================================================
TOP RESEARCH PAPERS
====================================================================================================

1. Collaborative Text-to-Image Generation via Multi-Agent Reinforcement Learning and Semantic Fusion
   Authors: Jiabao Shi, Minfeng Qi, Lefeng Zhang...
   Published: 2025-10-12
   Research Area: computer_vision
   Relevance Score: 25.1145
   arXiv: http://arxiv.org/abs/2510.10633v1
   Abstract: Multimodal text-to-image generation remains constrained by the difficulty of maintaining semantic alignment and professional-level detail across diverse visual domains. We propose a multi-agent reinfo...

2. Beyond Appearance: Transformer-based Person Identification from Conversational Dynamics
   Authors: Masoumeh Chapariniya, Teodora Vukovic, Sarah Ebling...
   Published: 2025-10-06
   Research Area: machine_learning
   Relevance Score: 24.5158
   arXiv: http://arxiv.org/abs/2510.04753v1
   Abstract: This paper investigates the performance of transformer-based architectures for person identification in natural, face-to-face conversation scenario. We implement and evaluate a two-stream framework th...

3. CAT: Curvature-Adaptive Transformers for Geometry-Aware Learning
   Authors: Ryan Y. Lin, Siddhartha Ojha, Nicholas Bai
   Published: 2025-10-02
   Research Area: computer_vision
   Relevance Score: 22.4639
   arXiv: http://arxiv.org/abs/2510.01634v1
   Abstract: Transformers achieve strong performance across diverse domains but implicitly assume Euclidean geometry in their attention mechanisms, limiting their effectiveness on data with non-Euclidean structure...

4. BitMar: Low-Bit Multimodal Fusion with Episodic Memory for Edge Devices
   Authors: Euhid Aman, Esteban Carlin, Hsing-Kuo Pao...
   Published: 2025-10-12
   Research Area: computer_vision
   Relevance Score: 22.2900
   arXiv: http://arxiv.org/abs/2510.10560v1
   Abstract: Cross-attention transformers and other multimodal vision-language models excel at grounding and generation; however, their extensive, full-precision backbones make it challenging to deploy them on edg...

5. Complementary and Contrastive Learning for Audio-Visual Segmentation
   Authors: Sitong Gong, Yunzhi Zhuge, Lu Zhang...
   Published: 2025-10-11
   Research Area: computer_vision
   Relevance Score: 22.2407
   arXiv: http://arxiv.org/abs/2510.10051v1
   Abstract: Audio-Visual Segmentation (AVS) aims to generate pixel-wise segmentation maps that correlate with the auditory signals of objects. This field has seen significant progress with numerous CNN and Transf...

6. BioAutoML-NAS: An End-to-End AutoML Framework for Multimodal Insect Classification via Neural Architecture Search on Large-Scale Biodiversity Data
   Authors: Arefin Ittesafun Abian, Debopom Sutradhar, Md Rafi Ur Rashid...
   Published: 2025-10-07
   Research Area: computer_vision
   Relevance Score: 20.8009
   arXiv: http://arxiv.org/abs/2510.05888v1
   Abstract: Insect classification is important for agricultural management and ecological research, as it directly affects crop health and production. However, this task remains challenging due to the complex cha...

7. Towards fairer public transit: Real-time tensor-based multimodal fare evasion and fraud detection
   Authors: Peter Wauyo, Dalia Bwiza, Alain Murara...
   Published: 2025-10-02
   Research Area: computer_vision
   Relevance Score: 20.6940
   arXiv: http://arxiv.org/abs/2510.02165v1
   Abstract: This research introduces a multimodal system designed to detect fraud and fare evasion in public transportation by analyzing closed circuit television (CCTV) and audio data. The proposed solution uses...

8. Provable Speech Attributes Conversion via Latent Independence
   Authors: Jonathan Svirsky, Ofir Lindenbaum, Uri Shaham
   Published: 2025-10-06
   Research Area: computer_vision
   Relevance Score: 20.4618
   arXiv: http://arxiv.org/abs/2510.05191v2
   Abstract: While signal conversion and disentangled representation learning have shown promise for manipulating data attributes across domains such as audio, image, and multimodal generation, existing approaches...

9. A Spatial-Spectral-Frequency Interactive Network for Multimodal Remote Sensing Classification
   Authors: Hao Liu, Yunhao Gao, Wei Li...
   Published: 2025-10-06
   Research Area: computer_vision
   Relevance Score: 20.4072
   arXiv: http://arxiv.org/abs/2510.04628v1
   Abstract: Deep learning-based methods have achieved significant success in remote sensing Earth observation data analysis. Numerous feature fusion techniques address multimodal remote sensing image classificati...

10. Growing Visual Generative Capacity for Pre-Trained MLLMs
    Authors: Hanyu Wang, Jiaming Han, Ziyan Yang...
    Published: 2025-10-02
    Research Area: computer_vision
    Relevance Score: 20.3669
    arXiv: http://arxiv.org/abs/2510.01546v1
    Abstract: Multimodal large language models (MLLMs) extend the success of language models to visual understanding, and recent efforts have sought to build unified MLLMs that support both understanding and genera...