Voyage X LanceDB
Voyage AI embeddings provide high-precision, domain-specific vector representations across text and multimodal data, while LanceDB offers a fast, persistent vector database for efficient storage and retrieval. Together, they form a practical stack for semantic search, letting developers build context-aware applications with little computational overhead.
This notebook walks through a semantic search example that uses Voyage AI's text and multimodal embeddings. The resulting search system covers both captions and images, so the same data can be queried with either text or an image.
Install Dependencies
Load dataset
This example uses Google Research's Conceptual Captions dataset.
About Dataset
Dataset Summary
Conceptual Captions is a dataset consisting of ~3.3M images annotated with captions. In contrast with the curated style of other image caption annotations, Conceptual Captions images and their raw descriptions are harvested from the web, and therefore represent a wider variety of styles. More precisely, the raw descriptions are harvested from the Alt-text HTML attribute associated with web images. To arrive at the current version of the captions, we have developed an automatic pipeline that extracts, filters, and transforms candidate image/caption pairs, with the goal of achieving a balance of cleanliness, informativeness, fluency, and learnability of the resulting captions.
We'll load the validation split of this dataset, which contains about 15k records. You can try the train split too, but first filter the records so that only working image URLs remain. Checking which image URLs are reachable is a time-consuming step.
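The URL filtering mentioned above could be sketched as follows. This is a minimal illustration using only the standard library, not the notebook's own code: the helper names `url_is_reachable` and `filter_reachable`, the timeout, and the worker count are all assumptions. Checks run in a thread pool because each one mostly waits on the network.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def url_is_reachable(url, timeout=5.0):
    """Return True if a HEAD request to `url` answers with a 2xx status."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except Exception:
        # DNS errors, timeouts, 4xx/5xx responses, and malformed URLs
        # all count as "unreachable" in this sketch.
        return False


def filter_reachable(urls, check=url_is_reachable, max_workers=16):
    """Keep only the URLs for which `check` returns True, preserving order."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        flags = list(pool.map(check, urls))
    return [url for url, ok in zip(urls, flags) if ok]
```

Applied to the `image_url` column, `filter_reachable` returns the subset of URLs worth keeping. Some servers reject HEAD requests, so a stricter pipeline might fall back to a small GET before discarding a URL.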
Index(['image_url', 'caption'], dtype='object')
Set the Voyage API key as an environment variable
Add your Voyage API key as a secret in Google Colab. If you don't have one, you can sign up for one here (with 200M free tokens): https://dash.voyageai.com
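One way to do this is sketched below. It assumes the key is exposed through the `VOYAGE_API_KEY` environment variable and that the Colab secret uses the same name; the helper `set_voyage_api_key` is hypothetical and is not part of any library.

```python
import os


def set_voyage_api_key(key=None):
    """Expose the Voyage API key through the VOYAGE_API_KEY environment variable."""
    if key is None:
        try:
            # google.colab is only importable inside Google Colab.
            from google.colab import userdata

            # Assumption: the Colab secret is named VOYAGE_API_KEY.
            key = userdata.get("VOYAGE_API_KEY")
        except ImportError:
            # Outside Colab, fall back to whatever is already in the environment.
            key = os.environ.get("VOYAGE_API_KEY")
    if not key:
        raise ValueError("No Voyage API key found")
    os.environ["VOYAGE_API_KEY"] = key
    return key
```

Calling `set_voyage_api_key()` with no argument in Colab reads the secret. Passing a key explicitly is convenient for local runs, but avoid hard-coding a real key in a shared notebook.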
Voyage's Text model
Create a LanceDB table to index the data for querying
This step may take some time, as the data is ingested into LanceDB in batches.
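The batching pattern can be sketched as below. The helpers `batched` and `ingest_in_batches` are hypothetical names, and the batch size of 100 is an arbitrary choice. The only thing assumed about `table` is that it has an `add` method accepting a list of row dictionaries, as a LanceDB table does.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield successive lists of at most `batch_size` items from `records`."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def ingest_in_batches(table, records, batch_size=100):
    """Add `records` to `table` one batch at a time; return the row count."""
    total = 0
    for batch in batched(records, batch_size):
        # With an embedding function attached to the table schema,
        # each add() call is also where the batch would get embedded.
        table.add(batch)
        total += len(batch)
    return total
```

Smaller batches keep each embedding request short at the cost of more round trips, so the right size depends on the model's rate limits and on caption length.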
Query indexed data with Voyage's voyage-3 text model
Caption: cat on a vintage chair in a sunny room
Caption: young cute cat resting on leather sofa .
Caption: cat lying down on the floor with paws up
Caption: mountains soar above villages along the shores
Caption: snow on the distant mountains across the bay
Caption: unique natural landscape the shore .
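To make the ranking behind results like these concrete, here is a conceptual sketch of nearest-neighbour search over embedded captions. It is an illustration only. LanceDB's actual distance metric and index may differ, the helper names are hypothetical, and real embeddings have far more than two dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, rows, k=3):
    """Return the captions of the k rows most similar to `query_vector`."""
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(query_vector, row["vector"]),
        reverse=True,
    )
    return [row["caption"] for row in ranked[:k]]


# Toy 2-D vectors, made up for illustration (not real embeddings).
toy_rows = [
    {"caption": "cat on a vintage chair in a sunny room", "vector": [1.0, 0.1]},
    {"caption": "snow on the distant mountains across the bay", "vector": [0.0, 1.0]},
    {"caption": "young cute cat resting on leather sofa .", "vector": [0.9, 0.3]},
]
cat_like_results = top_k([1.0, 0.0], toy_rows, k=2)
```

In this toy setup, a query vector pointing along the first axis ranks the two cat captions ahead of the mountain caption. This mirrors how a text query is embedded and then matched against stored caption vectors.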
Voyage's Multimodal model
Create a LanceDB table with the multimodal model to index the data, so it can be queried with either text or an image
This step may take some time, as the data is ingested into LanceDB in batches.
Query the indexed data with Voyage's voyage-multimodal-3 model, using both text and image queries
Caption: in good company : person poses on the red carpet alongside actors
Caption: celebrity at person on the gala .
Caption: on the red carpet for the premiere of her movie
Now let's query with an image, again using Voyage's multimodal model
Caption: on the red carpet for the premiere of her movie
Caption: model modeling an evening gown .
Caption: noble person in a glamorous maxi dress during her tour .