Milvus Multimodal Demo May30

alph-notebooks/milvus-bootcamp / multimodal_demo_may30.ipynb

Download data

The data is roughly 25K .jpg images drawn from two existing datasets.

images.csv: metadata from Unsplash, sorted and converted to CSV.
images/: the images in 250x250 resolution, by kaggle/@jettchentt.
images.fbin: a binary file with UForm image embeddings.
images.usearch: a binary file with a serialized USearch index.

The original images.tsv from Unsplash has been filtered to avoid missing images.

👉🏼 Download images.zip file directly from:
https://huggingface.co/datasets/unum-cloud/ann-unsplash-25k/tree/main

[1]

(24292, 31)
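The shape above (24,292 rows by 31 columns) is what you would expect from loading images.csv after dropping rows without a matching image file. A minimal standard-library sketch of that kind of load follows. It is not the notebook's own cell: the `photo_id` column name and the `<photo_id>.jpg` file naming are assumptions, so check the CSV header and the images/ folder of your copy.

```python
import csv
from pathlib import Path


def load_metadata(csv_path, image_dir, id_column="photo_id"):
    """Read the metadata CSV and keep only rows whose image file exists.

    `id_column` is an assumed column name, and images are assumed to be
    stored as `<id>.jpg` inside `image_dir`.
    Returns the kept rows and a (row_count, column_count) shape tuple.
    """
    image_dir = Path(image_dir)
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []
        rows = [
            row for row in reader
            if (image_dir / f"{row[id_column]}.jpg").is_file()
        ]
    return rows, (len(rows), len(columns))


# Example call (paths are placeholders):
# rows, shape = load_metadata("images.csv", "images")
# print(shape)
```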

Create a Milvus Collection

[2]

Pymilvus: 2.4.3
Milvus server: v2.4.1

[3]

Successfully dropped collection: `Demo_multimodal`
Successfully created collection: `Demo_multimodal`
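The drop-then-create messages above suggest a cell that resets the collection on every run. A hedged sketch of that pattern, written against the `MilvusClient` methods `has_collection`, `drop_collection`, and `create_collection`, is shown here. The helper name `recreate_collection`, the server URI, and the dimension value are illustrative assumptions. The dimension must match the output size of the encoder you use.

```python
def recreate_collection(client, name, dimension, metric_type="COSINE"):
    """Drop `name` if it already exists, then create it again.

    `client` is expected to behave like pymilvus.MilvusClient.
    `dimension` must equal the embedding size of your encoder.
    """
    if client.has_collection(collection_name=name):
        client.drop_collection(collection_name=name)
        print(f"Successfully dropped collection: `{name}`")
    # Quick-setup creation: an id field plus one vector field.
    client.create_collection(
        collection_name=name,
        dimension=dimension,
        metric_type=metric_type,
        auto_id=True,
    )
    print(f"Successfully created collection: `{name}`")


# Example call (URI and dimension are placeholders, not taken from the notebook):
# from pymilvus import MilvusClient
# client = MilvusClient(uri="http://localhost:19530")
# recreate_collection(client, "Demo_multimodal", dimension=256)
```

Passing the client in as an argument keeps the helper easy to try against a local server, a managed instance, or a test double.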

Run inference with the embedding model

This demo uses Unum's UForm pocket-sized multimodal encoders.

They support text-to-image queries in 21 languages, including English, German, Spanish, French, Italian, Russian, Japanese, Korean, Turkish, Chinese, and Polish.
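Text-to-image search works because a multimodal encoder maps both text and images into one shared vector space, so a text query vector can be compared directly with image vectors. The toy sketch below illustrates only that comparison step with cosine similarity. The two-dimensional vectors and file names are invented for illustration and are not UForm output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_images(query_vec, image_vecs, top_k=3):
    """Return the names of the `top_k` images most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in image_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Made-up embeddings: the "text" vector points in the same direction as cat.jpg.
toy_images = {"cat.jpg": [0.9, 0.1], "dog.jpg": [0.1, 0.9]}
print(rank_images([1.0, 0.0], toy_images, top_k=1))
```

In the demo itself this comparison is delegated to Milvus, which runs it over an index instead of a Python loop.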

[4]

[5]

2024-05-28 20:22:01.881705 [W:onnxruntime:, helper.cc:67 IsInputSupported] CoreML does not support input dim > 16384. Input:word_embeddings.weight_quantized, shape: {250037,384}
2024-05-28 20:22:01.882543 [W:onnxruntime:, coreml_execution_provider.cc:81 GetCapability] CoreMLExecutionProvider::GetCapability, number of partitions supported by CoreML: 74 number of nodes in the graph: 714 number of nodes supported by CoreML: 483
2024-05-28 20:22:03.051915 [W:onnxruntime:, coreml_execution_provider.cc:81 GetCapability] CoreMLExecutionProvider::GetCapability, number of partitions supported by CoreML: 100 number of nodes in the graph: 1056 number of nodes supported by CoreML: 727

[6]

Context leak detected, msgtracer returned -1 (repeated 4 times)

Embedding time for batch size 10: 5.01 seconds
Embedding time for batch size 10: 4.25 seconds
Embedding time for batch size 10: 5.03 seconds
Embedding time for batch size 10: 4.65 seconds
Embedding time for batch size 10: 4.88 seconds
Embedding time for batch size 10: 5.41 seconds
Embedding time for batch size 10: 5.39 seconds
Embedding time for batch size 10: 4.47 seconds
Embedding time for batch size 10: 2.64 seconds
Image error: ./images/_bQFVR3DF68.jpg
Embedding time for batch size 9: 4.45 seconds
Embedding time for batch size 10: 5.61 seconds
Embedding time for batch size 10: 5.26 seconds
Embedding time for batch size 10: 5.29 seconds
Embedding time for batch size 10: 4.98 seconds
Image error: ./images/_GZBJppR7Hk.jpg
Embedding time for batch size 9: 3.66 seconds
Embedding time for batch size 10: 4.44 seconds
Embedding time for batch size 10: 4.2 seconds
Embedding time for batch size 10: 3.33 seconds
Embedding time for batch size 10: 4.53 seconds
Embedding time for batch size 10: 4.35 seconds
Embedding time for batch size 10: 3.41 seconds
Embedding time for batch size 10: 4.11 seconds
Embedding time for batch size 10: 4.47 seconds
Embedding time for batch size 10: 3.81 seconds
Embedding time for batch size 10: 5.43 seconds
Embedding time for batch size 10: 5.0 seconds
Embedding time for batch size 10: 4.51 seconds
Embedding time for batch size 10: 5.38 seconds
Embedding time for batch size 10: 4.19 seconds
Embedding time for batch size 10: 2.47 seconds
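The log above shows 30 batches: 28 of size 10 and 2 of size 9, where one image in the batch failed to load. That gives 298 embedded images in total. A loop with this behavior could look like the sketch below. The `load_image` and `embed_batch` callables are hypothetical stand-ins for the real image reader and the encoder call, which this export does not show.

```python
import time


def embed_in_batches(paths, load_image, embed_batch, batch_size=10):
    """Return (path, embedding) pairs, skipping images that fail to load.

    `load_image(path)` and `embed_batch(images)` are caller-supplied
    stand-ins for the real image reader and encoder.
    """
    results = []
    for start in range(0, len(paths), batch_size):
        t0 = time.perf_counter()
        batch_paths, images = [], []
        for path in paths[start:start + batch_size]:
            try:
                images.append(load_image(path))
                batch_paths.append(path)
            except Exception:
                # A broken file shrinks the batch instead of stopping the run.
                print(f"Image error: {path}")
        if images:
            vectors = embed_batch(images)
            results.extend(zip(batch_paths, vectors))
        elapsed = time.perf_counter() - t0
        print(f"Embedding time for batch size {len(images)}: "
              f"{round(elapsed, 2)} seconds")
    return results
```

Catching the error per image rather than per batch means one unreadable file costs a single row, not ten.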

[7]

output fields: ['id', 'chunk', 'image_filepath']

[8]

black and white nike athletic shoe

green leafed plant

red rose flowers

a white dog sitting in the snow looking at the camera

cloudy sky during golden hour

selective focus photography of waterfalls during daytime

a close up of a cat with an open mouth

bird's-eye photography of pine trees covered by snow

grey mountains during sunset

woman with wings willow tree figurine

Now the fun part: search!

[9]

Count rows: 298
timing: 0.0066 seconds
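The row count of 298 matches the number of images that embedded successfully. One way to obtain such a count is a `count(*)` query through `MilvusClient.query`, sketched below. Whether the notebook used this call, and what exactly its timing line measured, is not visible in this export, so treat both as assumptions.

```python
import time


def count_rows(client, collection_name):
    """Return (row_count, elapsed_seconds) for a collection.

    `client` is expected to behave like pymilvus.MilvusClient.
    """
    t0 = time.time()
    res = client.query(
        collection_name=collection_name,
        filter="",
        output_fields=["count(*)"],
    )
    elapsed = time.time() - t0
    return res[0]["count(*)"], elapsed


# Example call (client setup is a placeholder):
# n, seconds = count_rows(client, "Demo_multimodal")
# print(f"Count rows: {n}")
# print(f"timing: {round(seconds, 4)} seconds")
```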

[10]

[11]

a close up of a cat with an open mouth

[17]

Milvus search time: 0.0044231414794921875 seconds
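Search times like the one above can be measured by wrapping the `MilvusClient.search` call in a timer. The sketch below assumes the query has already been embedded into `query_vector` by the same encoder used for the images. The helper name `timed_search` and the `top_k` default are illustrative. The output field names come from the fields listed earlier in this notebook.

```python
import time


def timed_search(client, collection_name, query_vector, top_k=3,
                 output_fields=("chunk", "image_filepath")):
    """Run one vector search and return a list of (distance, entity) pairs.

    `client` is expected to behave like pymilvus.MilvusClient, whose
    search() returns one list of hits per query vector.
    """
    t0 = time.time()
    results = client.search(
        collection_name=collection_name,
        data=[query_vector],          # a single query vector
        limit=top_k,
        output_fields=list(output_fields),
    )
    elapsed = time.time() - t0
    print(f"Milvus search time: {elapsed} seconds")
    hits = results[0]                 # hits for the only query
    return [(hit["distance"], hit["entity"]) for hit in hits]
```

The same helper serves text-to-image and image-to-image queries, since both reduce to a vector once embedded.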

[16]

Milvus search time: 0.0057070255279541016 seconds

[19]

silhouette of person sitting on rock formation during golden hour

[21]

Milvus search time: 0.004724025726318359 seconds

[27]

<Figure size 640x480 with 0 Axes>

[15]

Author: Christy Bergman

Python implementation: CPython
Python version       : 3.11.8
IPython version      : 8.22.2

torch   : 2.3.0
pymilvus: 2.4.3
uform   : 3.0.2

conda environment: py311-unum