Multimodal video search using CLIP and LanceDB
We used LanceDB to store one frame every thirty seconds, plus the title, for 13,000+ videos: 5 random videos from each top category of the YouTube-8M dataset. We then used the CLIP model to embed frames and titles together. With LanceDB, we can run embedding, keyword, and SQL search over these videos.
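To make the thirty-second sampling concrete, here is a minimal sketch of which timestamps would be picked for a video of a given length. The helper name `frame_timestamps` is hypothetical, and starting at 0 seconds is an assumption; the notebook does not show how the frames were actually extracted.

```python
import math


def frame_timestamps(duration_s: float, interval_s: float = 30.0) -> list[float]:
    """Return the timestamps (in seconds) at which a frame would be sampled.

    Hypothetical helper: assumes the first frame is taken at t=0 and that
    sampling stops before the end of the video.
    """
    if duration_s < 0 or interval_s <= 0:
        raise ValueError("duration must be >= 0 and interval must be > 0")
    # Number of sampling points that fall strictly before the end of the video.
    count = math.ceil(duration_s / interval_s)
    return [i * interval_s for i in range(count)]


# A 95-second video yields frames at 0, 30, 60 and 90 seconds.
print(frame_timestamps(95))
```

Each sampled frame would then be stored as one row next to its video title, which is what allows a single table to serve embedding, keyword, and SQL queries.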
Install dependencies
[4]
⏬ Downloading https://github.com/jaimergp/miniforge/releases/download/24.11.2-1_colab/Miniforge3-colab-24.11.2-1_colab-Linux-x86_64.sh...
📦 Installing...
📌 Adjusting configuration...
🩹 Patching environment...
⏲ Done in 0:00:10
🔁 Restarting kernel...
Channels:
- conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /usr/local/envs/py310
added / updated specs:
- python=3.10
The following packages will be downloaded:
package | build
---------------------------|-----------------
bzip2-1.0.8 | hda65f42_8 254 KB conda-forge
ca-certificates-2025.8.3 | hbd8a1cb_0 151 KB conda-forge
ld_impl_linux-64-2.44 | h1423503_1 660 KB conda-forge
libexpat-2.7.1 | hecca717_0 73 KB conda-forge
libffi-3.4.6 | h2dba641_1 56 KB conda-forge
libgcc-15.1.0 | h767d61c_5 805 KB conda-forge
libgcc-ng-15.1.0 | h69a702a_5 29 KB conda-forge
libgomp-15.1.0 | h767d61c_5 437 KB conda-forge
liblzma-5.8.1 | hb9d3cd8_2 110 KB conda-forge
libnsl-2.0.1 | hb9d3cd8_1 33 KB conda-forge
libsqlite-3.50.4 | h0c1763c_0 911 KB conda-forge
libuuid-2.41.1 | he9a06e4_0 36 KB conda-forge
ncurses-6.5 | h2d0b736_3 871 KB conda-forge
openssl-3.5.3 | h26f9b46_0 3.0 MB conda-forge
pip-25.2 | pyh8b19718_0 1.1 MB conda-forge
python-3.10.18 |hd6af730_0_cpython 23.9 MB conda-forge
readline-8.2 | h8c095d6_2 276 KB conda-forge
setuptools-80.9.0 | pyhff2d567_0 731 KB conda-forge
tk-8.6.13 |noxft_hd72426e_102 3.1 MB conda-forge
tzdata-2025b | h78e105d_0 120 KB conda-forge
------------------------------------------------------------
Total: 36.6 MB
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
bzip2 conda-forge/linux-64::bzip2-1.0.8-hda65f42_8
ca-certificates conda-forge/noarch::ca-certificates-2025.8.3-hbd8a1cb_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.44-h1423503_1
libexpat conda-forge/linux-64::libexpat-2.7.1-hecca717_0
libffi conda-forge/linux-64::libffi-3.4.6-h2dba641_1
libgcc conda-forge/linux-64::libgcc-15.1.0-h767d61c_5
libgcc-ng conda-forge/linux-64::libgcc-ng-15.1.0-h69a702a_5
libgomp conda-forge/linux-64::libgomp-15.1.0-h767d61c_5
liblzma conda-forge/linux-64::liblzma-5.8.1-hb9d3cd8_2
libnsl conda-forge/linux-64::libnsl-2.0.1-hb9d3cd8_1
libsqlite conda-forge/linux-64::libsqlite-3.50.4-h0c1763c_0
libuuid conda-forge/linux-64::libuuid-2.41.1-he9a06e4_0
libxcrypt conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
libzlib conda-forge/linux-64::libzlib-1.3.1-hb9d3cd8_2
ncurses conda-forge/linux-64::ncurses-6.5-h2d0b736_3
openssl conda-forge/linux-64::openssl-3.5.3-h26f9b46_0
pip conda-forge/noarch::pip-25.2-pyh8b19718_0
python conda-forge/linux-64::python-3.10.18-hd6af730_0_cpython
readline conda-forge/linux-64::readline-8.2-h8c095d6_2
setuptools conda-forge/noarch::setuptools-80.9.0-pyhff2d567_0
tk conda-forge/linux-64::tk-8.6.13-noxft_hd72426e_102
tzdata conda-forge/noarch::tzdata-2025b-h78e105d_0
wheel conda-forge/noarch::wheel-0.45.1-pyhd8ed1ab_1
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
[5]
Collecting tantivy==0.20.1
  Downloading tantivy-0.20.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.3 kB)
Downloading tantivy-0.20.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 53.7 MB/s 0:00:00
Installing collected packages: tantivy
Successfully installed tantivy-0.20.1
First run setup: Download data and pre-process
[2]
[3]
--2025-09-18 18:26:25--  https://vectordb-recipes.s3.us-west-2.amazonaws.com/multimodal_video_lance.tar.gz
Resolving vectordb-recipes.s3.us-west-2.amazonaws.com (vectordb-recipes.s3.us-west-2.amazonaws.com)... 3.5.77.185, 52.92.241.66, 3.5.78.17, ...
Connecting to vectordb-recipes.s3.us-west-2.amazonaws.com (vectordb-recipes.s3.us-west-2.amazonaws.com)|3.5.77.185|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 239974228 (229M) [application/x-gzip]
Saving to: ‘multimodal_video_lance.tar.gz.1’

multimodal_video_la 100%[===================>] 228.86M  10.3MB/s    in 16s

2025-09-18 18:26:41 (14.5 MB/s) - ‘multimodal_video_lance.tar.gz.1’ saved [239974228/239974228]

multimodal_video.lance/ multimodal_video.lance/.DS_Store multimodal_video.lance/_versions/ multimodal_video.lance/_latest.manifest multimodal_video.lance/data/ multimodal_video.lance/_indices/ multimodal_video.lance/_indices/tantivy/ multimodal_video.lance/_indices/tantivy/9b278a8cf1c14b7f8c4c92a40bb75163.store multimodal_video.lance/_indices/tantivy/e0a7ab7a679242bb93b9e451252bb0ab.idx multimodal_video.lance/_indices/tantivy/a0fc72fda7d24f8b9042ea17b111d00e.term multimodal_video.lance/_indices/tantivy/2d0c8f633d1e4bdf8dfebb72a577da01.pos multimodal_video.lance/_indices/tantivy/d6777856977d44c0a7be9249e27ecbf9.fieldnorm multimodal_video.lance/_indices/tantivy/2130bc17dd0f4563a9578658b4ca8725.term multimodal_video.lance/_indices/tantivy/9b278a8cf1c14b7f8c4c92a40bb75163.fast multimodal_video.lance/_indices/tantivy/34bc7d3e86b34d78b50bde2ed0d78142.fieldnorm multimodal_video.lance/_indices/tantivy/9b278a8cf1c14b7f8c4c92a40bb75163.fieldnorm multimodal_video.lance/_indices/tantivy/e0a7ab7a679242bb93b9e451252bb0ab.fieldnorm multimodal_video.lance/_indices/tantivy/a0fc72fda7d24f8b9042ea17b111d00e.fast multimodal_video.lance/_indices/tantivy/2130bc17dd0f4563a9578658b4ca8725.fast multimodal_video.lance/_indices/tantivy/5ae00c69b7a54e2eb11b771cd589bdef.store
multimodal_video.lance/_indices/tantivy/9b278a8cf1c14b7f8c4c92a40bb75163.term multimodal_video.lance/_indices/tantivy/ec2a0fec9b5644d0a1688909ae4ab401.idx multimodal_video.lance/_indices/tantivy/2130bc17dd0f4563a9578658b4ca8725.pos multimodal_video.lance/_indices/tantivy/.tantivy-meta.lock multimodal_video.lance/_indices/tantivy/2d0c8f633d1e4bdf8dfebb72a577da01.fieldnorm multimodal_video.lance/_indices/tantivy/a0fc72fda7d24f8b9042ea17b111d00e.fieldnorm multimodal_video.lance/_indices/tantivy/5325ec58020041caa3445d73d265e7fb.pos multimodal_video.lance/_indices/tantivy/a0fc72fda7d24f8b9042ea17b111d00e.pos multimodal_video.lance/_indices/tantivy/d6777856977d44c0a7be9249e27ecbf9.pos multimodal_video.lance/_indices/tantivy/2130bc17dd0f4563a9578658b4ca8725.fieldnorm multimodal_video.lance/_indices/tantivy/3683f0cf34f94bfba076d75045148da3.term multimodal_video.lance/_indices/tantivy/e0a7ab7a679242bb93b9e451252bb0ab.store multimodal_video.lance/_indices/tantivy/34bc7d3e86b34d78b50bde2ed0d78142.pos multimodal_video.lance/_indices/tantivy/9b278a8cf1c14b7f8c4c92a40bb75163.pos multimodal_video.lance/_indices/tantivy/5ae00c69b7a54e2eb11b771cd589bdef.fast multimodal_video.lance/_indices/tantivy/a6c6c3c429b041f59827f5ba80b82c8e.pos multimodal_video.lance/_indices/tantivy/3683f0cf34f94bfba076d75045148da3.fieldnorm multimodal_video.lance/_indices/tantivy/3683f0cf34f94bfba076d75045148da3.idx multimodal_video.lance/_indices/tantivy/.tantivy-writer.lock multimodal_video.lance/_indices/tantivy/5ae00c69b7a54e2eb11b771cd589bdef.term multimodal_video.lance/_indices/tantivy/3683f0cf34f94bfba076d75045148da3.store multimodal_video.lance/_indices/tantivy/a6c6c3c429b041f59827f5ba80b82c8e.fieldnorm multimodal_video.lance/_indices/tantivy/5ae00c69b7a54e2eb11b771cd589bdef.pos multimodal_video.lance/_indices/tantivy/af7082fa8060442095731134023712ef.idx multimodal_video.lance/_indices/tantivy/5325ec58020041caa3445d73d265e7fb.fieldnorm 
multimodal_video.lance/_indices/tantivy/3683f0cf34f94bfba076d75045148da3.fast multimodal_video.lance/_indices/tantivy/d6777856977d44c0a7be9249e27ecbf9.fast multimodal_video.lance/_indices/tantivy/2d0c8f633d1e4bdf8dfebb72a577da01.fast multimodal_video.lance/_indices/tantivy/a6c6c3c429b041f59827f5ba80b82c8e.store multimodal_video.lance/_indices/tantivy/2d0c8f633d1e4bdf8dfebb72a577da01.store multimodal_video.lance/_indices/tantivy/e0a7ab7a679242bb93b9e451252bb0ab.fast multimodal_video.lance/_indices/tantivy/e0a7ab7a679242bb93b9e451252bb0ab.pos multimodal_video.lance/_indices/tantivy/af7082fa8060442095731134023712ef.fast multimodal_video.lance/_indices/tantivy/5ae00c69b7a54e2eb11b771cd589bdef.fieldnorm multimodal_video.lance/_indices/tantivy/5325ec58020041caa3445d73d265e7fb.fast multimodal_video.lance/_indices/tantivy/2d0c8f633d1e4bdf8dfebb72a577da01.idx multimodal_video.lance/_indices/tantivy/5325ec58020041caa3445d73d265e7fb.store multimodal_video.lance/_indices/tantivy/af7082fa8060442095731134023712ef.term multimodal_video.lance/_indices/tantivy/ec2a0fec9b5644d0a1688909ae4ab401.pos multimodal_video.lance/_indices/tantivy/2130bc17dd0f4563a9578658b4ca8725.idx multimodal_video.lance/_indices/tantivy/5325ec58020041caa3445d73d265e7fb.term multimodal_video.lance/_indices/tantivy/d6777856977d44c0a7be9249e27ecbf9.store multimodal_video.lance/_indices/tantivy/5325ec58020041caa3445d73d265e7fb.idx multimodal_video.lance/_indices/tantivy/d6777856977d44c0a7be9249e27ecbf9.term multimodal_video.lance/_indices/tantivy/2d0c8f633d1e4bdf8dfebb72a577da01.term multimodal_video.lance/_indices/tantivy/ec2a0fec9b5644d0a1688909ae4ab401.fieldnorm multimodal_video.lance/_indices/tantivy/e0a7ab7a679242bb93b9e451252bb0ab.term multimodal_video.lance/_indices/tantivy/34bc7d3e86b34d78b50bde2ed0d78142.store multimodal_video.lance/_indices/tantivy/34bc7d3e86b34d78b50bde2ed0d78142.fast multimodal_video.lance/_indices/tantivy/d6777856977d44c0a7be9249e27ecbf9.idx 
multimodal_video.lance/_indices/tantivy/2130bc17dd0f4563a9578658b4ca8725.store multimodal_video.lance/_indices/tantivy/ec2a0fec9b5644d0a1688909ae4ab401.store multimodal_video.lance/_indices/tantivy/a0fc72fda7d24f8b9042ea17b111d00e.idx multimodal_video.lance/_indices/tantivy/af7082fa8060442095731134023712ef.store multimodal_video.lance/_indices/tantivy/a0fc72fda7d24f8b9042ea17b111d00e.store multimodal_video.lance/_indices/tantivy/a6c6c3c429b041f59827f5ba80b82c8e.idx multimodal_video.lance/_indices/tantivy/af7082fa8060442095731134023712ef.fieldnorm multimodal_video.lance/_indices/tantivy/ec2a0fec9b5644d0a1688909ae4ab401.term multimodal_video.lance/_indices/tantivy/9b278a8cf1c14b7f8c4c92a40bb75163.idx multimodal_video.lance/_indices/tantivy/34bc7d3e86b34d78b50bde2ed0d78142.idx multimodal_video.lance/_indices/tantivy/.managed.json multimodal_video.lance/_indices/tantivy/a6c6c3c429b041f59827f5ba80b82c8e.fast multimodal_video.lance/_indices/tantivy/ec2a0fec9b5644d0a1688909ae4ab401.fast multimodal_video.lance/_indices/tantivy/3683f0cf34f94bfba076d75045148da3.pos multimodal_video.lance/_indices/tantivy/a6c6c3c429b041f59827f5ba80b82c8e.term multimodal_video.lance/_indices/tantivy/34bc7d3e86b34d78b50bde2ed0d78142.term multimodal_video.lance/_indices/tantivy/af7082fa8060442095731134023712ef.pos multimodal_video.lance/_indices/tantivy/5ae00c69b7a54e2eb11b771cd589bdef.idx multimodal_video.lance/_indices/tantivy/meta.json multimodal_video.lance/data/43a4100e-575d-4bdd-9c47-df8acb14a577.lance multimodal_video.lance/_versions/1.manifest
mv: cannot move 'multimodal_video.lance' to 'data/video-lancedb/multimodal_video.lance': Directory not empty
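The `.tar.gz.1` filename and the `Directory not empty` error in the log above are what one would expect when this setup cell runs a second time after the dataset is already in place. A minimal sketch of a rerun-safe version of the step, using only the standard library, might look like the following. The function names `download_once` and `install_dataset` are hypothetical; the URL and the `data/video-lancedb` destination are taken from the log, and the archive is assumed to contain a top-level `multimodal_video.lance/` directory, as the listing suggests.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the wget log above.
DATA_URL = "https://vectordb-recipes.s3.us-west-2.amazonaws.com/multimodal_video_lance.tar.gz"


def download_once(url: str, archive: Path) -> Path:
    """Download `url` to `archive` unless the file is already there."""
    if not archive.exists():
        archive.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, archive)
    return archive


def install_dataset(archive: Path, dest_dir: Path,
                    name: str = "multimodal_video.lance") -> Path:
    """Extract `archive` into `dest_dir` unless `dest_dir/name` already exists."""
    target = dest_dir / name
    if target.exists():
        # Already installed by an earlier run: skip instead of failing.
        return target
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Extract straight into the destination, so no separate move is needed.
    # The archive is assumed to be trusted.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_dir)
    return target


# Intended usage (not executed here because it needs network access):
#   archive = download_once(DATA_URL, Path("multimodal_video_lance.tar.gz"))
#   install_dataset(archive, Path("data/video-lancedb"))
```

Extracting directly into the destination avoids the separate move that failed in the log, and the existence checks make a second run a no-op.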
Create / Open LanceDB Table
[6]
Create CLIP embedding function for the text
CLIP model architecture.
[7]
/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_auth.py:94: UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
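CLIP places text and images in one shared embedding space, which is why a text query can be scored against stored frame embeddings. The sketch below illustrates that idea in plain Python with made-up three-dimensional vectors; real CLIP embeddings have hundreds of dimensions and would come from the model. The names `cosine_similarity` and `rank_frames` and the frame ids are hypothetical, and cosine similarity is one common choice of score, not necessarily the metric the notebook uses. In the notebook itself, LanceDB performs the nearest-neighbour search.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_frames(query_vec: list[float],
                frame_vecs: dict[str, list[float]],
                k: int = 2) -> list[str]:
    """Return the ids of the k frames most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), frame_id)
              for frame_id, vec in frame_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [frame_id for _, frame_id in scored[:k]]


# Toy data: stand-ins for a text-query embedding and three frame embeddings.
frames = {
    "video_a@00:30": [0.9, 0.1, 0.0],
    "video_b@01:00": [0.1, 0.9, 0.1],
    "video_c@00:00": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
print(rank_frames(query, frames, k=2))
```

With these toy vectors the two frames pointing in nearly the same direction as the query rank first.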
Search functions for Gradio
[8]
[9]
Setup Gradio interface
[10]
It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://9e77f0361545e16880.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)