Image Description Prompting Pixtral
Image Description Extraction using Mistral's Pixtral API
Image Description Extraction using Mistral's Pixtral API
In this notebook, we'll use the Mistral API to extract structured image descriptions in JSON format using the Pixtral-12b-2409 model. We'll send an image URL and prompt the model to return key elements with descriptions.
Prerequisites
Make sure you have an API key for the Mistral AI platform. We'll also show you how to load it from environment variables.
Collecting mistralai Downloading mistralai-1.9.11-py3-none-any.whl.metadata (39 kB) Collecting eval-type-backport>=0.2.0 (from mistralai) Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB) Requirement already satisfied: httpx>=0.28.1 in /usr/local/lib/python3.12/dist-packages (from mistralai) (0.28.1) Collecting invoke<3.0.0,>=2.2.0 (from mistralai) Downloading invoke-2.2.1-py3-none-any.whl.metadata (3.3 kB) Requirement already satisfied: pydantic>=2.10.3 in /usr/local/lib/python3.12/dist-packages (from mistralai) (2.11.10) Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.12/dist-packages (from mistralai) (2.9.0.post0) Requirement already satisfied: pyyaml<7.0.0,>=6.0.2 in /usr/local/lib/python3.12/dist-packages (from mistralai) (6.0.3) Requirement already satisfied: typing-inspection>=0.4.0 in /usr/local/lib/python3.12/dist-packages (from mistralai) (0.4.2) Requirement already satisfied: anyio in /usr/local/lib/python3.12/dist-packages (from httpx>=0.28.1->mistralai) (4.11.0) Requirement already satisfied: certifi in /usr/local/lib/python3.12/dist-packages (from httpx>=0.28.1->mistralai) (2025.10.5) Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.12/dist-packages (from httpx>=0.28.1->mistralai) (1.0.9) Requirement already satisfied: idna in /usr/local/lib/python3.12/dist-packages (from httpx>=0.28.1->mistralai) (3.11) Requirement already satisfied: h11>=0.16 in /usr/local/lib/python3.12/dist-packages (from httpcore==1.*->httpx>=0.28.1->mistralai) (0.16.0) Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.12/dist-packages (from pydantic>=2.10.3->mistralai) (0.7.0) Requirement already satisfied: pydantic-core==2.33.2 in /usr/local/lib/python3.12/dist-packages (from pydantic>=2.10.3->mistralai) (2.33.2) Requirement already satisfied: typing-extensions>=4.12.2 in /usr/local/lib/python3.12/dist-packages (from pydantic>=2.10.3->mistralai) (4.15.0) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.12/dist-packages (from python-dateutil>=2.8.2->mistralai) (1.17.0) Requirement already satisfied: sniffio>=1.1 in /usr/local/lib/python3.12/dist-packages (from anyio->httpx>=0.28.1->mistralai) (1.3.1) Downloading mistralai-1.9.11-py3-none-any.whl (442 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 442.8/442.8 kB 23.0 MB/s eta 0:00:00 Downloading eval_type_backport-0.2.2-py3-none-any.whl (5.8 kB) Downloading invoke-2.2.1-py3-none-any.whl (160 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 160.3/160.3 kB 12.3 MB/s eta 0:00:00 Installing collected packages: invoke, eval-type-backport, mistralai Successfully installed eval-type-backport-0.2.2 invoke-2.2.1 mistralai-1.9.11
Setup
We'll load the Mistral API key from environment variables and initialize the client. Make sure your API key is saved in your environment variables as MISTRAL_API_KEY.
env: MISTRAL_API_KEY=
Sending Image URL for Description
We'll prompt the model to describe the image by providing an image URL. The response will be returned in a structured JSON format with the key elements described.
{"elements": [{"element": "Eiffel Tower", "description": "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower."}]}
Parsing the JSON Response
We'll now parse the JSON response from the API and print the elements and their corresponding descriptions.
Element: Eiffel Tower Description: The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower.
Conclusion
In this notebook, we used the Mistral Pixtral model to describe an image by sending an image URL and receiving a structured JSON response. The descriptions provided by the model offer insights into the key elements of the image.