
Commit 0ed91a3

add new multimodal guide, clarify search fragment interpolation
1 parent f1562b2 commit 0ed91a3

3 files changed: +247 −10 lines


docs.json

Lines changed: 1 addition & 0 deletions
@@ -167,6 +167,7 @@
   "learn/ai_powered_search/getting_started_with_ai_search",
   "learn/ai_powered_search/configure_rest_embedder",
   "learn/ai_powered_search/document_template_best_practices",
+  "learn/ai_powered_search/image_search_with_multimodal_embeddings",
   "learn/ai_powered_search/image_search_with_user_provided_embeddings",
   "learn/ai_powered_search/search_with_user_provided_embeddings",
   "learn/ai_powered_search/retrieve_related_search_results",
Lines changed: 224 additions & 0 deletions
@@ -0,0 +1,224 @@
---
title: Image search with multimodal embeddings
description: This article shows you the main steps for performing multimodal text-to-image searches
---

This guide shows the main steps to search through a database of images using Meilisearch's experimental multimodal embeddings.

## Requirements

- A database of images
- A Meilisearch project
- Access to a multimodal embedding provider (for example, [VoyageAI multimodal embeddings](https://docs.voyageai.com/reference/multimodal-embeddings-api))

## Enable multimodal embeddings

First, enable the `multimodal` experimental feature:

```sh
curl \
  -X PATCH 'MEILISEARCH_URL/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "multimodal": true
  }'
```

You can also enable multimodal embeddings in your Meilisearch Cloud project's general settings, under "Experimental features".

## Configure a multimodal embedder

Much like other embedders, multimodal embedders must set their `source` to `rest`. Depending on your chosen provider, you may also have to specify `url`, `model`, and `apiKey`.

All multimodal embedders must contain an `indexingFragments` field and a `searchFragments` field. Fragments are sets of embeddings built out of specific parts of document data.

Fragments must follow the structure defined by your chosen model and provider.

### `indexingFragments`

Use `indexingFragments` to tell Meilisearch what data to use when generating document embeddings.

For example, when using VoyageAI's multimodal model, an indexing fragment might look like this:

```json
"indexingFragments": {
  "TEXTUAL_FRAGMENT_NAME": {
    "value": {
      "content": [
        {
          "type": "text",
          "text": "A document named {{doc.title}} described as {{doc.description}}"
        }
      ]
    }
  },
  "IMAGE_FRAGMENT_NAME": {
    "value": {
      "content": [
        {
          "type": "image_url",
          "image_url": "{{doc.poster_url}}"
        }
      ]
    }
  }
}
```

The example above requests that Meilisearch create two sets of embeddings during indexing: one for the textual description of an image, and another for the image itself.

Each fragment must have one field with a Liquid template where you interpolate document data present in `doc`. In `IMAGE_FRAGMENT_NAME`, that's `image_url`, which outputs the plain URL string in the document field `poster_url`. In `TEXTUAL_FRAGMENT_NAME`, `text` contains a longer string contextualizing two document fields, `title` and `description`.
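
To make the substitution concrete, here is an illustrative sketch of what the Liquid rendering step produces for the two indexing fragments above. The `render` helper and the sample document are hypothetical stand-ins for Meilisearch's internal template engine, not part of its API:

```python
import re

def render(template: str, doc: dict) -> str:
    """Illustrative stand-in for Liquid rendering:
    replaces each {{doc.FIELD}} with the document's field value."""
    return re.sub(
        r"\{\{\s*doc\.(\w+)\s*\}\}",
        lambda m: str(doc[m.group(1)]),
        template,
    )

doc = {
    "title": "Night Hike",
    "description": "a mountain sunset with snow",
    "poster_url": "https://example.com/night-hike.jpg",
}

# Textual fragment: two fields embedded into one contextualized sentence
print(render("A document named {{doc.title}} described as {{doc.description}}", doc))
# → "A document named Night Hike described as a mountain sunset with snow"

# Image fragment: the template resolves to the plain URL string
print(render("{{doc.poster_url}}", doc))
# → "https://example.com/night-hike.jpg"
```

Each rendered fragment is what gets sent to the embedding provider for vectorization.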

### `searchFragments`

Use `searchFragments` to tell Meilisearch what data to use when converting user queries into embeddings:

```json
"searchFragments": {
  "USER_TEXT_FRAGMENT": {
    "value": {
      "content": [
        {
          "type": "text",
          "text": "{{q}}"
        }
      ]
    }
  },
  "USER_SUBMITTED_IMAGE_FRAGMENT": {
    "value": {
      "content": [
        {
          "type": "image_base64",
          "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
        }
      ]
    }
  }
}
```

This configuration tells Meilisearch how to submit query data for vectorization.

Search fragments have access to data present in the query parameters `media` and `q`.

### Complete embedder configuration

With all fragments and embedding provider data in place, your embedder should look similar to this example:

```sh
curl \
  -X PATCH 'MEILISEARCH_URL/indexes/INDEX_NAME/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "MULTIMODAL_EMBEDDER_NAME": {
        "source": "rest",
        "url": "https://api.voyageai.com/v1/multimodal-embeddings",
        "apiKey": "VOYAGE_API_KEY",
        "indexingFragments": {
          "TEXTUAL_FRAGMENT_NAME": {
            "value": {
              "content": [
                {
                  "type": "text",
                  "text": "A document named {{doc.title}} described as {{doc.description}}"
                }
              ]
            }
          },
          "IMAGE_FRAGMENT_NAME": {
            "value": {
              "content": [
                {
                  "type": "image_url",
                  "image_url": "{{doc.poster_url}}"
                }
              ]
            }
          }
        },
        "searchFragments": {
          "USER_TEXT_FRAGMENT": {
            "value": {
              "content": [
                {
                  "type": "text",
                  "text": "{{q}}"
                }
              ]
            }
          },
          "USER_SUBMITTED_IMAGE_FRAGMENT": {
            "value": {
              "content": [
                {
                  "type": "image_base64",
                  "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
                }
              ]
            }
          }
        }
      }
    }
  }'
```

## Add documents

Once your embedder is configured, you can [add documents to your index](/learn/getting_started/cloud_quick_start) with the [`/documents` endpoint](/reference/api/documents).

During indexing, Meilisearch will automatically generate multimodal embeddings for each document using the configured `indexingFragments`.
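
For the fragments above to render, each document must contain the fields their templates reference: `title`, `description`, and `poster_url`. A hypothetical document payload, with invented values for illustration, might look like this:

```json
[
  {
    "id": 1,
    "title": "Night Hike",
    "description": "a mountain sunset with snow above the tree line",
    "poster_url": "https://example.com/images/night-hike.jpg"
  }
]
```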

## Perform searches

The final step is to perform searches using different types of content.

### Use text to search for images

Use the following search query to retrieve a mix of documents with images matching the description and documents containing the specified keywords:

```sh
curl -X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "a mountain sunset with snow",
    "hybrid": {
      "embedder": "MULTIMODAL_EMBEDDER_NAME"
    }
  }'
```

### Use an image to search for images

You can also use an image to search for other, similar images:

```sh
curl -X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "media": {
      "image": {
        "mime": "image/jpeg",
        "data": "<BASE64_ENCODED_IMAGE>"
      }
    },
    "hybrid": {
      "embedder": "MULTIMODAL_EMBEDDER_NAME"
    }
  }'
```

<Tip>
In most cases, you will need a GUI that allows users to submit their images and converts them to Base64 format. Creating this interface is outside the scope of this guide.
</Tip>
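
As a sketch of that client-side step, here is one way to assemble the `media` parameter from raw image bytes in Python. The `build_media_payload` helper is hypothetical; it simply produces the `mime` and `data` values that the `USER_SUBMITTED_IMAGE_FRAGMENT` template above reassembles into a data URI:

```python
import base64
import json

def build_media_payload(image_bytes: bytes, mime: str) -> dict:
    """Builds the `media` search parameter: the search fragment
    recombines `mime` and `data` into a data URI for the provider."""
    return {
        "image": {
            "mime": mime,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }

# Example request body for the image-to-image search shown above
payload = {
    "media": build_media_payload(b"\xff\xd8\xff\xe0fake-jpeg-bytes", "image/jpeg"),
    "hybrid": {"embedder": "MULTIMODAL_EMBEDDER_NAME"},
}
print(json.dumps(payload, indent=2))
```

In a real application, `image_bytes` would come from a file upload or a read of a local file.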

## Conclusion

With multimodal embedders, you can:

1. Configure Meilisearch to vectorize both images and queries
2. Add image documents, with Meilisearch automatically generating embeddings
3. Accept text or image input from users
4. Run hybrid searches using a mix of text and other types of media, or run pure semantic searches using only non-textual input

reference/api/settings.mdx

Lines changed: 22 additions & 10 deletions
@@ -2974,19 +2974,31 @@ curl \
 As with `indexingFragments`, the content of `value` should follow your model's specification.

-Use Liquid templates to interpolate search query data into the fragment fields, where `media` gives you access to all multimodal data received with a query:
+Use Liquid templates to interpolate search query data into the fragment fields, where `{{media.*}}` gives you access to all [multimodal data received with a query](/reference/api/search#media) and `{{q}}` gives you access to the regular textual query:

 ```json
-"SEARCH_FRAGMENT_A": {
-  "value": {
-    "content": [
-      {
-        "type": "image_base64",
-        "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
-      }
-    ]
+{
+  "SEARCH_FRAGMENT_A": {
+    "value": {
+      "content": [
+        {
+          "type": "image_base64",
+          "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+        }
+      ]
+    }
+  },
+  "SEARCH_FRAGMENT_B": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "{{q}}"
+        }
+      ]
+    }
   }
-},
+}
 ```

 `searchFragments` is optional when using the `rest` source.
