Overview
The Search API enables vector similarity search (VSS) and hybrid text search across datasets. It returns the most relevant matches based on cosine similarity between the input text and stored embeddings, using embedding models configured in your runtime.
Search Endpoint
Request Headers
Content-Type: Must be application/json.
Spice-Cache-Key: Optional cache key for client-specific caching. When provided, responses will include a Vary: Spice-Cache-Key header for CDN caching.
Request Body
datasets: List of dataset names to search. Each dataset must have an embedding column and an appropriate embedding model loaded.
text: The search query text. This will be embedded and used for similarity matching.
where: SQL WHERE clause to filter results (e.g., user=1234321, created_at > '2024-01-01').
additional_columns: Additional columns to include in the response data (e.g., ["timestamp", "user_id"]).
limit: Maximum number of results to return. Must be greater than 0.
keywords: Keywords for hybrid search (combines vector similarity with keyword matching).
Request Example
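A request body might look like the following, sent to the runtime's HTTP search endpoint (typically POST /v1/search; verify the path against your runtime version). The dataset name, query text, and filter values are illustrative:

```json
{
  "datasets": ["app_messages"],
  "text": "Tell me about the performance of the app",
  "where": "user_id = '1234321'",
  "additional_columns": ["timestamp"],
  "limit": 5
}
```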
Response
matches: Array of matching results sorted by relevance score (highest first). Each result contains:
- the matched column values (the fields that triggered the match)
- dataset: the name of the dataset the result came from
- primary_key: the primary key values identifying the record
- any additional column data requested via additional_columns
- score: the relevance score (0-1), where higher values indicate better matches; based on cosine similarity
duration_ms: Total search execution time in milliseconds
Response Headers
Cache status header for the search results:
- hit - Results served from cache
- miss - Results computed and cached
- bypass - Cache bypassed
Vary: Set to Spice-Cache-Key when a client cache key is provided, enabling CDN caching per user.
Response Example
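A response of the shape described above might look like the following. The top-level matches array and duration_ms follow the field descriptions in this document; the key holding the matched column values (shown here as matches) and the exact per-result keys may differ by runtime version:

```json
{
  "matches": [
    {
      "matches": { "body": "The app responds in under 100ms for most queries." },
      "dataset": "app_messages",
      "primary_key": { "id": "6fd5a215-0881-421d-ace0-b293b83452b5" },
      "metadata": { "timestamp": "2024-08-26T23:15:42Z" },
      "score": 0.85
    }
  ],
  "duration_ms": 42
}
```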
Status Codes
- 200 OK - Search completed successfully
- 400 Bad Request - Invalid request parameters or dataset not configured for search
- 500 Internal Server Error - Unexpected error during search
Error Responses
No Datasets Provided (400)
Invalid Limit (400)
Dataset Not Configured for Search (400)
Internal Server Error (500)
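The exact error body format is not specified here; as a hypothetical illustration, a 400 response for a request with no datasets might carry a message such as:

```json
{ "error": "No datasets provided in search request" }
```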
Examples
Basic Vector Search
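A minimal request needs only datasets, text, and limit (dataset name and query are illustrative):

```json
{
  "datasets": ["documents"],
  "text": "how to configure data acceleration",
  "limit": 3
}
```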
Search with Filters
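Adding a where clause restricts the search to rows matching a SQL filter before similarity ranking (values are illustrative):

```json
{
  "datasets": ["support_tickets"],
  "text": "login fails with timeout",
  "where": "created_at > '2024-01-01'",
  "limit": 5
}
```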
Hybrid Search with Keywords
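Supplying keywords combines vector similarity with keyword matching, as described in the request body fields above (keywords shown are illustrative):

```json
{
  "datasets": ["documents"],
  "text": "tuning vector search performance",
  "keywords": ["acceleration", "index"],
  "limit": 5
}
```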
Search with Additional Columns
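additional_columns requests extra column values alongside each match (column names are illustrative):

```json
{
  "datasets": ["app_messages"],
  "text": "slow startup on mobile",
  "additional_columns": ["timestamp", "user_id"],
  "limit": 5
}
```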
Multi-Dataset Search
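Listing several datasets searches all of them in one request; results from every dataset are ranked together by score (dataset names are illustrative):

```json
{
  "datasets": ["documents", "support_tickets"],
  "text": "reset a forgotten password",
  "limit": 10
}
```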
Search with Cache Key
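The body is an ordinary search request; the client-specific caching comes from sending the Spice-Cache-Key request header (for example, set it to the user's ID, such as Spice-Cache-Key: u-42) so cached results are scoped per user. The values below are illustrative:

```json
{
  "datasets": ["app_messages"],
  "text": "recent billing questions",
  "where": "user_id = 'u-42'",
  "limit": 5
}
```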
Use Cases
Semantic Document Search
Search through large document collections using natural language queries.
Customer Support Ticket Search
Find similar support tickets to route or resolve issues faster.
E-commerce Product Search
Find products using natural language descriptions.
RAG (Retrieval Augmented Generation)
Retrieve relevant context for LLM prompts.
Prerequisites
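For RAG, a request like the following can retrieve context passages to prepend to an LLM prompt; the matched text and any requested additional columns (such as a title or source URL, illustrative here) become the grounding context:

```json
{
  "datasets": ["knowledge_base"],
  "text": "What is our refund policy for annual plans?",
  "additional_columns": ["title", "url"],
  "limit": 4
}
```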
Before using the Search API:
- Configure embedding columns in your datasets
- Load embedding models (e.g., text-embedding-ada-002, all-MiniLM-L6-v2)
- Enable acceleration for better search performance (recommended)
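As an illustrative sketch only, a spicepod configuration pairing an embedding model with a dataset column might look like the following; the exact schema (component names, nesting) varies by runtime version, so consult the Datasets documentation before relying on this shape:

```yaml
embeddings:
  # Hypothetical local embedding model component
  - from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
    name: local_embeddings

datasets:
  - from: file://data/documents.parquet   # illustrative source
    name: documents
    columns:
      - name: body
        embeddings:
          # Embed the 'body' column with the model defined above
          - from: local_embeddings
```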
Performance Considerations
- Acceleration: Enable dataset acceleration for significantly faster search
- Limit: Use appropriate limits to balance relevance vs. response time
- Caching: Leverage cache keys for frequently repeated searches
- Filters: Use where clauses to reduce the search space
- Batch Processing: For multiple searches, consider parallel requests
Related APIs
- HTTP Query API - Execute SQL queries on search results
- Models API - List available embedding models
- Datasets API - View dataset configuration and status