Documentation Index
Fetch the complete documentation index at: https://mintlify.com/spiceai/spiceai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Spice is a SQL query, search, and LLM-inference engine written in Rust for data apps and agents. It provides a lightweight, portable runtime (single binary/container) that combines data query and AI inference in a unified engine.System Architecture
Core Components
Runtime Daemon (spiced)
The Spice runtime is a long-running daemon process that:- Manages data connections and accelerations
- Executes SQL queries via Apache DataFusion
- Serves multiple API endpoints (HTTP, Arrow Flight, ODBC, JDBC)
- Performs AI/LLM inference with OpenAI-compatible APIs
- Provides search capabilities (vector, keyword, full-text)
CLI (spice)
The command-line interface for:- Initializing Spicepods (
spice init) - Managing the runtime (
spice run) - Querying data (
spice sql) - Configuring datasets and models
Industry-Standard APIs
Spice provides four primary API surfaces:1. SQL Query & Search APIs
- HTTP/REST: JSON query endpoint
- Arrow Flight: High-performance binary protocol
- Arrow Flight SQL: SQL over Arrow Flight
- ODBC/JDBC/ADBC: Standard database connectivity
- UDTFs:
vector_search()andtext_search()table functions
2. OpenAI-Compatible APIs
- Chat completions endpoint (
/v1/chat/completions) - Embeddings endpoint (
/v1/embeddings) - Model listing (
/v1/models) - Compatible with OpenAI SDK
3. Iceberg Catalog REST API
Unified catalog interface for:- Table metadata
- Schema evolution
- Time travel queries
4. MCP (Model Context Protocol)
HTTP + Server-Sent Events (SSE) integration for:- Tool/function calling
- External integrations
- Agent workflows
Design Principles
1. Data Correctness is Non-Negotiable
As an AI-native database and search engine, data correctness is the foundational principle. Every query must return correct results. Incorrect data is unacceptable under any circumstances.- Correctness supersedes performance, developer experience, and feature velocity
- Slow correct answers are infinitely better than fast wrong ones
- NULL handling, type coercions, and boundary conditions must be precise
- Errors are surfaced visibly rather than silently corrupting data
2. Secure by Default
Spice defaults to secure configurations:- TLS/SSL encrypted connections to remote sources
- Optional insecure mode when needed
- No credentials stored in plain text
3. Developer Experience First
The goal is to make creating intelligent applications as easy as possible:- Simple YAML configuration (Spicepod)
- Single binary deployment
- Standard SQL interface
- Extensive connector ecosystem
4. Bring Data and AI to Your Application
Instead of sending data to another service:- Co-locate data with applications
- Run at the application level (1:1 or 1:N mapping)
- Deploy as sidecar, microservice, or cluster
- Enable AI feedback loops locally
5. API First
All functionality is available through HTTP APIs on the runtime.6. Composable from Community Components
Spice projects consist of:- Datasets from various sources
- Models from HuggingFace, OpenAI, local files
- Community-built components via Spicerack registry
7. Extensibility First-Class
All components use well-defined interfaces:- Data Connectors
- Data Accelerators
- Catalog Connectors
- Secret Stores
- Models and Embeddings
Application-Focused Deployment
Traditional Databases vs. Spice
| Traditional Databases | Spice |
|---|---|
| Many apps → One database | One app → One Spice instance |
| Centralized, shared | Distributed, app-level |
| Shared schema | App-specific schema |
| Network latency | Local/co-located |
Deployment Patterns
Sidecar Pattern (Common) Each application runs its own Spice instance for:- Zero network latency
- Isolated acceleration
- Independent scaling
- Per-tenant deployments
Technology Stack
Spice is built on industry-leading open-source technologies:- Apache DataFusion: Query planning and execution engine
- Apache Arrow: Columnar memory format and compute kernels
- Arrow Flight: High-performance data transport
- DuckDB: Embedded OLAP engine for acceleration
- SQLite: Embedded OLTP engine for acceleration
- Vortex: Compressed columnar format (Cayenne accelerator)
- Tantivy: Full-text search engine (Rust)
- Rust: Systems programming language for performance and safety
Disaggregated Storage Architecture
Spice separates compute from storage:Key Benefits
- Co-locate working sets with applications for low latency
- Access source data in original storage without migration
- Independent scaling of compute and storage
- Fast cold starts via acceleration snapshots from S3
- Ephemeral compute with persistent recovery
Separate Tokio Runtimes
Spice uses isolated async runtimes for different concerns:- HTTP Server Runtime: Health checks, API endpoints (must stay responsive)
- Query Processing Runtime: DataFusion planning and execution (CPU/IO intensive)
Next Steps
Data Federation
Query across databases, warehouses, and lakes
Data Acceleration
Materialize and cache data locally
Search
Vector, keyword, and full-text search
AI Inference
LLM inference and embeddings