
Spice’s NSQL (Natural Language to SQL) API converts natural language questions into SQL queries and optionally executes them against your data. This enables users to query data using plain English without writing SQL.

Endpoint

POST /v1/nsql

Authentication

Include your Spice API key in the request headers:
Authorization: Bearer <your-api-key>

Request Parameters

Parameter            Type     Required  Default       Description
query                string   Yes       -             The natural language query to convert to SQL
model                string   No        "nql"         The model to use for SQL generation
stream               boolean  No        false         Whether to stream the response as Server-Sent Events
sample_data_enabled  boolean  No        true          Include sample data in the context for better SQL generation
datasets             array    No        All datasets  Specific datasets to sample from (does not restrict query targets)

Request Body

{
  "query": "Get the top 5 customers by total sales",
  "model": "nql",
  "stream": false,
  "sample_data_enabled": true,
  "datasets": ["sales_data"]
}

Response Formats

The response format depends on the Accept header:
  • application/json (default) - Query results as JSON array
  • application/vnd.spiceai.nsql.v1+json - Extended response with schema and SQL
  • application/sql - Only the generated SQL query (does not execute)
  • text/csv - Results in CSV format
  • text/plain - Plain text format
  • text/event-stream - Streaming response (when stream: true)
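
The content types above can be kept in a small client-side lookup table. The mapping below is a convenience sketch; the dictionary and helper function are illustrative and not part of the Spice API:

```python
# Illustrative mapping from a desired output format to the Accept header
# values documented above (the dictionary itself is not part of the API).
ACCEPT_HEADERS = {
    "json": "application/json",                          # default: results as JSON array
    "extended": "application/vnd.spiceai.nsql.v1+json",  # schema + SQL + data
    "sql": "application/sql",                            # generated SQL only, not executed
    "csv": "text/csv",
    "text": "text/plain",
    "stream": "text/event-stream",                       # with "stream": true in the body
}

def accept_for(fmt: str) -> str:
    """Return the Accept header for a named format, defaulting to JSON."""
    return ACCEPT_HEADERS.get(fmt, "application/json")
```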

Standard JSON Response (application/json)

[
  {
    "customer_id": "12345",
    "total_sales": 150000
  },
  {
    "customer_id": "67890",
    "total_sales": 125000
  }
]

Extended Response (application/vnd.spiceai.nsql.v1+json)

{
  "row_count": 2,
  "schema": {
    "fields": [
      {
        "name": "customer_id",
        "data_type": "String",
        "nullable": false,
        "dict_id": 0,
        "dict_is_ordered": false
      },
      {
        "name": "total_sales",
        "data_type": "Int64",
        "nullable": false,
        "dict_id": 0,
        "dict_is_ordered": false
      }
    ]
  },
  "data": [
    {
      "customer_id": "12345",
      "total_sales": 150000
    },
    {
      "customer_id": "67890",
      "total_sales": 125000
    }
  ],
  "sql": "SELECT customer_id, SUM(total_sales) AS total_sales\nFROM sales_data\nGROUP BY customer_id\nORDER BY total_sales DESC\nLIMIT 5"
}

SQL Only Response (application/sql)

SELECT customer_id, SUM(total_sales) AS total_sales
FROM sales_data
GROUP BY customer_id
ORDER BY total_sales DESC
LIMIT 5

Examples

Execute Query and Get Results

cURL

curl -X POST http://localhost:8090/v1/nsql \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "query": "Show me the top 10 products by revenue this month",
    "model": "nql",
    "sample_data_enabled": true
  }'

Python

import requests

url = "http://localhost:8090/v1/nsql"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your-api-key>"
}
payload = {
    "query": "What are the average order values by region?",
    "model": "nql",
    "sample_data_enabled": True,
    "datasets": ["orders", "customers"]
}

response = requests.post(url, json=payload, headers=headers)
results = response.json()

for row in results:
    print(row)

Node.js

const axios = require('axios');

const url = 'http://localhost:8090/v1/nsql';
const headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer <your-api-key>'
};
const data = {
  query: 'List customers who made purchases in the last 30 days',
  model: 'nql',
  sample_data_enabled: true
};

axios.post(url, data, { headers })
  .then(response => {
    console.log('Results:', response.data);
  })
  .catch(error => {
    // error.response is undefined for network-level failures
    console.error('Error:', error.response ? error.response.data : error.message);
  });

Get SQL Without Executing

cURL

curl -X POST http://localhost:8090/v1/nsql \
  -H "Content-Type: application/json" \
  -H "Accept: application/sql" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "query": "Find the total revenue by product category"
  }'

Python

import requests

url = "http://localhost:8090/v1/nsql"
headers = {
    "Content-Type": "application/json",
    "Accept": "application/sql",
    "Authorization": "Bearer <your-api-key>"
}
payload = {
    "query": "Which products have low inventory?"
}

response = requests.post(url, json=payload, headers=headers)
sql_query = response.text

print("Generated SQL:")
print(sql_query)

Get Extended Response with Schema

cURL

curl -X POST http://localhost:8090/v1/nsql \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.spiceai.nsql.v1+json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "query": "Show sales trends over the last quarter"
  }'

Python

import requests

url = "http://localhost:8090/v1/nsql"
headers = {
    "Content-Type": "application/json",
    "Accept": "application/vnd.spiceai.nsql.v1+json",
    "Authorization": "Bearer <your-api-key>"
}
payload = {
    "query": "What is the customer churn rate by segment?"
}

response = requests.post(url, json=payload, headers=headers)
result = response.json()

print(f"Row count: {result['row_count']}")
print(f"SQL: {result['sql']}")
print(f"Data: {result['data']}")

Streaming Response

Python

import requests
import json

url = "http://localhost:8090/v1/nsql"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your-api-key>"
}
payload = {
    "query": "Show all active users",
    "stream": True
}

response = requests.post(url, json=payload, headers=headers, stream=True)

for line in response.iter_lines():
    if line:
        data_str = line.decode('utf-8')
        # SSE event lines are prefixed with 'data: '; strip only the prefix
        if data_str.startswith('data: '):
            data_str = data_str[len('data: '):]
        try:
            data = json.loads(data_str)
            print(data)
        except json.JSONDecodeError:
            pass

How NSQL Works

  1. Schema Analysis: The model receives table schemas for available datasets
  2. Sample Data: If enabled, sample data is included to understand data patterns
  3. SQL Generation: The LLM generates SQL based on the natural language query
  4. Validation: The generated SQL is cleaned and validated
  5. Execution: The SQL is executed against your data (unless SQL-only mode)
  6. Retry Logic: If execution fails, the model retries with error context (up to 10 attempts)
  7. Results: Query results are returned in the requested format

Sample Data Context

When sample_data_enabled: true, the model receives:
  • Distinct column values: up to 3 distinct values per column, for a limited number of columns
  • Random samples: 3 random rows from each dataset
This helps the model:
  • Understand data types and formats
  • Generate accurate filters and conditions
  • Use correct column names and values
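
As a rough illustration, the sampling described above could be expressed as queries like the following. The function and the exact query shapes are assumptions for illustration; the queries Spice actually runs are internal to the runtime:

```python
# Hypothetical sketch of the sampling queries the runtime might issue when
# sample_data_enabled is true; the real queries are internal to Spice.
def sampling_queries(table: str, columns: list[str]) -> list[str]:
    queries = []
    for col in columns:
        # up to 3 distinct values per column
        queries.append(f'SELECT DISTINCT "{col}" FROM {table} LIMIT 3')
    # 3 random rows from the dataset
    queries.append(f"SELECT * FROM {table} ORDER BY RANDOM() LIMIT 3")
    return queries

for q in sampling_queries("sales_data", ["customer_id", "total_sales"]):
    print(q)
```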

Dataset Filtering

The datasets parameter is a sampling hint, not a query restriction:
{
  "query": "Show customer purchases",
  "datasets": ["customers", "orders"]
}
This tells NSQL to sample from customers and orders tables, but the generated query can still reference other tables if needed.

Model Configuration

NSQL models must be configured in your Spicepod with dataset access:
models:
  - name: nql
    from: openai:gpt-4
    datasets:
      - sales_data
      - customers
      - products
The datasets field restricts which tables the model can query. If empty, all tables are accessible.

Error Responses

Model Not Found (400)

Model custom_model not found

Dataset Not Found (400)

Dataset 'unknown_table' not found

No Query Generated (500)

No query produced from NSQL model

Query Execution Error (400)

If the query fails after retries, the error from the last attempt is returned:
Error executing query: column 'invalid_column' does not exist
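
A client can branch on the status codes above. The helper below is an illustrative sketch; the function name and category labels are my own, based only on the errors listed in this section:

```python
def classify_nsql_error(status_code: int, body: str) -> str:
    """Map an NSQL error response to a rough category.

    The category strings are illustrative labels for the errors documented
    above; they are not returned by the API itself.
    """
    if status_code == 400:
        if "not found" in body:
            return "missing model or dataset"
        return "query execution error"
    if status_code == 500:
        return "no query produced"
    return "unexpected error"
```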

Best Practices

  1. Enable sample data: Improves SQL generation accuracy for complex queries
  2. Specify datasets: Limit context to relevant tables for better performance
  3. Use descriptive queries: “Show top 10 customers by revenue in 2024” is better than “top customers”
  4. Configure model datasets: Restrict table access in the model configuration for security
  5. Handle errors: NSQL may fail if the query is ambiguous or data doesn’t match expectations
  6. Review generated SQL: Use Accept: application/sql to review queries before execution
  7. Start simple: Test with simple queries before attempting complex aggregations
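
Practice 6 (review generated SQL before execution) can be scripted as a two-step flow. The helper below only assembles request arguments; the function name and structure are illustrative, not part of the Spice client libraries:

```python
# Sketch of a review-first workflow: first fetch the generated SQL
# (Accept: application/sql), then execute the same query once approved.
def review_first_request(query: str, api_key: str, execute: bool = False) -> dict:
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if not execute:
        # Accept: application/sql returns the SQL without running it
        headers["Accept"] = "application/sql"
    return {
        "url": "http://localhost:8090/v1/nsql",
        "headers": headers,
        "json": {"query": query},
    }

# Usage with requests (not run here):
#   sql = requests.post(**review_first_request(q, key)).text                # review
#   rows = requests.post(**review_first_request(q, key, execute=True)).json()
```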

Advanced: Custom NQL Models

You can configure custom models for NSQL:
models:
  - name: custom_nql
    from: openai:gpt-4-turbo
    datasets:
      - specific_dataset
    params:
      temperature: 0.1
      max_tokens: 500
Lower temperature values (0.0-0.3) typically work better for SQL generation tasks.

Streaming for Long-Running Queries

Use streaming for queries that may take time:
curl -X POST http://localhost:8090/v1/nsql \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -N \
  -d '{
    "query": "Analyze all transactions for anomalies",
    "stream": true
  }'
Streaming sends Server-Sent Events as the query executes, preventing timeouts and providing progress feedback.

Retry Mechanism

NSQL automatically retries failed queries up to 10 times, incorporating error feedback:
  1. Generate SQL from natural language
  2. Execute the SQL
  3. If execution fails, provide the error to the model
  4. Model generates corrected SQL
  5. Repeat until success or max retries
This allows the model to self-correct common issues like:
  • Incorrect column names
  • Missing table aliases
  • Invalid aggregations
  • Type mismatches
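
The retry loop above can be sketched as follows. This is a conceptual model of the server-side behavior, not the actual Spice implementation; generate_sql and execute_sql are placeholders for the model call and the query engine:

```python
# Conceptual model of the server-side retry loop described above.
MAX_ATTEMPTS = 10

def nsql_with_retries(question, generate_sql, execute_sql):
    error_context = None
    for _ in range(MAX_ATTEMPTS):
        # On retries, the previous error is fed back so the model can self-correct
        sql = generate_sql(question, error_context)
        try:
            return execute_sql(sql)
        except Exception as err:
            error_context = str(err)
    raise RuntimeError(f"Query failed after {MAX_ATTEMPTS} attempts: {error_context}")
```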