Drop a CSV, Excel, JSON, or Parquet file into your Space — even millions of rows — and it becomes a live dataset you can query in plain language. Zark handles table discovery, query generation, and execution automatically.

Quick Start

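import requests

# Ask a plain-language question about an uploaded data file.
response = requests.post(
    "https://api.zarkai.xyz/v1/chat",
    headers={
        "x-api-key": "your_api_key",
        "Content-Type": "application/json",
    },
    json={
        "messages": [
            {"role": "user", "content": "What's the average order value by region?"}
        ],
        "workspace_id": "wks-xxxxx",
        "tools": ["database"],          # enable database queries
        "file_ids": ["file-data123"],   # the uploaded data file(s) to query
        "tool_choice": "auto",
    },
)
response.raise_for_status()  # stop here on an HTTP error instead of parsing an error body

data = response.json()
print(data["response"])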
When you upload a data file, Zark examines its structure, identifies columns and data types, cleans up formatting inconsistencies, and stores it in a format optimized for fast analytical queries.

Supported Data Formats

Format | Extensions | Notes
CSV/TSV | .csv, .tsv | Comma- or tab-separated values
Excel | .xlsx, .xls | Multiple sheets supported; Zark determines how to handle the structure automatically
JSON | .json | Structured or nested data
Parquet | .parquet | Columnar format common in data engineering

What You Can Ask

Once your data is uploaded, ask analytical questions in plain language:
  • How many rows are in this data?
  • Show me the top 10 customers by total spend
  • Which products had declining sales month over month?
  • Compare Q3 performance to Q2
  • What's the revenue breakdown by product category?
  • Find duplicate entries in the email column
Zark translates your question into a structured query, runs it against your data, and returns results — often with tables, charts, or summaries.
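To make the translation step concrete, here is a minimal sketch of what a question such as "Show me the top 10 customers by total spend" might look like once turned into a structured query. Zark's actual query language and execution engine are not described here, so the use of SQL, the in-memory `sqlite3` database, and the `orders` table with its `customer` and `amount` columns are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data standing in for an uploaded orders file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 120.0), ("bob", 75.5), ("alice", 30.0), ("carol", 200.0)],
)

# One plausible SQL rendering of "top 10 customers by total spend"
# (an assumption; not necessarily the query Zark generates).
query = """
    SELECT customer, SUM(amount) AS total_spend
    FROM orders
    GROUP BY customer
    ORDER BY total_spend DESC
    LIMIT 10
"""
top_customers = conn.execute(query).fetchall()
print(top_customers)
```

The point of the sketch is the shape of the work: the question maps to a grouping, an aggregate, an ordering, and a limit, none of which you have to write yourself.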

Cross-File Analysis

Upload multiple related files and ask questions that span all of them:
{
  "messages": [
    {"role": "user", "content": "How do our actual Q3 results compare to the projections in the plan?"}
  ],
  "workspace_id": "wks-xxxxx",
  "tools": ["database"],
  "file_ids": ["file-results123", "file-plan456"],
  "tool_choice": "auto"
}
This also works across file types — combining structured data with document content.
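If you send many cross-file questions, it can help to build the request body in one place. The helper below is a sketch based only on the fields shown in the example above; `build_chat_payload` is a hypothetical name, and rejecting an empty `file_ids` list is a choice made for this sketch rather than a documented API rule.

```python
def build_chat_payload(question, workspace_id, file_ids):
    """Build a /v1/chat request body for a database query over one or more files."""
    if not file_ids:
        # Assumption for this sketch: require at least one file to query.
        raise ValueError("at least one file ID is required")
    return {
        "messages": [{"role": "user", "content": question}],
        "workspace_id": workspace_id,
        "tools": ["database"],
        "file_ids": list(file_ids),
        "tool_choice": "auto",
    }


payload = build_chat_payload(
    "How do our actual Q3 results compare to the projections in the plan?",
    "wks-xxxxx",
    ["file-results123", "file-plan456"],
)
print(payload["file_ids"])
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, exactly as in the Quick Start.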

Handling Messy Data

Real-world data is rarely perfect. Zark handles common issues automatically:
  • Inconsistent formatting across rows
  • European number formats (commas as decimal separators)
  • Currency symbols mixed into numeric columns
  • Header rows that aren’t in the expected position
  • Missing values and blanks
To check data quality before analysis:
{
  "messages": [
    {"role": "user", "content": "Are there any data quality issues I should know about?"}
  ],
  "workspace_id": "wks-xxxxx",
  "tools": ["database"],
  "file_ids": ["file-data123"],
  "tool_choice": "auto"
}
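For a sense of what cleanup such as European number formats and embedded currency symbols involves, the sketch below normalizes a single numeric string. It is not Zark's implementation; `parse_messy_number` is a hypothetical helper, and its rule (whichever of `,` or `.` appears last is the decimal separator) is a simple heuristic that will misread an ambiguous value such as `1,234`.

```python
import re


def parse_messy_number(raw):
    """Parse strings like '€1.234,56' or '$1,234.56' into a float.

    Returns None for blanks and for values that cannot be parsed.
    """
    # Keep only digits, separators, and a minus sign (drops currency symbols, spaces).
    text = re.sub(r"[^\d,.\-]", "", raw or "")
    if not text:
        return None
    if text.rfind(",") > text.rfind("."):
        # Comma comes last: treat it as the decimal separator (European style).
        text = text.replace(".", "").replace(",", ".")
    else:
        # Dot comes last (or no comma at all): commas are thousands separators.
        text = text.replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None


print(parse_messy_number("€1.234,56"))
print(parse_messy_number("$1,234.56"))
print(parse_messy_number(""))
```

A real pipeline would typically decide the number format per column rather than per cell, which resolves most of the ambiguous cases.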

Streaming

Python
import requests
import json

response = requests.post(
    "https://api.zarkai.xyz/v1/chat",
    headers={"x-api-key": "your_api_key", "Content-Type": "application/json"},
    json={
        "messages": [{"role": "user", "content": "Top 10 customers by revenue"}],
        "workspace_id": "wks-xxxxx",
        "tools": ["database"],
        "file_ids": ["file-data123"],
        "stream": True,
    },
    stream=True,
)

for line in response.iter_lines():
    if line:
        event = json.loads(line.decode("utf-8").removeprefix("data: "))
        if event["type"] == "table":
            print(f"Table: {event['display_name']} ({event['total_rows']} rows)")
        elif event["type"] == "ai_chunk":
            print(event["content"], end="", flush=True)
        elif event["type"] == "ai_complete":
            print("\n\nDone.")

Streaming Events

Event | Description
table | Table data with rows, columns, and metadata
ai_chunk | Incremental text analysis of the results
ai_complete | Final response with analysis
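The event types above can be handled without a live connection, which is useful for testing. The sketch below assumes each event arrives as a `data: `-prefixed JSON line, as the `removeprefix` call in the streaming example suggests; `parse_stream_line` and `collect_text` are hypothetical helpers, and the sample events contain only the fields used in that example.

```python
import json


def parse_stream_line(line):
    """Decode one raw stream line into an event dict, or None if it carries no event."""
    text = line.decode("utf-8").strip()
    if not text.startswith("data:"):
        return None  # blank keep-alive lines or anything that is not a data line
    try:
        event = json.loads(text[len("data:"):].strip())
    except json.JSONDecodeError:
        return None
    return event if isinstance(event, dict) else None


def collect_text(lines):
    """Concatenate ai_chunk content until ai_complete is seen."""
    parts = []
    for line in lines:
        event = parse_stream_line(line)
        if event is None:
            continue
        if event.get("type") == "ai_chunk":
            parts.append(event.get("content", ""))
        elif event.get("type") == "ai_complete":
            break
    return "".join(parts)


# Made-up sample events for illustration only.
sample = [
    b'data: {"type": "table", "display_name": "orders", "total_rows": 3}',
    b"",
    b'data: {"type": "ai_chunk", "content": "Top customer: "}',
    b'data: {"type": "ai_chunk", "content": "carol"}',
    b'data: {"type": "ai_complete"}',
]
text = collect_text(sample)
print(text)
```

In live use, `collect_text(response.iter_lines())` would consume the same stream as the loop in the streaming example.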

When to Use Database Queries vs Other Tools

Want to… | Use
Query, filter, or aggregate data | Database Queries
Add/remove columns, merge tables, edit rows | Table Workshop
Run statistical tests or simulations | Code Runner
Read the raw file contents | File Manager
Database Queries is read-only — it retrieves and analyzes data but does not modify tables.

Examples

  • Show me all orders over $1,000 from last month
  • What's the revenue breakdown by product category?
  • Which customer has the highest lifetime value?
  • Calculate the month-over-month growth rate
  • Find duplicate entries in the email column
  • How do our actual Q3 results compare to the projections?