Media Processor is the heavy-duty transcription and indexing pipeline for audio and video files. It runs compute-intensive processing jobs: speech-to-text transcription, visual scene analysis, key moment extraction, and segment indexing.

Quick Start

import requests

response = requests.post(
    "https://api.zarkai.xyz/v1/chat",
    headers={
        "x-api-key": "your_api_key",
        "Content-Type": "application/json",
    },
    json={
        "messages": [
            {"role": "user", "content": "Process this video — full transcription and visual analysis"}
        ],
        "workspace_id": "wks-xxxxx",
        "tools": ["media_processor"],
        "file_ids": ["file-vid123"],
        "tool_choice": "auto",
    },
)

# Fail fast on HTTP 4xx/5xx instead of trying to parse an error body.
response.raise_for_status()

data = response.json()
print(data["response"])

When to Use Media Processor vs File Manager

| Want to… | Use |
| --- | --- |
| Ask what’s in a video | File Manager |
| List or search your videos | File Manager |
| Start transcription on a video | Media Processor |
| Check indexing progress | Media Processor |
| Process a specific time range | Media Processor |
| Upgrade transcript with visual descriptions | Media Processor |

File Manager handles viewing and questions. Media Processor handles processing pipelines.

Operations

| Operation | Example prompt |
| --- | --- |
| Full processing | “Process this video” |
| Transcription only | “Transcribe this audio file” |
| Check status | “How much of this video is indexed?” |
| Resume processing | “Continue processing from where you left off” |
| Time range | “Process from 15:00 to 30:00” |
| Visual upgrade | “Upgrade this file with visual scene descriptions” |
| Re-analyze | “Re-analyze this video focusing on the product demos” |

Partial Processing

Process a specific segment of a long video:
{
  "messages": [
    {"role": "user", "content": "Process from 15:00 to 30:00"}
  ],
  "workspace_id": "wks-xxxxx",
  "tools": ["media_processor"],
  "file_ids": ["file-vid123"],
  "tool_choice": "auto"
}
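
When time ranges come from code rather than from a person typing, a small helper can keep the prompt wording consistent. The following is a minimal sketch, not part of the API: `format_timestamp` and `build_range_request` are hypothetical helper names, and the `MM:SS` / `H:MM:SS` rendering simply mirrors the example prompt above. The service may accept other phrasings.

```python
def format_timestamp(seconds: int) -> str:
    """Render a non-negative offset in seconds as MM:SS, or H:MM:SS past one hour."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def build_range_request(workspace_id: str, file_id: str,
                        start_s: int, end_s: int) -> dict:
    """Build a request body shaped like the partial-processing example (hypothetical helper)."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    prompt = f"Process from {format_timestamp(start_s)} to {format_timestamp(end_s)}"
    return {
        "messages": [{"role": "user", "content": prompt}],
        "workspace_id": workspace_id,
        "tools": ["media_processor"],
        "file_ids": [file_id],
        "tool_choice": "auto",
    }
```

Calling `build_range_request("wks-xxxxx", "file-vid123", 900, 1800)` yields the same body as the JSON example, with the prompt "Process from 15:00 to 30:00".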

Check Progress

{
  "messages": [
    {"role": "user", "content": "How much of this video has been indexed?"}
  ],
  "workspace_id": "wks-xxxxx",
  "tools": ["media_processor"],
  "file_ids": ["file-vid123"],
  "tool_choice": "auto"
}
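
Long jobs may need repeated status checks. The sketch below shows one way to poll with the status prompt, under stated assumptions: the page does not document a machine-readable progress field, so the completion check is a caller-supplied predicate over the reply text. The names `poll_until`, `send`, and `is_done` are hypothetical, as are the default interval and attempt count. In practice `send` would wrap the `requests.post` call from Quick Start and return `data["response"]`.

```python
import time
from typing import Callable


def poll_until(send: Callable[[dict], str],
               payload: dict,
               is_done: Callable[[str], bool],
               interval_s: float = 30.0,
               max_attempts: int = 20,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Resend the status request until is_done(reply) is true or attempts run out."""
    for attempt in range(max_attempts):
        reply = send(payload)
        if is_done(reply):
            return reply
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"not done after {max_attempts} status checks")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The predicate is deliberately left to the caller because the reply is free text.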

What Gets Extracted

| Data type | Description |
| --- | --- |
| Transcripts | Every spoken word with timestamps. Speaker attribution when possible. |
| Key moments | Decisions, action items, key statements, highlights, introductions, conclusions |
| Topics & entities | People, companies, products, and themes identified and indexed |
| Visual context | Slides, on-screen text, charts, diagrams, product demos, scene changes (video only) |

Once processed, all content becomes searchable through File Manager.

Supported Formats

| Type | Formats |
| --- | --- |
| Video | MP4, MOV, AVI, MKV, WebM |
| Audio | MP3, WAV, M4A, FLAC, OGG |
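
A client-side check against the list above can catch unsupported files before a request is sent. This is a rough sketch that assumes the file extension reflects the actual format, which the service may or may not rely on. `media_kind` is a hypothetical helper name, not part of the API.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported-formats list above.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def media_kind(filename: str) -> Optional[str]:
    """Return "video", "audio", or None based on the (case-insensitive) extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return None
```

Because visual context is extracted from video only, the returned kind can also guide whether a visual-upgrade prompt makes sense for a given file.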

Streaming

Python
import requests
import json

response = requests.post(
    "https://api.zarkai.xyz/v1/chat",
    headers={"x-api-key": "your_api_key", "Content-Type": "application/json"},
    json={
        "messages": [{"role": "user", "content": "Process this video"}],
        "workspace_id": "wks-xxxxx",
        "tools": ["media_processor"],
        "file_ids": ["file-vid123"],
        "stream": True,
    },
    stream=True,
)

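for line in response.iter_lines():
    if not line:
        continue  # skip blank keep-alive lines
    # str.removeprefix requires Python 3.9 or later.
    payload = line.decode("utf-8").removeprefix("data: ")
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        continue  # skip lines that are not JSON events
    if event["type"] == "ai_chunk":
        # Print incremental text as it arrives, without a trailing newline.
        print(event["content"], end="", flush=True)
    elif event["type"] == "ai_complete":
        print("\n\nDone.")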

Streaming Events

| Event | Description |
| --- | --- |
| ai_chunk | Processing summary and extracted content |
| ai_complete | Final response with processing results |
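
Separating event parsing from the network call makes the stream handling testable offline. The sketch below assumes each event is a JSON object on its own line, optionally prefixed with `data:`, carrying the `type` and `content` fields used in the streaming example. `parse_event_line` and `collect_text` are hypothetical helper names, and any other line shape is simply ignored.

```python
import json
from typing import Iterable, Optional


def parse_event_line(raw: bytes) -> Optional[dict]:
    """Decode one stream line into an event dict, or None if it is not a JSON object."""
    text = raw.decode("utf-8").strip()
    if text.startswith("data:"):
        text = text[len("data:"):].strip()
    try:
        event = json.loads(text)
    except json.JSONDecodeError:
        return None  # blank lines, comments, or non-JSON markers
    return event if isinstance(event, dict) else None


def collect_text(lines: Iterable[bytes]) -> str:
    """Concatenate ai_chunk content until ai_complete arrives or the stream ends."""
    parts = []
    for raw in lines:
        event = parse_event_line(raw)
        if event is None:
            continue
        if event.get("type") == "ai_chunk":
            parts.append(event.get("content", ""))
        elif event.get("type") == "ai_complete":
            break
    return "".join(parts)
```

With a live request, `collect_text(response.iter_lines())` would return the full streamed text once the completion event arrives.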

Examples

Process this video
Transcribe this audio file
How much of this video is indexed?
Continue processing from where you left off
Process from 15:00 to 30:00
Upgrade this file with visual descriptions