Supported Document Types
Text-based documents are read and indexed so you can ask questions about their content. Zark AI supports the following document formats:

| Format | File Extensions | Extraction Details |
|---|---|---|
| PDF | .pdf | Handles both text-based and scanned PDFs, using optical character recognition (OCR) for image-based documents. Extracts tables and makes them queryable. Multi-page documents are processed in full, with page numbers preserved. |
| Microsoft Word | .docx, .doc | Modern (.docx) and legacy (.doc) formats. Full text extraction with formatting preserved. |
| PowerPoint | .pptx, .ppt | Extracts text from all slides, including speaker notes, text boxes, and content within shapes. Tables in presentations are captured as structured data. |
| Plain Text | .txt | Simple text files with full content extraction and indexing. |
| Markdown | .md | Markdown-formatted documents with structure and formatting preserved. |
| HTML | .html, .htm | Web pages and HTML documents. Extracts text content and preserves structure. |
| Rich Text Format | .rtf | RTF documents with formatting and structure maintained. |
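
If you prepare uploads in bulk, the table above can be turned into a quick pre-check. The Python sketch below is only an illustration: `SUPPORTED_FORMATS` and `document_format` are hypothetical names, not part of any Zark AI API, and the mapping simply restates the extensions listed in the table.

```python
from pathlib import Path
from typing import Optional

# Extension-to-format mapping copied from the table above.
# Hypothetical helper data, not an official Zark AI constant.
SUPPORTED_FORMATS = {
    ".pdf": "PDF",
    ".docx": "Microsoft Word",
    ".doc": "Microsoft Word",
    ".pptx": "PowerPoint",
    ".ppt": "PowerPoint",
    ".txt": "Plain Text",
    ".md": "Markdown",
    ".html": "HTML",
    ".htm": "HTML",
    ".rtf": "Rich Text Format",
}


def document_format(filename: str) -> Optional[str]:
    """Return the format name for a listed extension, or None if it is not in the table."""
    # Only the final extension is checked, case-insensitively.
    return SUPPORTED_FORMATS.get(Path(filename).suffix.lower())
```

A result of `None` only means the extension is not one of the text-based document formats in this table; the file may still be handled elsewhere, for example as a data file.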

What You Can Ask
Once a document is processed, you might ask:

- “What are the main points in this contract?”
- “Find any mentions of pricing or payment terms”
- “Summarize the executive summary”
- “What action items were mentioned in these meeting notes?”
- “Does this policy document mention remote work?”

Zark AI searches by meaning rather than by exact keywords. For example, ask “What does this contract say about liability limitations?” and it finds the relevant sections even if the word “liability” never appears, because it understands the concept you’re asking about.

Cross-Document Analysis
Zark AI can compare and synthesize information across multiple text-based documents, focusing on meaning, language, consistency, and intent rather than raw data. This capability also works with data files, so you can combine structured data with document content.

Use cross-document analysis to:

- Identify inconsistencies or conflicts
- Compare terms, clauses, or language
- Extract shared themes or differences across documents such as contracts, proposals, reports, or policies