Document Ingestion
Upload and process your documents with an OpenAI-powered pipeline. Supports PDF, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, images (PNG/JPG/GIF/BMP/TIFF/WEBP), and audio files (MP3/WAV/M4A/MP4/WebM).
1. Select Files
Upload documents, images, or audio
Upload Your Documents
Upload documents, images, or audio files to create your vector database
Drag & drop files here, or click to browse
Supports PDF, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, images (PNG/JPG/GIF, etc.), and audio (MP3/WAV/M4A, etc.). Maximum 50 MB per file.
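The upload limits above can be checked before a file is sent. The following is a minimal sketch, not the application's actual validation: `check_upload`, `ALLOWED_EXTENSIONS`, and `MAX_BYTES` are hypothetical names, the `.jpeg` alias is an assumption, and 50 MB is assumed to mean 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Assumed allow-list built from the formats named on this page.
# ".jpeg" is an assumed alias for JPG.
ALLOWED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".txt", ".md", ".html", ".csv",
    ".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp",
    ".mp3", ".wav", ".m4a", ".mp4", ".webm",
}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB per file (assumes binary megabytes)


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file is acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 50 MB limit")
    return problems
```

Returning a list rather than raising on the first failure lets a UI show every problem with a file at once.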
2. Configure
Customize processing options
Pinecone Configuration
Index name: lowercase letters, numbers, and hyphens only
Namespace: organize data by namespace (e.g., "public", "internal", "confidential")
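The index-name rule and the namespace field can be validated on the client before any request is made. This is a sketch under stated assumptions: `is_valid_index_name` and `normalize_namespace` are hypothetical helpers, the pattern encodes only the rule stated on this page (Pinecone may enforce further limits, such as a maximum length), and falling back to `default` for a blank namespace is an assumption based on the summary panel.

```python
import re

# Only the rule stated on this page: lowercase letters, numbers, and hyphens.
INDEX_NAME_RE = re.compile(r"[a-z0-9-]+")


def is_valid_index_name(name: str) -> bool:
    """True when the whole name matches the stated character rule."""
    return INDEX_NAME_RE.fullmatch(name) is not None


def normalize_namespace(namespace: str) -> str:
    """Trim whitespace and fall back to 'default' when blank (assumed behavior)."""
    cleaned = namespace.strip()
    return cleaned or "default"
```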
OpenAI API Key (Required)
The enhanced pipeline uses text-embedding-3-large (3072 dimensions) for embeddings and GPT-4 for responses.
💡 Pro Tip: The API-powered pipeline splits text into token-aware chunks of roughly 1000 characters, uses GPT-4o Vision for images, and uses Whisper for audio. Embeddings and AI generation are handled automatically.
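To illustrate what chunks of roughly 1000 characters might look like, here is a simplified sketch. It is not the pipeline's chunker: whitespace-separated words stand in for real tokenizer tokens, and `chunk_text` and `target_chars` are hypothetical names. The idea is only that a chunk boundary never falls in the middle of a unit.

```python
def chunk_text(text: str, target_chars: int = 1000) -> list[str]:
    """Pack whole words into chunks of at most target_chars characters.

    A single word longer than target_chars becomes its own oversized chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk is non-empty.
        added = len(word) + (1 if current else 0)
        if current and current_len + added > target_chars:
            chunks.append(" ".join(current))
            current, current_len = [], 0
            added = len(word)
        current.append(word)
        current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A production chunker would likely count model tokens instead of characters and might overlap adjacent chunks; neither is shown here.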
Enhanced Pipeline Options
⚠️ Advanced Options
WARNING: Enabling this option deletes the existing index and all of its data before ingesting.
⚡ API-Powered Pipeline: All processing uses OpenAI APIs (GPT-4o Vision for images, Whisper for audio, text-embedding-3-large for text). Hybrid search with sparse vectors is included.
🚀 API-Powered Pipeline Summary
- Index: Not set
- Namespace: default
- Embeddings: OpenAI text-embedding-3-large (3072 dimensions)
- Chunk size: ~1000 characters (token-aware)
- Images: ✅ GPT-4o Vision API (always enabled)
- Audio: ✅ Whisper API transcription (always enabled)
- Hybrid search: ✅ Dense + sparse vectors (BM25-style)
- Architecture: 100% API-based, zero GPU dependencies
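The "BM25-style" sparse vectors in the summary can be pictured with the sketch below. It is an illustration under assumptions, not the pipeline's encoder: `sparse_vector` is a hypothetical function, terms are mapped to indices with a CRC32 hash, only the BM25 term-frequency saturation is applied (no IDF, which needs corpus statistics), and `avg_doc_len` is an assumed constant. The result uses the `indices`/`values` shape that Pinecone accepts for sparse values.

```python
import re
import zlib
from collections import Counter


def sparse_vector(text: str, avg_doc_len: float = 200.0,
                  k1: float = 1.2, b: float = 0.75) -> dict[str, list]:
    """Build a hashed sparse vector with BM25-style term-frequency weights."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    # Length normalization term from the BM25 formula.
    norm = k1 * (1 - b + b * len(tokens) / avg_doc_len)
    weights: dict[int, float] = {}
    for term, tf in counts.items():
        index = zlib.crc32(term.encode("utf-8"))  # stable 32-bit term id
        weight = tf * (k1 + 1) / (tf + norm)      # saturating TF weight
        # Sum weights if two terms happen to hash to the same index.
        weights[index] = weights.get(index, 0.0) + weight
    indices = sorted(weights)
    return {"indices": indices, "values": [weights[i] for i in indices]}
```

At query time, the dense embedding and a sparse vector like this one would be sent together so that exact keyword matches can complement semantic similarity.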
Select files above to continue
Multimodal Support
Process PDFs, Word docs, images (PNG/JPG), and audio files (MP3/WAV) with GPT-4o Vision and Whisper
API-Powered Reliability
100% OpenAI API processing ensures consistent, reliable results without local model complexity
Production Ready
Retry logic, error handling, parallel processing, and comprehensive progress tracking
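As a rough picture of the retry logic mentioned above, the sketch below wraps any API call in exponential backoff. It is an assumption about the general technique, not the pipeline's implementation: `with_retries`, `attempts`, and `base_delay` are hypothetical names, and a real version would likely retry only on transient errors such as rate limits or timeouts.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(); on failure wait base_delay * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be exercised in tests without actually waiting.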