Adds the full RAG indexing layer, a high-performance Rust CLI, cloud destination writers, provider fallback chaining, and observability with Prometheus + OpenTelemetry.
Retrieval-Augmented Generation
New DocumentReader, VectorStoreIndex, SummaryIndex, QueryEngine, and RAGPipeline — compatible with the LlamaIndex API.
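The class names above follow a LlamaIndex-style workflow: build an index over documents, then ask it for a query engine. Here is a minimal, self-contained sketch of that pattern. The bag-of-words embedding and cosine-similarity retrieval are illustrative stand-ins, not the library's actual implementation; only the `VectorStoreIndex`, `insert_nodes()`, and `as_query_engine()` names come from the release notes.

```python
# Sketch of a LlamaIndex-style vector index + query engine.
# Embedding (bag-of-words) and retrieval (cosine similarity) are
# toy stand-ins for whatever the real library uses.
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class VectorStoreIndex:
    def __init__(self, documents: list[str]):
        self.nodes = [(doc, embed(doc)) for doc in documents]

    def insert_nodes(self, documents: list[str]) -> None:
        self.nodes.extend((doc, embed(doc)) for doc in documents)

    def as_query_engine(self, top_k: int = 2) -> "QueryEngine":
        return QueryEngine(self, top_k)


class QueryEngine:
    def __init__(self, index: VectorStoreIndex, top_k: int):
        self.index, self.top_k = index, top_k

    def query(self, question: str) -> list[str]:
        """Return the top_k documents most similar to the question."""
        qvec = embed(question)
        ranked = sorted(self.index.nodes,
                        key=lambda n: cosine(qvec, n[1]),
                        reverse=True)
        return [doc for doc, _ in ranked[: self.top_k]]


index = VectorStoreIndex(["cats chase mice", "stocks fell sharply today"])
engine = index.as_query_engine(top_k=1)
print(engine.query("what do cats chase"))  # → ['cats chase mice']
```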
Rust CLI
Native binary with extract, batch, formats, models, schemas, config, and providers subcommands. Zero Python overhead for batch jobs.
FallbackProvider
Chain two providers transparently — primary runs first, fallback takes over on InferenceError. Configurable via YAML or code.
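The chaining behavior can be sketched in a few lines. The `FallbackProvider` and `InferenceError` names come from the release notes; the single-method `complete()` provider interface and the demo provider classes are assumptions for illustration only.

```python
# Sketch of transparent two-provider fallback chaining.
# Provider interface (.complete) and demo providers are hypothetical.

class InferenceError(Exception):
    pass


class FallbackProvider:
    """Try the primary provider first; on InferenceError, retry the fallback."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback

    def complete(self, prompt: str) -> str:
        try:
            return self.primary.complete(prompt)
        except InferenceError:
            # Any other exception type still propagates unchanged.
            return self.fallback.complete(prompt)


class FlakyProvider:
    def complete(self, prompt: str) -> str:
        raise InferenceError("upstream 503")


class EchoProvider:
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


chain = FallbackProvider(FlakyProvider(), EchoProvider())
print(chain.complete("hello"))  # primary fails → prints "echo: hello"
```

Callers see a single provider object; the failover is invisible unless both providers raise.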
All changes
- Added RAG indexing layer: DocumentReader, VectorStoreIndex, SummaryIndex, QueryEngine, RAGPipeline
- Added LlamaIndex-compatible API — load_and_chunk(), as_query_engine(), insert_nodes()
- Added Rust CLI with extract, batch, formats, models, schemas, config, providers subcommands
- Added FallbackProvider for transparent two-provider chaining on InferenceError
- Added cloud destination writers: Snowflake, BigQuery, MySQL, Redshift
- Added monitoring: Prometheus metrics endpoint + OpenTelemetry tracing exporter
- Improved Chunker — new semantic strategy: heading-aware sentence splitting
- Improved SchemaValidator — multi-step extraction: JSON → code fence → block heuristics fallback
- Improved ModelRegistry — added deepseek-r1-70b and qwen-vl-7b (multimodal) entries
- Improved Pipeline.run_many() — now returns dict keyed by filename for easier iteration
- Fixed OCR parser encoding issue on non-ASCII (CJK, Arabic) documents
- Fixed EmailParser failing on malformed MIME boundaries
- Fixed VLLMProvider not sending system_prompt correctly on chat-template models
- Fixed CSVWriter writing None as string "None" instead of empty field
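The heading-aware semantic chunking strategy mentioned for Chunker can be sketched as: split the text into sections at headings, then group sentences within each section so no chunk crosses a heading boundary. The Markdown heading regex, the naive sentence splitter, and the heading-prefix convention below are all simplifying assumptions, not the library's real strategy.

```python
# Sketch of heading-aware sentence chunking (assumed Markdown headings,
# naive punctuation-based sentence splitting).
import re


def semantic_chunks(text: str, max_sentences: int = 3) -> list[str]:
    """Never merge sentences across a heading; prefix each chunk
    with its section heading for retrieval context."""
    heading_re = re.compile(r"^(#{1,6})\s+(.*)$")
    sections: list[tuple[str, list[str]]] = [("", [])]
    for line in text.splitlines():
        m = heading_re.match(line)
        if m:
            sections.append((m.group(2), []))   # start a new section
        elif line.strip():
            sections[-1][1].append(line.strip())
    chunks = []
    for heading, lines in sections:
        body = " ".join(lines)
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", body) if s]
        for i in range(0, len(sentences), max_sentences):
            group = " ".join(sentences[i : i + max_sentences])
            chunks.append(f"{heading}: {group}" if heading else group)
    return chunks


doc = "# Intro\nCats purr. Dogs bark.\n# Outro\nBirds sing."
print(semantic_chunks(doc))
```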
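The multi-step extraction described for SchemaValidator (JSON → code fence → block heuristics) can be sketched as a cascade of progressively looser parses. The exact heuristics below are assumptions; the real steps may differ in detail.

```python
# Sketch of cascading JSON extraction from a model response:
# 1) parse the whole response as JSON;
# 2) else look for a ```json code fence;
# 3) else fall back to the first {...} block heuristic.
import json
import re


def extract_json(raw: str):
    # Step 1: direct parse
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Step 2: fenced code block
    fence = re.search(r"```(?:json)?\s*(.*?)```", raw, re.DOTALL)
    if fence:
        try:
            return json.loads(fence.group(1))
        except json.JSONDecodeError:
            pass
    # Step 3: outermost brace-delimited block
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(raw[start : end + 1])
        except json.JSONDecodeError:
            pass
    return None  # nothing recoverable


print(extract_json('Sure! {"a": 3} hope that helps'))  # → {'a': 3}
```

Each step only fires when the stricter ones fail, so well-formed responses never pay for the heuristics.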
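The CSVWriter fix above amounts to mapping `None` to an empty field before handing the row to the writer, so `None` round-trips as a blank cell instead of the literal string "None". A minimal sketch of the corrected behavior, using the standard-library `csv` module (the `write_rows` helper is hypothetical):

```python
# Sketch: emit None as an empty CSV field rather than the string "None".
import csv
import io


def write_rows(rows):
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        # The fix: substitute "" for None before writerow() stringifies it.
        writer.writerow("" if value is None else value for value in row)
    return buf.getvalue()


print(repr(write_rows([["a", None, "c"]])))  # → 'a,,c\r\n'
```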