RAGFLOW · TECH

RAGFlow: the self-hostable all-in-one RAG system with web UI

RAGFlow in May 2026 in v0.15+ is an open-source all-in-one RAG system from Infiniflow (Apache 2.0). Self-host, ready web UI, document parsing, chunking, vector DB, chat. Chinese origin, fully self-hostable.

Researched & fact-checked by: DuneDive LLC · As of: 2026-05

What is RAGFlow?

RAGFlow is an open-source project from Infiniflow (Hangzhou, China) offering a complete RAG application as a self-hostable system. Apache 2.0 license, github.com/infiniflow/ragflow. In May 2026 in version 0.15+, fast-growing (about 30,000 GitHub stars, monthly releases). The project positions itself as an all-in-one solution: instead of a framework for building a RAG pipeline, it is a finished product with a web UI.

The difference from LangChain, LlamaIndex, and Haystack is reach. Those three are libraries – you write Python code that builds the pipeline. RAGFlow is a deployable application – you start Docker Compose, open the browser on port 80, see a finished dashboard with knowledge-base management, document upload, chat interface, workflow builder.

Components are integrated as a stack. Document Parser uses DeepDoc (own engine for complex PDFs with tables, images, layouts). Chunker with several strategies (general, qa, manual, paper, book, resume, law, a new "knowledge graph" chunker). Embedder via BGE models (Bidirectional Generative Embeddings) or OpenAI/Cohere/Voyage via API. Vector DB internally via Infinity (Infiniflows own vector DB) or externally Elasticsearch. LLM connection to OpenAI, Anthropic, DeepSeek, Mistral, Ollama, Azure, AWS Bedrock.

The web UI is in May 2026 the highlight. Knowledge-base management with drag-and-drop upload, automatic parsing status tracking, chunk preview, manual correction. Chat-assistant configuration with prompts, model choice, retrieval settings. Workflow builder with drag and drop for more complex pipelines (multi-step agents with tool use). Team management with RBAC.

The origin is in May 2026 a two-sided topic. Infiniflow is a Chinese company; the repository itself (Apache 2.0) is neutral and fully self-hostable. Anyone running it only on-premise or on Hetzner has no China connection. But: a productive setup without vendor support means bug reports and security inquiries go through GitHub issues – and the repository is primarily documented in Chinese, English is the secondary language. For Swiss customers with high vendor-support demand, a point to clarify.

Why it matters

For Swiss SMEs and fiduciary offices without an own developer team, RAGFlow in May 2026 is the simplest way to build an internal knowledge DB with RAG chat. Three reasons.

First: time to value. A LlamaIndex pipeline with own vector DB, own document loading, and own chat frontend takes 5-15 developer days depending on demand. RAGFlow is running in 2-4 hours – Docker Compose up, open browser, upload documents, done. For a pilot or PoC setup without custom requirements a massive difference.

Second: web UI for non-technical users. A LlamaIndex pipeline is code; to create a new knowledge DB a developer must work. RAGFlow lets domain experts (fiduciaries, lawyers, HR leads) manage knowledge collections themselves. Document upload by drag and drop, chunk preview, manual correction, chat test – all in the browser without code.

Third: data residency. RAGFlow is fully self-hostable. A Docker Compose stack on Hetzner Falkenstein or an own workstation – all data stays under own control. Compared to LlamaCloud or deepset Cloud (both commercial, foreign infrastructure) a clear compliance argument for CH applications with professional-secrecy data.

The trade-off is customisation. RAGFlow is a product, not a framework. Anyone needing specific chunking logic, own retrieval strategies, or modifying the answer pipeline runs into limits faster than with LangChain or LlamaIndex. The shipped workflow builder covers standard cases, but deep changes demand modifications to the RAGFlow source – which means fork effort.

For Swiss applications: a pragmatic pattern in May 2026 is RAGFlow as internal knowledge-DB platform for 80 percent of standard cases (FAQ bot, onboarding help, internal document search), combined with custom LlamaIndex pipelines for the 20 percent complex use cases (e.g. tax workflow with validation). RAGFlow alone is often not enough, but a good quickstart.

How it works

Deployment in May 2026 is straightforward. A Docker Compose stack on a Linux server with at least 16 GB RAM and 50 GB disk.

git clone https://github.com/infiniflow/ragflow.git cd ragflow/docker docker compose -f docker-compose.yml up -d

This starts the services: ragflow-server (web UI and API), mysql (metadata), redis (cache and queue), elasticsearch or infinity (vector index), minio (file storage). After 3-5 minutes everything is up; the web UI runs on port 80.

First steps in the UI: create an account (local, no external SSO needed), configure LLM provider (API key for OpenAI/DeepSeek/etc. or local Ollama URL), pick an embedding model (BGE-M3 as default, OpenAI text-embedding-3-small as alternative).

Create knowledge base: assign a name, pick chunking strategy, set embedder. Then upload documents (PDF, DOCX, Excel, PPT, TXT, HTML, Markdown, images with OCR). RAGFlow parses with DeepDoc, shows the parsing progress, lists the generated chunks. Chunks can be edited, extended, or deleted manually – important for cleaning bad PDFs.

Chat assistant: create a new assistant, pick one or more knowledge bases as source, write system prompt (e.g. "Answer client questions precisely and in High German"), pick LLM (gpt-4o-mini as cheap default), retrieval settings (top_k, similarity threshold). Immediately testable in the chat interface with source citations under every answer.

Workflow builder: for more complex pipelines RAGFlow offers a drag-and-drop builder. Nodes like "Begin", "LLM", "Retriever", "If-Else", "Tool", "Code" are wired on a canvas. Similar to Flowise/Langflow, but more specialised for RAG applications. In May 2026 the workflow builder in the UI is still young – good for simple setups, can hit limits at complex logic.

API integration: RAGFlow exposes a REST API (POST /api/v1/conversation/completion) for programmatic access. With it the RAGFlow chat embeds into own applications – e.g. a fiduciary client portal with embedded RAG.

Upgrade path: monthly releases via GitHub. Pragmatic: pin version, review upgrade plan twice a year. Data migration at major-version switch rarely required (SQL schema changes handled via migrations).

RAGFlow setup in 5 steps

01Prepare server: Linux server (Hetzner CPX31 or larger) with min. 16 GB RAM, 50 GB disk, Docker + Docker Compose installed. Optional GPU for local embedder or LLM.
02Clone repository and start stack: git clone github.com/infiniflow/ragflow, docker compose up -d. After 3-5 minutes web UI is on port 80. nginx in front for HTTPS and domain routing.
03LLM and embedder configuration: enter API key for OpenAI/DeepSeek or Ollama URL. BGE-M3 as default embedder or text-embedding-3-small. Set default language to German.
04Build knowledge base: upload documents, pick chunking strategy (general for standard, law for law texts, qa for FAQ collections), review parsing results, manually correct bad chunks.
05Configure chat assistant: system prompt, LLM, retrieval settings. Test with 30 real Q&A pairs. Integrate REST API into own application as needed. Set up monitoring and backup.

When to use RAGFlow

RAGFlow is the right choice when (a) a finished RAG product is wanted instead of a framework, (b) non-technical users should manage knowledge collections, or (c) time-to-value matters more than customisation depth.

Concrete cases: a fiduciary office wants an internal FAQ search for recurring client questions – RAGFlow self-host on Hetzner, upload existing FAQ collection, staff use the chat UI directly. A law office wants OR/StGB/regulations as research help for junior lawyers – RAGFlow with law chunker, knowledge base with the PDF texts of the laws, chat assistant for Q&A. An HR department wants onboarding documents for new employees as chatbot – RAGFlow with knowledge base from onboarding folder, embedded in the intranet via REST API.

RAGFlow is also ideal for PoC phases: in 2-4 hours the application is up, the pilot runs, feedback is collected. Only when it is clear what is really needed does a custom build with LlamaIndex pay off.

When not to use

For custom RAG pipelines with own logic (special chunking, own re-ranking, complex multi-source routing), RAGFlow is too rigid. LlamaIndex is the right choice.

For complex multi-step agents with many tool calls, LangGraph is stronger. RAGFlow workflow builder is enough for simple sequences, not for agentic reasoning.

For enterprise setups with FINMA supervision or bank compliance, Haystack is the more robust choice – commercial support by deepset, clear SLAs, auditable pipelines. RAGFlow as community open-source without formal support can be sensitive for regulated industries.

For Swiss customers with high vendor-support demand, the Chinese origin is a point to clarify. The project itself is Apache 2.0 and fully self-hostable – no data flows to China with correct configuration. But security updates and bug fixes come from Infiniflow; anyone wanting a Swiss or European vendor is better served by Haystack.

For extremely large corpora (more than 5M documents), RAGFlow is suitable in principle, but scaling experiences in May 2026 are still thin – productive setups of that scale are rarely documented.

For applications with frequent LLM switches and A/B tests of different models, RAGFlow is not ideal – the workflow builder does not support model comparison as elegantly as a code framework.

For pure API integrations without web-UI need, the RAGFlow web UI is dead weight – LlamaIndex or direct custom build is lighter.

Trade-offs

STRENGTHS

Complete RAG product with web UI – no custom build needed
Time to value 2-4 hours vs. 5-15 days with frameworks
Non-technical users can manage knowledge collections themselves
Fully self-hostable, Apache 2.0, own vector DB or Elasticsearch

WEAKNESSES

Limited customisation depth – custom pipelines require source fork
Chinese origin – vendor support for CH/EU clients a point to clarify
Documentation primarily Chinese, English secondary
Young – productive scaling experience above 5M documents is thin

FAQ

Is the Chinese origin a security concern?

The repository itself is Apache 2.0 and fully self-hostable – no data flows to China with correct configuration (no telemetry connection, no external API calls except the explicitly configured LLM providers). Anyone preferring EU/CH vendor support should choose Haystack.

How does RAGFlow differ from LlamaIndex?

RAGFlow is a finished product with web UI; LlamaIndex is a code framework. RAGFlow runs in 2-4 hours, LlamaIndex needs 5-15 developer days. RAGFlow good for standard applications; LlamaIndex good for custom pipelines. Both can be combined.

What hardware is required?

Minimum: 16 GB RAM, 50 GB SSD, 4 vCPU. For mid-sized corpora (10k-100k documents) and 5-20 parallel users: 32 GB RAM, 200 GB SSD, 8 vCPU. A Hetzner CCX23 or CPX41 covers it. GPU optional for local embedders or local LLMs (Ollama).

Can I use RAGFlow with own LLM or Ollama?

Yes. In May 2026 RAGFlow supports Ollama, vLLM, and OpenAI-compatible endpoints. With that a fully local setup is possible – Hetzner server with RAGFlow plus Ollama with Llama 3.x or Mistral. No cloud dependency, no data leaves own infrastructure.

Sources

infiniflow/ragflow – GitHub repository and releases · 2026-05
RAGFlow documentation – deployment, knowledge base, workflows · 2026-05
Infiniflow blog – DeepDoc parser and Infinity vector DB · 2026-04
Awesome-RAG repository – RAGFlow comparison with other open-source RAG systems · 2026-03

FITS YOUR STACK?

What this looks like in your business – a 30-minute intro call.

Book a call