LLM and RAG inside Dokuflex: your enterprise AI under EU GDPR

Q: What is RAG and why does it matter for a European company?

RAG (Retrieval-Augmented Generation) is an architecture where an LLM answers based on real documents from your company retrieved in real time from a vector database. Instead of inventing, it cites. It matters in Europe because it lets you use AI without sending the full corpus to the model (GDPR data minimisation) and keeps traceability: every answer points to its source, a key requirement for audits and for the EU AI Act.

Q: Where are data and embeddings hosted?

In the European Union. The vector store, original documents, logs and embeddings reside in datacenters located in Spain, Germany, France or the Netherlands depending on the customer. No data is transferred to the United States, not even under the EU-US Data Privacy Framework: by keeping the entire stack inside the EU, Dokuflex avoids the risk that a future "Schrems III" ruling invalidates that framework.

Q: How is the LLM's access to documents controlled?

Dokuflex RAG inherits BPM permissions: the LLM only retrieves chunks of documents the user is authorised to see. An HR user cannot get answers based on commercial contracts even if they ask. Authorisation is evaluated on each query at chunk level, not collection level.

The problem: ChatGPT cannot read your contracts

Public LLMs (ChatGPT, Gemini, Claude via claude.ai) are powerful, but they don't know your company. If you upload documents for them to analyse, you're transferring personal data and trade secrets to a third party, usually with servers outside the EU.

The answer is not to ban AI, which only pushes the problem into employee shadow IT. The answer is to offer, inside your BPM, an LLM that knows your documents, respects your permissions and leaves an auditable trail.

That's the promise of RAG (Retrieval-Augmented Generation) implemented inside Dokuflex: the LLM doesn't "know" anything about your company, but it searches and cites your documents on every query.

What RAG is, without marketing

RAG = Retrieval-Augmented Generation. Instead of asking the LLM to "remember" information from its training (where it can hallucinate), we give it a two-step process:

Retrieval: the system searches a vector database for the document chunks most relevant to the question. The search is semantic, not keyword-based.
Generation: the LLM receives the user's question plus the retrieved chunks as context and produces an answer based only on that context, with citations to the source.

The effect: the LLM answers about your contracts, policies and files without those documents being part of the model's training. And if the information isn't in the vector database, the system answers "I can't find information about that" instead of inventing.
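
The two steps and the "no answer" fallback can be sketched in a few lines of Python. This is a minimal illustration, not the Dokuflex implementation: the `Chunk` type, the two-dimensional toy vectors, the `min_score` threshold and the `llm` callable are all assumptions made for the example. A real deployment would use an embedding model and a vector database.

```python
import math
from dataclasses import dataclass

NO_ANSWER = "I can't find information about that."


@dataclass(frozen=True)
class Chunk:
    doc_id: str                  # source document, used for the citation
    text: str
    vector: tuple[float, ...]    # toy embedding (assumption for this sketch)


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, index, top_k=3, min_score=0.5):
    """Step 1: keep the top_k chunks whose similarity reaches min_score."""
    scored = [(cosine(query_vector, c.vector), c) for c in index]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


def build_prompt(question, chunks):
    """Number each chunk so the model can cite it as [n]."""
    context = "\n".join(
        f"[{i}] ({c.doc_id}) {c.text}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the numbered context below and cite sources as [n].\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(question, query_vector, index, llm):
    """Step 2: generate from retrieved context, or refuse if there is none."""
    chunks = retrieve(query_vector, index)
    if not chunks:
        return NO_ANSWER  # the LLM is never called without supporting context
    return llm(build_prompt(question, chunks))
```

The refusal is decided before the model is called: if retrieval returns nothing above the threshold, the LLM never gets the chance to improvise.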

For a European company, RAG solves three problems at once: hallucinations, data sovereignty and auditability.

RAG architecture inside Dokuflex

These are the deployed components, all in the EU, all governed by the BPM:

Layer	Dokuflex component	Data location
Document source	Dokuflex repository (files, contracts, PDFs).	EU (ES/DE/FR/NL)
Processing (OCR + chunking)	Ingestion pipeline: OCR, semantic segmentation, optional PII scrubbing.	EU
Embeddings	EU-hosted embedding model (Mistral, E5-multilingual, BGE-m3).	EU
Vector store	Dedicated vector database per customer (pgvector, Qdrant, Weaviate).	EU
Retrieval	Hybrid search (semantic + BM25), ACL permission filters.	EU
LLM (generation)	Mistral Large, Llama 3.1 / 3.3, Claude (AWS Bedrock EU), Dokuflex on-prem model.	EU
Orchestration	Dokuflex BPM with human-in-the-loop and approvals.	EU
Audit	Immutable log: question, retrieved sources, answer, user, timestamp.	EU

The customer chooses the datacenter (Spain, Germany, France, Netherlands) and the LLM. No US transit at any point in the pipeline.
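
As a rough sketch of the retrieval layer, the Python below combines a semantic ranking and a BM25 ranking after dropping every chunk the user may not see. Several parts are assumptions for illustration only: the per-chunk `allowed_groups` field, the use of reciprocal rank fusion (RRF) to merge the two rankings, and the constant `k=60`. The page does not state how Dokuflex fuses results or stores ACLs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    doc_id: str
    allowed_groups: frozenset  # hypothetical ACL copied from the BPM per chunk


def authorised(chunk, user_groups):
    """A chunk is visible if the user shares at least one group with its ACL."""
    return bool(chunk.allowed_groups & user_groups)


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; chunk_id breaks ties deterministically.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def hybrid_search(semantic_ranking, bm25_ranking, chunks, user_groups, top_k=5):
    """Filter both rankings by ACL at chunk level, then fuse them."""
    visible = {cid for cid, c in chunks.items() if authorised(c, user_groups)}
    filtered = [
        [cid for cid in ranking if cid in visible]
        for ranking in (semantic_ranking, bm25_ranking)
    ]
    return rrf_fuse(filtered)[:top_k]
```

Filtering before fusion means an unauthorised chunk never reaches the prompt, whatever its relevance score. In this sketch, an HR user querying an index that also holds legal-only chunks simply gets back the HR-visible ones.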

Real use cases inside Dokuflex

Six scenarios where RAG produces immediate value inside a BPM:

Case 1 · Legal

Contracts assistant

"Which penalty clauses do we have with vendor X?" — answer based on the signed contracts with citation to the PDF and the exact clause.

Case 2 · Customer support

Semantic ticket search

The agent describes the problem in natural language and the system retrieves similar tickets solved before with their solution and average resolution time.

Case 3 · HR

Collective agreement assistant

An employee asks about days off for relocation and gets the exact answer from the applicable agreement, with a citation to the article.

Case 4 · Compliance

Internal policy search

"What is our log retention policy for financial data?" The assistant answers by citing the current compliance manual.

Case 5 · Banking / Insurance

Case file analysis

The analyst asks about risks and exceptions in a case file — the LLM summarises KYC, financial statements and history documents, with citations.

Case 6 · Operations

Context-aware drafting

A legal writer generates a brief from a template, internal case law and file data, with human-in-the-loop review before signing.

GDPR compliance: how it translates in practice

GDPR requirements for AI applied to Dokuflex RAG:

Art. 5 · Minimisation: the LLM only receives the retrieved chunks, not the full corpus. And only the chunks the user is authorised to see.
Art. 6 · Legal basis: processing based on the controller's legitimate interest (operational efficiency) or contractual performance, recorded in the RoPA.
Art. 22 · Automated decisions: Dokuflex requires human-in-the-loop for any decision affecting people (HR, credit scoring). The LLM proposes, a person decides.
Art. 32 · Technical security: encryption at rest (AES-256), in transit (TLS 1.3), BPM-inherited permissions, immutable logs.
Art. 44 · International transfers: avoided by design. The stack lives in the EU, so neither Schrems II nor a possible future "Schrems III" ruling against the Data Privacy Framework affects it.
Art. 35 · Data Protection Impact Assessment (DPIA): we provide a DPIA template specific to RAG over personal data.
Art. 15-22 · Data subject rights: right of access, rectification and erasure propagated to the vector store: if you delete a document in the BPM, its chunks disappear from the index and cache.
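
The erasure propagation described in the last item can be illustrated with a toy in-memory index. Everything here is hypothetical: the `VectorIndex` class, its method names and the idea of caching answers together with the documents they cited are assumptions for the sketch, not the Dokuflex API.

```python
class VectorIndex:
    """Toy stand-in for a vector store plus an answer cache."""

    def __init__(self):
        self.chunks = {}        # chunk_id -> (doc_id, text)
        self.answer_cache = {}  # question -> (answer, set of cited doc_ids)

    def add_chunk(self, chunk_id, doc_id, text):
        self.chunks[chunk_id] = (doc_id, text)

    def cache_answer(self, question, answer, cited_doc_ids):
        self.answer_cache[question] = (answer, set(cited_doc_ids))

    def erase_document(self, doc_id):
        """Remove every chunk of doc_id and every cached answer that cited it.

        Returns (chunks_removed, cached_answers_removed) for the audit log.
        """
        removed = [cid for cid, (d, _) in self.chunks.items() if d == doc_id]
        for cid in removed:
            del self.chunks[cid]
        stale = [
            q for q, (_, cited) in self.answer_cache.items() if doc_id in cited
        ]
        for q in stale:
            del self.answer_cache[q]
        return len(removed), len(stale)
```

Purging cached answers matters as much as purging chunks: an answer generated before the deletion could otherwise keep serving the erased personal data.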

EU AI Act: classification and obligations

Regulation (EU) 2024/1689 classifies AI systems into four risk levels (unacceptable, high, limited and minimal). Most typical Dokuflex use cases fall into limited risk or minimal risk:

Document assistant, search, summarisation, assisted drafting: limited risk → transparency obligation (inform the user they're interacting with AI).
Document classification, data extraction: minimal risk → best practices, no specific obligations.
Decisions affecting people (HR, credit): high risk → human-in-the-loop oversight, model registry, DPIA.
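
The mapping above can be kept as a small lookup so that every new use case gets a tier before it is enabled. The sketch below only mirrors the three items listed here. The use-case keys and the function name are hypothetical, and classifying a real deployment still needs legal review.

```python
from enum import Enum


class RiskTier(Enum):
    MINIMAL = "minimal"
    LIMITED = "limited"
    HIGH = "high"


# Hypothetical use-case keys, mirroring the list in this section.
USE_CASE_TIERS = {
    "document_assistant": RiskTier.LIMITED,
    "search": RiskTier.LIMITED,
    "summarisation": RiskTier.LIMITED,
    "assisted_drafting": RiskTier.LIMITED,
    "document_classification": RiskTier.MINIMAL,
    "data_extraction": RiskTier.MINIMAL,
    "hr_decision": RiskTier.HIGH,
    "credit_decision": RiskTier.HIGH,
}

OBLIGATIONS = {
    RiskTier.MINIMAL: ["best practices"],
    RiskTier.LIMITED: ["transparency notice (user is told they interact with AI)"],
    RiskTier.HIGH: ["human-in-the-loop oversight", "model registry", "DPIA"],
}


def obligations_for(use_case):
    """Return (tier, obligations); unknown use cases are rejected, not guessed."""
    try:
        tier = USE_CASE_TIERS[use_case]
    except KeyError:
        raise ValueError(f"unclassified use case: {use_case!r}") from None
    return tier, OBLIGATIONS[tier]
```

Rejecting unknown use cases, rather than defaulting them to minimal risk, forces a classification decision before anything new goes live.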

Dokuflex documents in each deployment which model is used, which provider, which version, what data was processed and for what purpose. That documentation covers the AI system registry requirement, which becomes applicable in phases in 2026-2027 (the Regulation itself entered into force on 1 August 2024).

Dokuflex RAG vs ChatGPT Enterprise / Microsoft Copilot

Dokuflex RAG doesn't replace general-purpose ChatGPT. It covers the sensitive cases you shouldn't send to an external LLM:

Dimension	Dokuflex LLM/RAG	ChatGPT Enterprise / Copilot
Data location	EU (ES/DE/FR/NL)	US / EU per contract (under DPF)
Training on your data	No, by contract	No, by contract
EU origin or EU hosting of the model	Yes (Mistral is EU-origin; Llama is hosted via Azure EU)	No (OpenAI USA, Anthropic USA)
BPM permissions inherited	Yes, at chunk level	Only at SharePoint/Drive level
Source citations	Mandatory, with PDF link	Optional, not always reliable
Full audit	Immutable log exportable to SIEM	Tenant-limited
Workflow integration	Native: approvals, signature, archive	External via API

Practical rule: ChatGPT for general knowledge, Dokuflex RAG for your sensitive documents.

How we deploy it in your organisation

Discovery (1 week): we identify 2-3 priority use cases and the relevant document corpus. We validate legal basis (RoPA, DPIA if needed).
Pilot ingestion (1 week): we index 1,000-5,000 customer documents into a dedicated vector store. OCR + chunking + embeddings.
Expert validation (1 week): real users test the LLM with real questions, validate accuracy and citations. Prompt and filter tuning.
Progressive rollout (2-4 weeks): scale to the rest of the corpus, integration with BPM flows, team training, adoption metrics.
Continuous governance: monthly review of adoption, answer quality, rejected cases, model and prompt evolution.

Total time to get a first use case in production: 4 to 8 weeks, without migrating your BPM or rewriting processes.

Frequently asked questions

Is my data used to train external models?

No. In Dokuflex the LLM is invoked in inference mode over your vectorised documents, but the data is never used to train or retrain the base model. Agreements with model providers (Mistral, Llama via Azure EU, Claude via AWS Bedrock EU, Dokuflex on-premise model) explicitly exclude retraining on customer data.

Does this comply with the EU AI Act?

Yes, for the typical Dokuflex use cases (document assistant, classification, extraction, assisted drafting) which are limited or minimal risk. We include the required measures: transparency, traceability with citation, human-in-the-loop, and model and provider registry. For high-risk cases additional documented governance applies.

Do I need GPUs or expensive servers?

No. Dokuflex offers the LLM and vector store as a managed service in the EU: the customer only pays for queries and indexed documents. For customers with on-premise requirements (defence, public health, tier-1 banking), an installable version with open models (Llama 3.1, Mistral) on customer infrastructure is available.