Info-Henk · 7 min · May 8, 2026 · Greg Trinidad

Info-Henk: how RAG gives your website a brain

GACSverplichting has 218 pages of legal text. Henk reads it all, understands what's in there, and answers questions directly with source citations. This is the tech behind it, explained in plain language.

Imagine a new employee starting on day one. You give them all the handbooks, all the FAQs, all the legal documents, all the emails ever sent about your company. You say: "Read this. Tomorrow you start helping customers."

That's what we do with Info-Henk. Except Henk reads it in 4 hours instead of 4 weeks. And he forgets nothing.

The problem: AI knows nothing about your business

A general-purpose model like ChatGPT is trained on a large share of the public internet. It knows what Pythagoras' theorem is, who Kennedy was, how a Python function works. It knows nothing about your company. Not what your rooms are called, not what your packages cost, not what your return policy is.

If you use it directly on your website, it does two things wrong:

It makes things up that sound plausible but aren't true ("hallucination")
It can't cite sources — visitors don't know if the answer is correct

For a law firm or compliance website, that's unusable. One wrong answer can land you in a lawsuit.

The solution: RAG

RAG stands for Retrieval-Augmented Generation. In plain language: you first give the AI the relevant context from your knowledge base, then let it answer. That solves both problems:

No hallucination: the AI has the facts in front of it, so it doesn't have to guess
Source citations — you know which page the answer came from

How does this work exactly? In four steps.

Step 1: Crawl

We read your entire website. Every page, every FAQ, every PDF. For GACSverplichting that was 218 pages — legal text, sector-specific elaborations, FAQ content, examples. We do this with a crawler that respects what you don't want indexed (robots.txt, pages with noindex).
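To make the "respects what you don't want indexed" part concrete, here is a minimal sketch using only Python's standard library. It is not our production crawler: the robots.txt content, the `InfoHenkBot` user agent name and the helper names are assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /intern/
"""


class RobotsMetaParser(HTMLParser):
    """Sets noindex=True when a <meta name="robots"> tag contains noindex."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        is_robots_meta = attr.get("name", "").lower() == "robots"
        if is_robots_meta and "noindex" in attr.get("content", "").lower():
            self.noindex = True


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def build_robots(robots_txt: str) -> RobotFileParser:
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    return robots


def may_index(robots: RobotFileParser, url: str, html: str,
              user_agent: str = "InfoHenkBot") -> bool:
    """A page is indexed only if robots.txt allows it and it has no noindex tag."""
    return robots.can_fetch(user_agent, url) and not has_noindex(html)
```

Calling `may_index(build_robots(ROBOTS_TXT), url, html)` for each fetched page gives a simple yes/no before anything is stored.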

For customers who also want to index content behind logins (like an internal knowledge base), we arrange that via a secure API connection. We don't store anything that doesn't need to be stored.

Step 2: Chunking + Embedding

A 218-page legal text is too big to give to an AI all at once. We cut it into chunks — pieces of about 500 tokens (a few paragraphs). Each chunk gets metadata: which page it came from, which chapter, which section.
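A rough sketch of that chunking step, under one loud assumption: we count whitespace-separated words as a stand-in for tokens, whereas a real pipeline would use the tokenizer of its embedding model. The `Chunk` fields and the `chunk_text` helper are illustrative names, not our actual schema.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    page_url: str   # which page the chunk came from
    chapter: str
    section: str
    index: int      # position of the chunk within its section


def chunk_text(text: str, page_url: str, chapter: str, section: str,
               max_tokens: int = 500) -> list[Chunk]:
    """Split text into pieces of at most max_tokens words, each with metadata.

    Assumption: one whitespace-separated word counts as one token.
    """
    words = text.split()
    chunks: list[Chunk] = []
    for start in range(0, len(words), max_tokens):
        piece = " ".join(words[start:start + max_tokens])
        chunks.append(Chunk(piece, page_url, chapter, section, len(chunks)))
    return chunks
```

A production version would more likely split on paragraph or section boundaries so a chunk never cuts a legal clause in half; the fixed-size split here just keeps the idea visible.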

Then comes the magical part: embedding. We convert each chunk to a vector — a list of 1,536 numbers that capture the semantic meaning of that text. That sounds abstract. The concrete benefit: chunks about the same topic end up "close to each other" in vector space, even if they use different words.
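"Close to each other" is usually measured with cosine similarity. The sketch below shows the calculation on made-up 3-number vectors instead of real 1,536-number embeddings; the values are invented purely to illustrate the idea and do not come from any embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Invented toy vectors: two differently worded price questions
# and one unrelated question about opening hours.
price_question_a = [0.9, 0.1, 0.0]
price_question_b = [0.8, 0.2, 0.1]
opening_hours = [0.1, 0.0, 0.9]

same_topic = cosine_similarity(price_question_a, price_question_b)
other_topic = cosine_similarity(price_question_a, opening_hours)
```

With these toy numbers, `same_topic` comes out much higher than `other_topic`, which is the property retrieval relies on.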

"How much does it cost?" and "What's the price?" use completely different words. In vector space they sit next to each other. That's why RAG works so much better than old-fashioned keyword search.

Step 3: Retrieve

A visitor asks a question. For example: "Does the GACS obligation also apply to agricultural businesses under 50 employees?"

We embed the question the same way, turning it into a vector. Then we search our database (Supabase with pgvector) for the 5 chunks that lie closest to it. Those are the most relevant pieces from the entire knowledge base.

For this question, Henk finds: a chunk about the general obligation, a chunk about exceptions for the agricultural sector, a chunk about the 50-employee threshold, a chunk about deadlines, and a chunk about sanctions.
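The retrieval step boils down to "sort by distance, keep the closest few". In production that search runs inside the database; the in-memory sketch below only illustrates the logic. The function names, the `(label, vector)` chunk format and the example vectors are all assumptions for the sake of the example.

```python
import heapq
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0.0 means same direction, larger means further apart."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec: list[float],
          chunks: list[tuple[str, list[float]]],
          k: int = 5) -> list[str]:
    """Return the labels of the k chunks whose vectors lie closest to the query."""
    nearest = heapq.nsmallest(
        k, chunks, key=lambda chunk: cosine_distance(query_vec, chunk[1])
    )
    return [label for label, _ in nearest]


# Invented 2-number vectors standing in for real embeddings.
toy_chunks = [
    ("general obligation", [0.9, 0.1]),
    ("agricultural exceptions", [0.8, 0.3]),
    ("privacy policy", [0.1, 0.9]),
    ("cookie notice", [0.0, 1.0]),
]
closest = top_k([1.0, 0.2], toy_chunks, k=2)
```

Scanning every chunk like this is fine for a few hundred pages; a vector database exists precisely so the same lookup stays fast on much larger knowledge bases.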

Step 4: Generate

Now we give these 5 chunks together with the question to Claude (or another LLM). The prompt is roughly:

"Here are 5 relevant pieces from the GACSverplichting knowledge base: [chunks]. Answer the question: '[question]' based on these pieces. Cite the source. If the answer isn't in the pieces, say so honestly."

Claude reads the chunks, thinks, and gives a direct answer with source citations. In our case: "Yes, GACS also applies to agricultural businesses under 50 employees, with an exception for family businesses with fewer than 5 employees (see section 4.2). Deadline for implementation: December 31, 2026 (section 7.1)."

With links to the exact sections. No hallucination. No guessing. Only what's in your knowledge base.
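Assembling that prompt is plain string work. Here is a sketch of how the retrieved chunks and the question could be combined; `build_prompt` and the `(source, text)` tuple format are hypothetical, and the chunk texts are placeholders rather than real knowledge-base content. The call to the LLM itself is left out on purpose.

```python
def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """Combine retrieved chunks and the visitor's question into one prompt.

    Each chunk is a (source, text) pair so the model can cite where a fact came from.
    """
    numbered = "\n\n".join(
        f"[{i}] (source: {source})\n{text}"
        for i, (source, text) in enumerate(chunks, start=1)
    )
    return (
        f"Here are {len(chunks)} relevant pieces from the knowledge base:\n\n"
        f"{numbered}\n\n"
        f"Answer the question: '{question}' based on these pieces. "
        "Cite the source. If the answer isn't in the pieces, say so honestly."
    )


# Placeholder chunks: the section labels are illustrative, the texts are dummies.
example_prompt = build_prompt(
    "Does the GACS obligation also apply to agricultural businesses "
    "under 50 employees?",
    [
        ("section 4.2", "<placeholder text about exceptions>"),
        ("section 7.1", "<placeholder text about deadlines>"),
    ],
)
```

Keeping the source label next to each chunk is what makes the citation possible: the model can only point to a section if the prompt tells it which section each piece came from.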

Why this works for content-rich websites

This isn't a theoretical story. For GACSverplichting, Info-Henk's results so far:

90% of questions answered directly — only 10% require human follow-up
3 seconds average response time
100% source citations — every statement linked to a legal article
0 hallucinations reported over 4 months in production

The reason: GACSverplichting is a content-rich website. Lots of text, lots of detail, lots of cross-references. Exactly where visitors get lost, and where RAG really shines.

Which websites does this work best for?

RAG isn't useful for every site. A 5-page brochure site doesn't need it — a static FAQ is good enough there. RAG shines with:

Hotels and hospitality — rooms, facilities, policies, conference rooms
Compliance and legal — regulations with exceptions and deadlines
E-commerce with 100+ products — technical specs, variants, return policies
SaaS with extensive documentation — guides, integrations, pricing models
Education and knowledge institutions — courses, syllabi, regulations

What it costs to build this

Let's be honest about costs. The underlying tech (embeddings, vector database, LLM API) costs us about €60 per month fixed, plus a variable €0.02 per conversation. Those are pure cloud costs.
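As a quick worked example of that cost formula, here is the arithmetic in a few lines. The rates come from the paragraph above; the volume of 1,000 conversations per month is a hypothetical figure, not a customer number.

```python
FIXED_CENTS_PER_MONTH = 6000      # about EUR 60 fixed cloud cost
CENTS_PER_CONVERSATION = 2        # EUR 0.02 variable cost per conversation


def monthly_cloud_cost_eur(conversations: int) -> float:
    """Fixed cost plus per-conversation cost, computed in cents to avoid rounding drift."""
    total_cents = FIXED_CENTS_PER_MONTH + CENTS_PER_CONVERSATION * conversations
    return total_cents / 100


# Hypothetical month with 1,000 conversations.
cost = monthly_cloud_cost_eur(1000)  # 80.0
```

So at 1,000 conversations the variable part adds €20 to the €60 base, for €80 in cloud costs.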

The real work is in: crawling the website in a way that respects structure, chunking properly so answers cite correctly, tuning prompts for your specific domain, and updating when your content changes.

We do all that. For €290 per month on top of AskHenk Core (€99/mo base). Live in 5-10 working days. Cancel any month — if it doesn't work, you're not stuck.

The bigger picture

For ten years companies invested in better websites. Prettier designs, faster pages, smarter navigation. All attempts to solve the search problem: how does a visitor find the right information?

RAG removes the search problem. Visitors ask, Henk answers. No menus, no filters, no clicking 4 levels deep. A conversation.

That's why we call Info-Henk our flagship. It's not "adding a chatbot to your site". It's making your site fundamentally different.

Ready for your own Info-Henk?

Let us read your website and get Henk live for you in 5-10 days. Flagship from €290/mo additional.
