A technical deep-dive into embeddings, vector search, RAG architecture, and how we prevent hallucinations while delivering accurate answers from your content.
AI chatbot features like automatic learning are what set modern chatbots apart from traditional rule-based bots. When you connect your website to Boei, a sophisticated process transforms your content into something an AI can understand and search through. These features are perfect for ecommerce websites looking to automate support. Here's what happens under the hood:
Our crawler visits every page of your website and extracts the meaningful content. This isn't just copying HTML: we strip out navigation, footers, sidebars, ads, forms, scripts, styles, comments, pagination, and breadcrumbs.
What remains is the actual content your visitors care about: product descriptions, FAQs, policies, articles, and documentation. We also extract metadata like page titles, H1 headings, and descriptions.
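A minimal sketch of this kind of cleanup, using only Python's standard library. The tag list, the `ContentExtractor` class, and the `extract` helper are assumptions made for illustration, not Boei's actual crawler code:

```python
from html.parser import HTMLParser

# Tags whose entire subtree is dropped (an assumed list, based on the kinds of
# boilerplate named on this page; not Boei's real configuration).
SKIP_TAGS = {"nav", "footer", "aside", "script", "style", "form"}


class ContentExtractor(HTMLParser):
    """Collects visible text plus title and H1 metadata from one HTML page."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0      # > 0 while inside a skipped subtree
        self.current_tag = None  # "title" or "h1" while inside one
        self.metadata = {"title": "", "h1": ""}
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in ("title", "h1"):
            self.current_tag = tag

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == self.current_tag:
            self.current_tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self.skip_depth > 0:
            return
        if self.current_tag:
            self.metadata[self.current_tag] += text
        # The <title> is metadata only; everything else counts as body text.
        if self.current_tag != "title":
            self.text_parts.append(text)


def extract(html):
    """Return (metadata, body_text) for one HTML document."""
    parser = ContentExtractor()
    parser.feed(html)
    return parser.metadata, " ".join(parser.text_parts)
```

Calling `extract(page_html)` returns a metadata dictionary and the cleaned body text. A production crawler would also need to handle malformed markup, class-based boilerplate such as cookie banners, and JavaScript-rendered pages.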
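Long pages are split into smaller, semantic chunks that preserve meaning. Our chunking algorithm splits along semantic boundaries and keeps code blocks and tables intact, so each chunk retains its context.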
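As a rough illustration of boundary-aware chunking, the sketch below packs whole paragraphs into chunks instead of cutting at a fixed character offset. The `chunk_text` function and the `max_chars` limit are hypothetical; the real algorithm is not described in enough detail here to reproduce it:

```python
def chunk_text(text, max_chars=500):
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A paragraph longer than max_chars becomes its own oversized chunk
    rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk first.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on blank lines is the simplest possible boundary. A fuller version would treat headings, tables, and fenced code as units that must not be split.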
Embeddings are numerical representations of text that capture semantic meaning. Think of them as coordinates in a multi-dimensional space where similar concepts are close together. The sentences "What's your return policy?" and "Can I send items back?" use different words but have nearly identical embeddings because they mean the same thing.
We use state-of-the-art embedding models to convert each chunk of your content into these numerical vectors — typically 1,536 dimensions that capture nuance, context, and meaning.
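Closeness between embeddings is commonly measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors in place of real 1,536-dimensional embeddings; the numbers are invented purely to show the calculation:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (invented for illustration, not output of a real model).
return_policy = [0.9, 0.1, 0.2]   # "What's your return policy?"
send_back = [0.8, 0.2, 0.3]       # "Can I send items back?"
shipping_cost = [0.1, 0.9, 0.1]   # "How much is shipping?"

similar = cosine_similarity(return_policy, send_back)
different = cosine_similarity(return_policy, shipping_cost)
```

Here `similar` comes out much closer to 1 than `different`, which is the property a vector search relies on.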
These embeddings are stored in a specialized vector database optimized for similarity search. Unlike traditional databases that match exact keywords, vector databases find content based on meaning. This is why the chatbot understands questions even when visitors don't use the exact words from your website.
From raw content to searchable knowledge base
Crawl your website via sitemap or domain discovery. Support for JavaScript-rendered pages using our custom scraper.
Clean HTML, remove navigation/ads/scripts, extract meaningful content and metadata.
Split content into semantic units while preserving code blocks, tables, and context.
Convert text chunks into 1,536-dimensional vectors that capture semantic meaning.
Save embeddings in Weaviate vector database with hybrid BM25 + vector search.
Hybrid BM25 + vector search finds the most relevant content for any question.
One of the most important AI chatbot features is intelligent question handling. When a visitor types a question, a multi-stage process ensures they get the most accurate answer possible:
The raw question is analyzed and enhanced before searching. This includes:
We don't rely on just one search method. Instead, we combine BM25 keyword matching with vector similarity search.
This hybrid approach catches both semantic matches ("refund" matches "return policy") and exact matches (specific product names, model numbers).
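One common way to merge a keyword ranking with a vector ranking is reciprocal rank fusion. The sketch below shows that technique on hypothetical chunk IDs; it is an assumption for illustration and not necessarily the fusion method Boei or Weaviate applies:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one list.

    Each chunk scores 1 / (k + rank) per list it appears in, with rank
    starting at 1; k=60 is a commonly used default for this method.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


# Hypothetical results for the query "refund for model X200".
keyword_hits = ["model-x200-specs", "return-policy", "shipping"]
vector_hits = ["return-policy", "refunds-faq", "model-x200-specs"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Chunks that appear in both lists rise to the top, while chunks found by only one method still make it into the fused result.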
Results from both search methods are combined and re-ranked by content type, page importance, and relevance.
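A re-ranking step of this kind can be sketched as a weighted score. The weights, the content-type boosts, and the `Candidate` fields below are all hypothetical values chosen for illustration:

```python
from dataclasses import dataclass

# Hypothetical boosts and weights; real values would need tuning.
CONTENT_TYPE_BOOST = {"faq": 1.0, "product": 0.8, "blog": 0.5}
WEIGHTS = {"relevance": 0.6, "content_type": 0.2, "importance": 0.2}


@dataclass
class Candidate:
    chunk_id: str
    relevance: float        # fused search score, scaled to 0..1
    content_type: str
    page_importance: float  # 0..1, e.g. higher for top-level pages


def rerank_score(c):
    """Weighted sum of relevance, content-type boost, and page importance."""
    type_boost = CONTENT_TYPE_BOOST.get(c.content_type, 0.5)
    return (WEIGHTS["relevance"] * c.relevance
            + WEIGHTS["content_type"] * type_boost
            + WEIGHTS["importance"] * c.page_importance)


def rerank(candidates):
    """Return candidates ordered from highest to lowest re-rank score."""
    return sorted(candidates, key=rerank_score, reverse=True)
```

With these example weights, an FAQ chunk on an important page can outrank a blog chunk that scored slightly higher on raw relevance.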
The top-ranked content chunks are sent to the LLM (GPT-5 or Claude) along with the original question. The AI synthesizes an answer using only the provided content — never its general training data. This is called Retrieval-Augmented Generation (RAG).
Every answer includes links to the source pages used. Visitors can verify information themselves, and you can see exactly what content informed each response.
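The generation and citation steps can be sketched as a prompt-building function. The prompt wording, the `build_prompt` name, and the `url`/`text` keys are assumptions for illustration; the actual prompt Boei sends to the model is not shown on this page:

```python
def build_prompt(question, chunks):
    """Assemble a grounded prompt and the list of source URLs it draws on.

    chunks: list of dicts with hypothetical "url" and "text" keys.
    """
    context = "\n\n".join(
        f"[{i}] {chunk['text']} (source: {chunk['url']})"
        for i, chunk in enumerate(chunks, start=1)
    )
    prompt = (
        "Answer the question using only the numbered context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    # Deduplicate URLs while keeping first-seen order.
    sources = list(dict.fromkeys(chunk["url"] for chunk in chunks))
    return prompt, sources
```

The returned `prompt` would be sent to the chosen LLM, and `sources` would be rendered as links under the answer. Instructing the model to admit when the context has no answer is one practical guard against hallucinated responses.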
AI hallucinations happen when models generate plausible-sounding but incorrect information. This is the #1 concern businesses have about AI chatbots. Here's how Boei solves it:
How We Prevent Hallucinations
How to create your AI chatbot from scratch
Start by entering a simple prompt describing your business, or just paste your domain URL. Boei's AI analyzes your site and automatically configures the chatbot's personality, tone, and focus areas. One click from your domain gets you a working bot.
Choose how the bot should learn: upload your sitemap for automatic crawling, let our crawler discover pages, or manually add URLs. You can also upload documents (PDF, Word, Excel, PowerPoint), paste FAQ content, or add custom text blocks.
See exactly what content was extracted from each source. Remove irrelevant pages, adjust what content types to prioritize, and add custom rules for special content like pricing tables or technical specs.
Set custom instructions for how the bot should respond. Define its personality, specify topics to avoid, configure when to escalate to humans, and set up lead capture fields (email, name, phone, etc.).
Create test questions and expected answers. Run automated test suites to verify the bot handles common scenarios correctly. Review transcripts and refine instructions based on real performance.
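The idea behind such a test suite can be sketched in a few lines. The `run_test_suite` function, the phrase-matching check, and the `fake_bot` stand-in are hypothetical and only illustrate the concept, not Boei's built-in testing feature:

```python
def run_test_suite(ask, test_cases):
    """Run each question through `ask` and check for expected phrases.

    ask: any callable taking a question string and returning an answer string.
    test_cases: list of (question, [expected phrases]) pairs.
    Returns a list of (question, passed) pairs.
    """
    results = []
    for question, expected_phrases in test_cases:
        answer = ask(question).lower()
        passed = all(phrase.lower() in answer for phrase in expected_phrases)
        results.append((question, passed))
    return results


def fake_bot(question):
    # Stand-in for a real chatbot call, used only to make the sketch runnable.
    canned = {"What is your return policy?": "You can return items within 30 days."}
    return canned.get(question, "I don't know.")


results = run_test_suite(fake_bot, [
    ("What is your return policy?", ["30 days"]),
    ("Do you ship to Mars?", ["yes"]),
])
```

Phrase matching is deliberately simple. Because LLM answers vary in wording, checking for a few key facts tends to be more stable than comparing full answers.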
Install on your website with one line of code or use our WordPress/Shopify plugins. Monitor conversations in real-time, review analytics on visitor engagement, and continuously improve based on actual usage patterns.
All the AI chatbot features included with Boei — no hidden costs or add-ons
Create a bot using AI from a prompt or one-click from your domain
Upload sitemap or use our crawler to learn your entire site
Learn from FAQ, text, Excel, PDF, PPT, and other documents
Show sources to customers — no hallucinations, fully verifiable
Get leads via email, webhook, or Boei inbox with full transcripts
Track page visits, bot opens, interactions, and conversion to leads
Configure which fields to collect: email, name, phone, custom fields
Full searchable history of all conversations with AI summaries
Suggested responses for visitors to click instead of typing
Interface automatically translates to visitor's language (95+ languages)
Widget on your site or standalone page/landing page
Fully adjust bot behavior to match your exact use case
GPT-5, Claude 4 Sonnet, GPT-4o, and o3-mini available
Match your brand colors, fonts, and styling
Adjust all interface text and messages
Hand off conversations to human agents when needed
Custom system prompts for power users
Set up test cases to review bot performance automatically
The AI models and infrastructure powering your chatbot
GPT-5 (Latest, fastest) • Claude 4 Sonnet (Most human-like responses) • GPT-4o (Reliable workhorse) • o3-mini (Budget-friendly option). Choose based on your needs — switch anytime.
Powered by Weaviate — an enterprise-grade vector database that handles millions of documents. Supports hybrid BM25 + vector search for optimal retrieval accuracy.
Scrape → Process → Chunk → Embed → Store → Search. Intelligent chunking preserves semantic boundaries. Custom content rules for pricing, tables, and code blocks.
Automatic removal of nav, footer, sidebar, ads, forms, scripts, styles, comments, pagination, and breadcrumbs. Metadata extraction for title, H1, and description.
Flexible knowledge sources are among the most powerful AI chatbot features available. Your AI chatbot can learn from multiple types of content, all processed through the same embedding pipeline:
All sources are combined into a unified knowledge base. The bot seamlessly searches across everything when answering questions.
Combines BM25 keyword matching with vector similarity search for best results
Enhances queries before search for better retrieval accuracy
Results re-ranked by content type, page importance, and relevance
Every answer includes clickable links to original source pages
Try the demo chatbot or start your free trial — setup takes 5 minutes.