The one-sentence difference
Chatbots match the visitor's message against a script. Conversational AI understands the visitor's message and generates a response. Everything else in this comparison is a downstream consequence of that one architectural difference.
The three generations of "chatbot"
It helps to know the three generations, because the word "chatbot" is now used to describe products that have almost nothing in common.
- Generation 1: Rule-based bots. The script is a decision tree drawn by the builder. The bot matches keywords ("pricing", "hours") to canned responses. If the visitor types anything off the tree, the bot answers "I didn't understand". This is what most "chatbot" plugins still are.
- Generation 2: NLU intent-classified bots. The bot uses a natural-language-understanding (NLU) model to classify the visitor's message into one of a fixed set of "intents", then runs the matching scripted response. This handles paraphrase better than gen 1 but is still bound by the predefined intent list. Most enterprise "chatbots" built between 2018 and 2023 fall into this category.
- Generation 3: LLM agents (conversational AI). The bot is a large language model with tools attached (calendar, CRM, knowledge base). It generates the response token-by-token, can handle any phrasing, can decide to call a tool, can ask clarifying questions, and can run multi-turn conversations end-to-end. This is what "conversational AI" means in 2026 vocabulary.
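To make the gap between the first two generations concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any vendor's implementation: the script contents, the example phrases, the 0.2 threshold, and the function names `gen1_reply` and `gen2_reply` are all hypothetical, and a simple word-overlap score stands in for a real NLU model.

```python
import re

# Hypothetical canned responses shared by both toy bots.
SCRIPT = {
    "pricing": "Plans are listed on the pricing page.",
    "hours": "We are open 9am-5pm, Monday to Friday.",
}
FALLBACK = "I didn't understand"


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def gen1_reply(message: str) -> str:
    """Gen 1: answer only if a scripted keyword appears verbatim."""
    words = tokenize(message)
    for keyword, response in SCRIPT.items():
        if keyword in words:
            return response
    return FALLBACK


# Gen 2: a fixed intent list, each with example phrasings (assumed data).
INTENT_EXAMPLES = {
    "pricing": ["how much does it cost", "what is the price", "pricing"],
    "hours": ["when are you open", "what are your hours", "opening times"],
}


def gen2_reply(message: str, threshold: float = 0.2) -> str:
    """Gen 2: pick the closest intent, then run its scripted response.

    Jaccard word overlap is a stand-in for a trained classifier.
    """
    words = tokenize(message)
    best_intent, best_score = None, 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for example in examples:
            example_words = tokenize(example)
            score = len(words & example_words) / len(words | example_words)
            if score > best_score:
                best_intent, best_score = intent, score
    if best_intent is None or best_score < threshold:
        return FALLBACK
    return SCRIPT[best_intent]
```

In this sketch, "How much does it cost?" falls through the gen 1 bot because the word "pricing" never appears, while the gen 2 bot maps it to the pricing intent. A question outside the intent list, such as "Can I bring my dog?", gets the fallback from both. A gen 3 agent replaces the lookup with a model call and tool use, which depends on the specific LLM API and is not sketched here.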
When to pick a (gen 1 / gen 2) chatbot anyway
Rule-based and intent-classified bots aren't dead — they're cheaper to build, faster to run, and more predictable. They still make sense for:
- Marketing giveaway flows ("answer 3 questions, claim your discount").
- Pre-determined surveys where the path is the same for every respondent.
- High-volume FAQ deflection ("where is my order?") on a help center.
- Simple navigation menus dressed up as chat ("Press 1 for sales, 2 for support").
When you actually need conversational AI
You need a real LLM agent (not a rule bot) whenever any of the following is true:
- The visitor might ask open-ended questions about your services or pricing.
- The conversation has to take action — book an appointment, capture a custom field, hand off a qualified lead, route to a human.
- You can't predict every branch of the conversation up front (this is almost every real business).
- You want the bot to sound like your business, not like a 2018 chatbot.
- You need the bot to handle multiple languages without a translation layer per intent.
A simple test you can run on any vendor
If you're evaluating a "conversational AI" platform, ask these three questions live in the vendor's demo. If any answer is anything other than a fluid, LLM-generated response, you're looking at a generation-1 or generation-2 product wearing the new label.
- Test 1 — unusual phrasing: "Yo what's the damage if I'm doing X but also Y on a Saturday?"
- Test 2 — multi-step action: "Book me in for Tuesday afternoon if Wednesday morning isn't available, otherwise just send me the price."
- Test 3 — clarifying question: "Do you do that thing where you come out and look at it first?" (deliberately vague).
Tip
The Ovox demo is right at the top of every page.
Paste your URL and run those three tests in 30 seconds. If the agent fails any of them, close the tab and don't buy. That's the entire point of letting you test the actual product before you pay.
Pricing patterns by category
Pricing roughly tracks the generation of the underlying technology. Gen 1 bots are nearly free; gen 3 LLM agents cost real money because every conversation runs GPU inference. Anything in between is usually overpriced for what it does.
| Type | Typical pricing | Realistic monthly spend |
|---|---|---|
| Free WP chatbot plugin (gen 1) | Free / $19-$49 Pro | $0-$49 |
| Tidio / Crisp (gen 1-2) | $29-$45 starter | $50-$300 with AI add-ons |
| Manychat / Landbot (gen 1) | $15-$100 | $50-$500 depending on contacts |
| Intercom Fin (gen 3) | $39/seat + $0.99/resolution | $300-$5,000 |
| Chatbase / SiteGPT (gen 3, RAG) | $40-$150 | $40-$300 |
| Ovox (gen 3, agent) | $197 flat | $197 |
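
The gap between per-resolution and flat pricing is easier to judge with the arithmetic written out. The sketch below uses only the list prices from the table ($39 per seat plus $0.99 per resolution, against a $197 flat fee) and assumes one billing month with no taxes, minimums, or add-ons. The function names are hypothetical.

```python
import math


def per_resolution_cost(seats: int, resolutions: int,
                        seat_price: float = 39.0,
                        resolution_price: float = 0.99) -> float:
    """Monthly cost under seat-plus-resolution pricing (table figures)."""
    return seats * seat_price + resolutions * resolution_price


def break_even_resolutions(flat_price: float, seats: int,
                           seat_price: float = 39.0,
                           resolution_price: float = 0.99) -> int:
    """Smallest monthly resolution count at which usage pricing
    costs at least as much as the flat price."""
    remaining = flat_price - seats * seat_price
    if remaining <= 0:
        return 0  # seat fees alone already reach the flat price
    return math.ceil(remaining / resolution_price)


if __name__ == "__main__":
    n = break_even_resolutions(flat_price=197.0, seats=1)
    print(n, round(per_resolution_cost(1, n), 2))
```

With one seat, the usage-based plan passes the $197 flat fee at 160 resolutions per month (39 + 160 × 0.99 = 197.40). Below that volume the per-resolution model is cheaper, and above it the flat fee is. Each extra seat lowers the break-even point.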